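Scheduler features

The scheduler is an important component of any diffusion model because it controls the entire denoising (or sampling) process. There are many types of schedulers: some are optimized for speed and others for quality. With Diffusers, you can modify the scheduler configuration to use custom noise schedules and sigmas, or to rescale the noise schedule. Changing these parameters can have a profound effect on inference quality and speed.

This guide demonstrates how to use these features to improve inference quality.

Diffusers currently supports the timesteps and sigmas parameters for only a subset of schedulers and pipelines. Feel free to open a feature request if you want to extend these parameters to a scheduler and pipeline that does not currently support them!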

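Timestep schedules

The timestep or noise schedule determines the amount of noise at each sampling step. The scheduler uses it to generate an image with the corresponding amount of noise at each step. The timestep schedule is generated from the scheduler's default configuration, but you can customize the scheduler to use new and optimized sampling schedules that are not yet available in Diffusers.

For example, Align Your Steps (AYS) is a method for optimizing a sampling schedule so that a high-quality image can be generated in as few as 10 steps. The optimal 10-step schedule for Stable Diffusion XL is: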

from diffusers.schedulers import AysSchedules

sampling_schedule = AysSchedules["StableDiffusionXLTimesteps"]
print(sampling_schedule)
# [999, 845, 730, 587, 443, 310, 193, 116, 53, 13]

You can use the AYS sampling schedule in a pipeline by passing it to the timesteps parameter.

import torch
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "SG161222/RealVisXL_V4.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, algorithm_type="sde-dpmsolver++")

prompt = "A cinematic shot of a cute little rabbit wearing a jacket and doing a thumbs up"
generator = torch.Generator(device="cpu").manual_seed(2487854446)
image = pipeline(
    prompt=prompt,
    negative_prompt="",
    generator=generator,
    timesteps=sampling_schedule,
).images[0]

AYS timestep schedule, 10 steps

Linearly spaced timestep schedule, 10 steps

Linearly spaced timestep schedule, 25 steps
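
To see how the AYS schedule differs from a uniform one, it can help to compare the gaps between consecutive timesteps. The sketch below is only an illustration: it hard-codes the AYS values printed earlier and builds a hypothetical evenly spaced 10-step baseline with NumPy, which is not necessarily the exact schedule a Diffusers scheduler would produce.

```python
import numpy as np

# AYS 10-step schedule for Stable Diffusion XL, copied from the printed output.
ays = np.array([999, 845, 730, 587, 443, 310, 193, 116, 53, 13])

# Hypothetical baseline for comparison: 10 timesteps evenly spaced from 999 down to 0.
uniform = np.linspace(999, 0, num=10).round().astype(int)

# Size of each jump between consecutive timesteps.
ays_gaps = -np.diff(ays)
uniform_gaps = -np.diff(uniform)

print("AYS gaps:    ", ays_gaps.tolist())
print("uniform gaps:", uniform_gaps.tolist())
```

The AYS gaps shrink toward the end of the schedule (from 154 down to 40), so more of the step budget is spent at low noise levels, whereas the baseline uses the same gap everywhere.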

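Timestep spacing

The way sampling steps are selected from the schedule can affect the quality of the generated image, especially when the noise schedule is rescaled, which can enable a model to generate much brighter or darker images. Diffusers provides three timestep spacing methods:

leading creates evenly spaced steps
linspace includes the first and last steps and evenly selects the remaining intermediate steps
trailing includes only the last step and evenly selects the remaining intermediate steps, counting back from the end

The trailing spacing method is recommended because it generates higher-quality images with more detail when there are few sampling steps. For more typical step counts, however, the difference in quality is less noticeable.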
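
The three spacing methods can be sketched in plain NumPy to make the differences concrete. This is a simplified illustration, not the library's implementation: the helper name spaced_timesteps is hypothetical, it assumes 1000 training timesteps, and it ignores scheduler options such as steps_offset.

```python
import numpy as np

def spaced_timesteps(spacing, num_inference_steps, num_train_timesteps=1000):
    """Return descending timesteps for one of the three spacing strategies."""
    if spacing == "linspace":
        # Include both ends of the training range and spread the rest evenly.
        steps = np.linspace(0, num_train_timesteps - 1, num_inference_steps).round()[::-1]
    elif spacing == "leading":
        # Multiples of an integer stride counted from 0; the final training
        # timestep may be skipped.
        stride = num_train_timesteps // num_inference_steps
        steps = (np.arange(num_inference_steps) * stride)[::-1]
    elif spacing == "trailing":
        # Count backwards from the final training timestep.
        stride = num_train_timesteps / num_inference_steps
        steps = np.round(np.arange(num_train_timesteps, 0, -stride)) - 1
    else:
        raise ValueError(f"unknown spacing: {spacing}")
    return steps.astype(np.int64)

for name in ("leading", "linspace", "trailing"):
    print(name, spaced_timesteps(name, 5).tolist())
```

With 5 steps, the leading variant in this sketch starts at timestep 800, while trailing starts at 999, the final training timestep.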

import torch
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "SG161222/RealVisXL_V4.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, timestep_spacing="trailing")

prompt = "A cinematic shot of a cute little black cat sitting on a pumpkin at night"
generator = torch.Generator(device="cpu").manual_seed(2487854446)
image = pipeline(
    prompt=prompt,
    negative_prompt="",
    generator=generator,
    num_inference_steps=5,
).images[0]
image

trailing spacing after 5 steps

leading spacing after 5 steps

Sigmas

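The sigmas parameter is the amount of noise added at each timestep according to the timestep schedule. As with the timesteps parameter, you can customize the sigmas parameter to control how much noise is added at each step. When you use custom sigmas, the timesteps are calculated from those sigmas and the default scheduler configuration is ignored.

For example, you can manually pass the sigmas for a schedule such as the earlier 10-step AYS schedule to the pipeline.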

import torch

from diffusers import DiffusionPipeline, EulerDiscreteScheduler

model_id = "stabilityai/stable-diffusion-xl-base-1.0"
pipeline = DiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)

sigmas = [14.615, 6.315, 3.771, 2.181, 1.342, 0.862, 0.555, 0.380, 0.234, 0.113, 0.0]
prompt = "anthropomorphic capybara wearing a suit and working with a computer"
generator = torch.Generator(device='cuda').manual_seed(123)
image = pipeline(
    prompt=prompt,
    num_inference_steps=10,
    sigmas=sigmas,
    generator=generator
).images[0]

When you inspect the scheduler's timesteps attribute, you'll see that it matches the AYS timestep schedule, because the timestep schedule is calculated from the sigmas.

print(f"timesteps: {pipeline.scheduler.timesteps}")
# timesteps: tensor([999., 845., 730., 587., 443., 310., 193., 116.,  53.,  13.], device='cuda:0')
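
One way to picture how timesteps can be derived from sigmas is to interpolate each sigma against the noise levels of the training schedule. The NumPy sketch below is an illustration under stated assumptions, not the scheduler's own code: it assumes a scaled-linear beta schedule with beta_start=0.00085, beta_end=0.012 and 1000 training timesteps (check your scheduler config), and the helper name sigmas_to_timesteps is hypothetical.

```python
import numpy as np

# Assumed training noise schedule: "scaled_linear" betas with values commonly
# used by Stable Diffusion checkpoints (verify against your scheduler config).
num_train_timesteps = 1000
beta_start, beta_end = 0.00085, 0.012
betas = np.linspace(beta_start**0.5, beta_end**0.5, num_train_timesteps) ** 2
alphas_cumprod = np.cumprod(1.0 - betas)

# Noise level (sigma) associated with each training timestep.
train_sigmas = np.sqrt((1.0 - alphas_cumprod) / alphas_cumprod)

def sigmas_to_timesteps(sigmas):
    """Map sigmas to fractional timesteps by interpolating in log-sigma space."""
    log_train = np.log(train_sigmas)  # increases with the timestep index
    return np.interp(np.log(sigmas), log_train, np.arange(num_train_timesteps))

# Custom sigmas from the example, without the terminal 0.0 (log(0) is undefined).
custom_sigmas = np.array(
    [14.615, 6.315, 3.771, 2.181, 1.342, 0.862, 0.555, 0.380, 0.234, 0.113]
)
timesteps = sigmas_to_timesteps(custom_sigmas)
print(np.round(timesteps))
```

If the assumed schedule matches the checkpoint, the rounded values should land close to the AYS timesteps, starting near 999 and decreasing from there.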

Karras sigmas

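Refer to the scheduler API overview for a list of schedulers that support Karras sigmas.

Karras sigmas should not be used with models that were not trained with them. For example, the base Stable Diffusion XL model should not use Karras sigmas, but the DreamShaperXL model can because it was trained with Karras sigmas.

Karras schedulers use the timestep schedule and sigmas from the Elucidating the Design Space of Diffusion-Based Generative Models paper. Compared to other schedulers, this variant applies a smaller amount of noise per step as it approaches the end of the sampling process, which can increase the level of detail in the generated image.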
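
The noise levels from that paper can be sketched in a few lines of NumPy. This is a minimal illustration rather than the scheduler's code: the function name karras_sigmas is hypothetical, rho=7 is the value suggested in the paper, and the sigma range used here is an assumed example, not read from any checkpoint.

```python
import numpy as np

def karras_sigmas(num_steps, sigma_min, sigma_max, rho=7.0):
    """Noise levels from Karras et al.: interpolate linearly in sigma**(1/rho)."""
    ramp = np.linspace(0.0, 1.0, num_steps)
    min_inv_rho = sigma_min ** (1.0 / rho)
    max_inv_rho = sigma_max ** (1.0 / rho)
    return (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho

# Illustrative sigma range (assumed values).
sigmas = karras_sigmas(10, sigma_min=0.03, sigma_max=14.6)
gaps = -np.diff(sigmas)  # how much the noise level drops at each step

print(np.round(sigmas, 3))
print(np.round(gaps, 3))
```

The gaps shrink monotonically, so the last steps change the noise level far less than the first ones, which is the behavior described above.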

Enable Karras sigmas by setting use_karras_sigmas=True in the scheduler.

import torch
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler

pipeline = StableDiffusionXLPipeline.from_pretrained(
    "SG161222/RealVisXL_V4.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(pipeline.scheduler.config, algorithm_type="sde-dpmsolver++", use_karras_sigmas=True)

prompt = "A cinematic shot of a cute little rabbit wearing a jacket and doing a thumbs up"
generator = torch.Generator(device="cpu").manual_seed(2487854446)
image = pipeline(
    prompt=prompt,
    negative_prompt="",
    generator=generator,
).images[0]

Karras sigmas enabled

Karras sigmas disabled

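Rescale noise schedule

In the Common Diffusion Noise Schedules and Sample Steps are Flawed paper, the authors found that common noise schedules allow some signal to leak into the last timestep. This signal leakage at inference can cause models to generate only images of medium brightness. By enforcing a zero signal-to-noise ratio (SNR) for the timestep schedule and sampling from the last timestep, the model can be improved to generate very bright or very dark images.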
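
The rescaling idea can be sketched in NumPy as follows. It follows the approach described in the paper (shift and scale the square root of the cumulative alphas so the last value is exactly zero), but it is an illustrative sketch and not the scheduler's implementation; the beta schedule values are assumed examples.

```python
import numpy as np

def rescale_zero_terminal_snr(betas):
    """Rescale betas so the final timestep carries no signal (SNR = 0)."""
    alphas_bar_sqrt = np.sqrt(np.cumprod(1.0 - betas))
    first, last = alphas_bar_sqrt[0], alphas_bar_sqrt[-1]
    # Shift so the last value becomes zero, then scale so the first is unchanged.
    alphas_bar_sqrt = (alphas_bar_sqrt - last) * first / (first - last)
    # Convert the cumulative products back into per-step betas.
    alphas_bar = alphas_bar_sqrt**2
    alphas = np.concatenate([alphas_bar[:1], alphas_bar[1:] / alphas_bar[:-1]])
    return 1.0 - alphas

# Assumed example schedule: scaled-linear betas over 1000 training timesteps.
betas = np.linspace(0.00085**0.5, 0.012**0.5, 1000) ** 2
before = np.cumprod(1.0 - betas)
after = np.cumprod(1.0 - rescale_zero_terminal_snr(betas))

print("final alpha_bar before:", before[-1])
print("final alpha_bar after: ", after[-1])
```

Before rescaling, the final cumulative alpha is small but nonzero, which is the leaked signal; after rescaling it is zero, so the last timestep is pure noise.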

For inference, you need a model that has been trained with v_prediction. To train your own model with v_prediction, add the following flag to the train_text_to_image.py or train_text_to_image_lora.py script.

--prediction_type="v_prediction"

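For example, load the ptx0/pseudo-journey-v2 checkpoint, which was trained with v_prediction, together with the DDIMScheduler. Configure the following parameters in the DDIMScheduler:

rescale_betas_zero_snr=True to rescale the noise schedule to zero SNR
timestep_spacing="trailing" to start sampling from the last timestep

Set guidance_rescale in the pipeline to prevent overexposure. A lower value increases brightness, but some details may appear washed out.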

import torch
from diffusers import DiffusionPipeline, DDIMScheduler

pipeline = DiffusionPipeline.from_pretrained("ptx0/pseudo-journey-v2", use_safetensors=True)

pipeline.scheduler = DDIMScheduler.from_config(
    pipeline.scheduler.config, rescale_betas_zero_snr=True, timestep_spacing="trailing"
)
pipeline.to("cuda")
prompt = "cinematic photo of a snowy mountain at night with the northern lights aurora borealis overhead, 35mm photograph, film, professional, 4k, highly detailed"
generator = torch.Generator(device="cpu").manual_seed(23)
image = pipeline(prompt, guidance_rescale=0.7, generator=generator).images[0]
image

Default Stable Diffusion v2-1 image

Image with zero SNR and trailing timestep spacing enabled
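
The effect of guidance_rescale can be illustrated with a small NumPy sketch of the rescaling described in the Common Diffusion Noise Schedules and Sample Steps are Flawed paper. This is an assumption-laden illustration, not the pipeline's code: the random arrays stand in for real model predictions, and the function name and tensor shapes are chosen only for the example.

```python
import numpy as np

def rescale_noise_cfg(noise_cfg, noise_pred_text, guidance_rescale=0.0):
    """Pull the std of the guided prediction back toward the text-conditioned one."""
    axes = tuple(range(1, noise_cfg.ndim))  # every axis except the batch axis
    std_text = noise_pred_text.std(axis=axes, keepdims=True)
    std_cfg = noise_cfg.std(axis=axes, keepdims=True)
    rescaled = noise_cfg * (std_text / std_cfg)
    # Blend the rescaled and the original guided predictions.
    return guidance_rescale * rescaled + (1.0 - guidance_rescale) * noise_cfg

# Random stand-ins for the unconditional and text-conditioned predictions.
rng = np.random.default_rng(0)
noise_uncond = rng.standard_normal((1, 4, 8, 8))
noise_text = rng.standard_normal((1, 4, 8, 8))

# Classifier-free guidance inflates the standard deviation of the prediction.
guidance_scale = 7.5
noise_cfg = noise_uncond + guidance_scale * (noise_text - noise_uncond)

out = rescale_noise_cfg(noise_cfg, noise_text, guidance_rescale=0.7)
print(noise_cfg.std(), out.std(), noise_text.std())
```

With guidance_rescale=0.0 the guided prediction is left unchanged, and with 1.0 its standard deviation is matched to the text-conditioned prediction; values in between blend the two.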