Diffusers

DEISMultistepScheduler

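Diffusion Exponential Integrator Sampler (DEIS) was proposed in Fast Sampling of Diffusion Models with Exponential Integrator by Qinsheng Zhang and Yongxin Chen. DEISMultistepScheduler is a fast high-order solver for diffusion ordinary differential equations (ODEs).

This implementation performs the polynomial fitting in log-rho space instead of the linear t space used in the original DEIS paper. As a result, the coefficients of the exponential multistep update have a closed form and do not rely on a numerical solver.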

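The abstract from the paper is:

The past few years have witnessed the great success of diffusion models (DMs) in generating high-fidelity samples in generative modeling tasks. A major limitation of DMs is their notoriously slow sampling procedure, which normally requires hundreds to thousands of time discretization steps of the learned diffusion process to reach the desired accuracy. Our goal is to develop a fast sampling method for DMs that uses far fewer steps while retaining high sample quality. To this end, we systematically analyze the sampling procedure in DMs and identify key factors that affect the sample quality, among which the method of discretization is the most crucial. By carefully examining the learned diffusion process, we propose the Diffusion Exponential Integrator Sampler (DEIS). It is based on the Exponential Integrator designed for discretizing ordinary differential equations (ODEs) and leverages a semilinear structure of the learned diffusion process to reduce the discretization error. The proposed method can be applied to any DM and can generate high-fidelity samples in as few as 10 steps. In our experiments, it takes about 3 minutes on one A6000 GPU to generate 50k images from CIFAR10. Moreover, by directly using pre-trained DMs, we achieve state-of-the-art sampling performance when the number of score function evaluations (NFE) is limited, e.g., 4.17 FID with 10 NFEs, and 3.37 FID and 9.74 IS with only 15 NFEs on CIFAR10. Code is available at this https URL.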

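Tips

It is recommended to set solver_order to 2 or 3, while solver_order=1 is equivalent to DDIMScheduler.

Dynamic thresholding from Imagen is supported. For pixel-space diffusion models, you can set thresholding=True to use dynamic thresholding.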
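
To make the idea of dynamic thresholding concrete, here is a minimal NumPy sketch, assuming the usual Imagen formulation: take a high percentile of the absolute predicted-sample values, clamp it to a range, clip the sample to that value, and rescale. The function name `dynamic_threshold` and its arguments are illustrative only and are not the scheduler's API.

```python
import numpy as np


def dynamic_threshold(x0, ratio=0.995, max_value=1.0):
    """Illustrative dynamic thresholding for a batch of predicted samples.

    x0        : array of shape (batch, ...), the predicted clean sample
    ratio     : quantile of absolute values used as the per-sample threshold
    max_value : upper bound for the threshold (assumed to be >= 1.0)
    """
    batch = x0.shape[0]
    flat_abs = np.abs(x0.reshape(batch, -1))
    # Per-sample threshold s: the `ratio` quantile of absolute values.
    s = np.quantile(flat_abs, ratio, axis=1)
    # Keep s within [1, max_value]; with max_value=1.0 this reduces to
    # plain clipping to [-1, 1].
    s = np.clip(s, 1.0, max_value)
    s = s.reshape(batch, *([1] * (x0.ndim - 1)))
    # Clip to [-s, s], then rescale so the result lies in [-1, 1].
    return np.clip(x0, -s, s) / s


x = np.array([[0.5, -3.0, 1.5, 2.5]])
out = dynamic_threshold(x, ratio=0.995, max_value=2.0)
print(out)
```

Because the result is forced into [-1, 1], this only makes sense for models that predict bounded pixel values, which is consistent with the tip's restriction to pixel-space models.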

DEISMultistepScheduler

class diffusers.DEISMultistepScheduler

( num_train_timesteps: int = 1000, beta_start: float = 0.0001, beta_end: float = 0.02, beta_schedule: str = 'linear', trained_betas: typing.Optional[numpy.ndarray] = None, solver_order: int = 2, prediction_type: str = 'epsilon', thresholding: bool = False, dynamic_thresholding_ratio: float = 0.995, sample_max_value: float = 1.0, algorithm_type: str = 'deis', solver_type: str = 'logrho', lower_order_final: bool = True, use_karras_sigmas: typing.Optional[bool] = False, use_exponential_sigmas: typing.Optional[bool] = False, use_beta_sigmas: typing.Optional[bool] = False, use_flow_sigmas: typing.Optional[bool] = False, flow_shift: typing.Optional[float] = 1.0, timestep_spacing: str = 'linspace', steps_offset: int = 0 )

Parameters

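num_train_timesteps (int, defaults to 1000) — The number of diffusion steps used to train the model.
beta_start (float, defaults to 0.0001) — The starting beta value of inference.
beta_end (float, defaults to 0.02) — The final beta value.
beta_schedule (str, defaults to "linear") — The beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from linear, scaled_linear, or squaredcos_cap_v2.
trained_betas (np.ndarray, optional) — Pass an array of betas directly to the constructor to bypass beta_start and beta_end.
solver_order (int, defaults to 2) — The DEIS order, which can be 1, 2, or 3. It is recommended to use solver_order=2 for guided sampling and solver_order=3 for unconditional sampling.
prediction_type (str, defaults to epsilon) — Prediction type of the scheduler function; can be epsilon (predicts the noise of the diffusion process), sample (directly predicts the noisy sample), or v_prediction (see section 2.4 of the Imagen Video paper).
thresholding (bool, defaults to False) — Whether to use the "dynamic thresholding" method. This is unsuitable for latent-space diffusion models such as Stable Diffusion.
dynamic_thresholding_ratio (float, defaults to 0.995) — The ratio for the dynamic thresholding method. Valid only when thresholding=True.
sample_max_value (float, defaults to 1.0) — The threshold value for dynamic thresholding. Valid only when thresholding=True.
algorithm_type (str, defaults to deis) — The algorithm type for the solver.
lower_order_final (bool, defaults to True) — Whether to use lower-order solvers in the final steps. Only valid for < 15 inference steps.
use_karras_sigmas (bool, optional, defaults to False) — Whether to use Karras sigmas for step sizes in the noise schedule during the sampling process. If True, the sigmas are determined according to a sequence of noise levels {σi}.
use_exponential_sigmas (bool, optional, defaults to False) — Whether to use exponential sigmas for step sizes in the noise schedule during the sampling process.
use_beta_sigmas (bool, optional, defaults to False) — Whether to use beta sigmas for step sizes in the noise schedule during the sampling process. Refer to Beta Sampling is All You Need for more information.
timestep_spacing (str, defaults to "linspace") — The way the timesteps should be scaled. Refer to Table 2 of Common Diffusion Noise Schedules and Sample Steps are Flawed for more information.
steps_offset (int, defaults to 0) — An offset added to the inference steps, as required by some model families.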
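
The beta-related parameters above can be illustrated with a small NumPy sketch of how beta_start, beta_end, and beta_schedule might map to a beta sequence. The helper `make_betas` is hypothetical, and the formulas follow the commonly used definitions of these three schedules; treat it as an approximation of the idea rather than the library's exact code.

```python
import math

import numpy as np


def make_betas(schedule="linear", beta_start=0.0001, beta_end=0.02,
               num_train_timesteps=1000):
    """Hypothetical helper: build the beta sequence for a named schedule."""
    n = num_train_timesteps
    if schedule == "linear":
        # Betas evenly spaced between beta_start and beta_end.
        return np.linspace(beta_start, beta_end, n)
    if schedule == "scaled_linear":
        # Evenly spaced in sqrt(beta), then squared.
        return np.linspace(beta_start ** 0.5, beta_end ** 0.5, n) ** 2
    if schedule == "squaredcos_cap_v2":
        # Cosine schedule; each beta is capped at 0.999.
        def alpha_bar(t):
            return math.cos((t + 0.008) / 1.008 * math.pi / 2) ** 2

        return np.array([
            min(1 - alpha_bar((i + 1) / n) / alpha_bar(i / n), 0.999)
            for i in range(n)
        ])
    raise ValueError(f"unknown beta schedule: {schedule}")


betas = make_betas("linear")
# Cumulative product of (1 - beta): the fraction of signal kept at each step.
alphas_cumprod = np.cumprod(1.0 - betas)
# Noise level per training timestep, assuming sigma = sqrt((1 - a) / a).
sigmas = np.sqrt((1.0 - alphas_cumprod) / alphas_cumprod)

print(betas[0], betas[-1], sigmas[0], sigmas[-1])
```

Passing trained_betas bypasses this mapping entirely, since the beta array is then supplied directly.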