DDIM

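Denoising Diffusion Implicit Models (DDIM) by Jiaming Song, Chenlin Meng and Stefano Ermon.

The abstract from the paper is:

Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising diffusion implicit models (DDIMs), a more efficient class of iterative implicit probabilistic models with the same training procedure as DDPMs. In DDPMs, the generative process is defined as the reverse of a Markovian diffusion process. We construct a class of non-Markovian diffusion processes that lead to the same training objective, but whose reverse process can be much faster to sample from. We empirically demonstrate that DDIMs can produce high quality samples 10× to 50× faster in terms of wall-clock time compared to DDPMs, allow us to trade off computation for sample quality, and can perform semantically meaningful image interpolation directly in the latent space.

The original codebase can be found at ermongroup/ddim.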

DDIMPipeline

class diffusers.DDIMPipeline

( unet scheduler )

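Parameters

unet (UNet2DModel) — A UNet2DModel to denoise the encoded image latents.
scheduler (SchedulerMixin) — A scheduler to be used in combination with unet to denoise the encoded image. Can be one of DDPMScheduler or DDIMScheduler.

Pipeline for image generation.

This model inherits from DiffusionPipeline. Check the superclass documentation for the generic methods implemented for all pipelines (downloading, saving, running on a particular device, etc.).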

__call__

( batch_size: int = 1 generator: typing.Union[torch._C.Generator, typing.List[torch._C.Generator], NoneType] = None eta: float = 0.0 num_inference_steps: int = 50 use_clipped_model_output: typing.Optional[bool] = None output_type: typing.Optional[str] = 'pil' return_dict: bool = True ) → ImagePipelineOutput or tuple

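Parameters

batch_size (int, optional, defaults to 1) — The number of images to generate.
generator (torch.Generator or List[torch.Generator], optional) — One torch.Generator, or a list of them, used to make generation deterministic.
eta (float, optional, defaults to 0.0) — Corresponds to the parameter eta (η) from the DDIM paper. Only applies to the DDIMScheduler and is ignored in other schedulers. A value of 0 corresponds to DDIM and a value of 1 corresponds to DDPM.
num_inference_steps (int, optional, defaults to 50) — The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
use_clipped_model_output (bool, optional, defaults to None) — If True or False, see the documentation for DDIMScheduler.step(). If None, nothing is passed downstream to the scheduler (use None for schedulers that don't support this argument).
output_type (str, optional, defaults to "pil") — The output format of the generated image. Choose between PIL.Image or np.array.
return_dict (bool, optional, defaults to True) — Whether or not to return an ImagePipelineOutput instead of a plain tuple.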
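To make the roles of eta and num_inference_steps more concrete, the following is a minimal sketch of the DDIM update rule from the paper, written for a single scalar value using only the Python standard library. It is not the diffusers implementation: the helper names make_timesteps and ddim_step, the evenly spaced timestep subsequence, and the assumption of 1000 training timesteps are all illustrative choices.

```python
import math
import random


def make_timesteps(num_train_timesteps, num_inference_steps):
    """Evenly spaced, descending subsequence of the training timesteps.

    Hypothetical helper: real schedulers may space timesteps differently.
    """
    stride = num_train_timesteps // num_inference_steps
    return [i * stride for i in range(num_inference_steps)][::-1]


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev, eta=0.0, rng=None):
    """One DDIM update for a single scalar value (illustrative only).

    alpha_bar_t and alpha_bar_prev are the cumulative noise-schedule products
    at the current and the next (less noisy) timestep; eps_pred is the
    model's noise estimate for x_t.
    """
    # Clean-sample estimate implied by the noise prediction.
    x0_pred = (x_t - math.sqrt(1.0 - alpha_bar_t) * eps_pred) / math.sqrt(alpha_bar_t)

    # sigma_t from the DDIM paper, scaled by eta (0 disables the noise term).
    sigma = (
        eta
        * math.sqrt((1.0 - alpha_bar_prev) / (1.0 - alpha_bar_t))
        * math.sqrt(1.0 - alpha_bar_t / alpha_bar_prev)
    )

    # Deterministic component that reuses the predicted noise.
    direction = math.sqrt(1.0 - alpha_bar_prev - sigma**2) * eps_pred

    # Fresh Gaussian noise is drawn only when sigma > 0, that is, when eta > 0.
    noise = (rng or random).gauss(0.0, 1.0) if sigma > 0 else 0.0
    return math.sqrt(alpha_bar_prev) * x0_pred + direction + sigma * noise


# 50 inference steps over an assumed 1000 training timesteps.
timesteps = make_timesteps(1000, 50)
```

With eta=0.0 the step draws no random noise, so repeated calls with the same inputs return the same value. With eta=1.0 a noise term is added at every step, which is the DDPM-like behaviour the parameter description refers to.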

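Returns

ImagePipelineOutput or tuple

If return_dict is True, an ImagePipelineOutput is returned, otherwise a tuple is returned where the first element is a list with the generated images.

The call function to the pipeline for generation.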

Example

>>> from diffusers import DDIMPipeline

>>> # load the pretrained model and scheduler
>>> pipe = DDIMPipeline.from_pretrained("fusing/ddim-lsun-bedroom")

>>> # run pipeline in inference (sample random noise and denoise)
>>> # with the default output_type="pil", .images is a list of PIL images
>>> image = pipe(eta=0.0, num_inference_steps=50).images[0]

>>> # save image
>>> image.save("test.png")

ImagePipelineOutput

class diffusers.ImagePipelineOutput

( images: typing.Union[typing.List[PIL.Image.Image], numpy.ndarray] )

Parameters

images (List[PIL.Image.Image] or np.ndarray) — List of denoised PIL images of length batch_size, or a NumPy array of shape (batch_size, height, width, num_channels).

Output class for image pipelines.
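
The two return shapes selected by return_dict can be illustrated with a small stand-in. The sketch below does not use diffusers: ImageOutputSketch and package_result are hypothetical names, and plain strings stand in for PIL images or a NumPy array. It only mirrors the documented convention that the output object exposes an images field, while the tuple form carries the images as its first element.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class ImageOutputSketch:
    """Hypothetical stand-in for ImagePipelineOutput: a single `images` field."""

    images: List[str]  # strings stand in for PIL images or a NumPy array


def package_result(
    images: List[str], return_dict: bool = True
) -> Union[ImageOutputSketch, Tuple[List[str]]]:
    """Wrap generated images the way the return_dict flag describes."""
    if not return_dict:
        # Plain tuple: the first element holds the generated images.
        return (images,)
    return ImageOutputSketch(images=images)


fake_images = ["img0", "img1"]  # placeholder data, not real pipeline output
as_object = package_result(fake_images)
as_tuple = package_result(fake_images, return_dict=False)
```

Under this convention, code that reads result.images with the default setting would read result[0] when return_dict=False.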
