Perturbed-Attention Guidance

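Perturbed-Attention Guidance (PAG) is a new diffusion sampling guidance that improves sample quality in both unconditional and conditional settings, without requiring further training or the integration of external modules.

PAG was proposed in Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance by Donghoon Ahn, Hyoungwon Cho, Jaewon Min, Wooseok Jang, Jungwoo Kim, SeonHwa Kim, Hyun Hee Park, Kyong Hwan Jin and Seungryong Kim.

The abstract from the paper is: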

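Recent studies have demonstrated that diffusion models are capable of generating high-quality samples, but their quality heavily depends on sampling guidance techniques, such as classifier guidance (CG) and classifier-free guidance (CFG). These techniques are often not applicable in unconditional generation or in various downstream tasks such as image restoration. In this paper, we propose a novel sampling guidance, called Perturbed-Attention Guidance (PAG), which improves diffusion sample quality across both unconditional and conditional settings, achieving this without requiring additional training or the integration of external modules. PAG is designed to progressively enhance the structure of samples throughout the denoising process. It involves generating intermediate samples with degraded structure by substituting selected self-attention maps in diffusion U-Net with an identity matrix, by considering the self-attention mechanisms' ability to capture structural information, and guiding the denoising process away from these degraded samples. In both ADM and Stable Diffusion, PAG surprisingly improves sample quality in conditional and even unconditional scenarios. Moreover, PAG significantly improves the baseline performance in various downstream tasks where existing guidances such as CG or CFG cannot be fully utilized, including ControlNet with empty prompts and image restoration such as inpainting and deblurring.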
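The mechanism described in the abstract can be sketched in a few lines of plain Python. This is an illustrative toy, not the Diffusers implementation: the function names self_attention and pag_combine are hypothetical, the attention is single-head over small lists, and the combination rule assumes the common form in which the regular prediction is pushed away from the perturbed one by a scale factor.

```python
import math


def self_attention(q, k, v, perturb=False):
    """Toy single-head self-attention over lists of row vectors.

    With perturb=True the attention map is replaced by the identity matrix,
    which is the perturbation the abstract describes.
    """
    n, d = len(q), len(q[0])
    if perturb:
        # Identity attention map: every token attends only to itself.
        weights = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    else:
        weights = []
        for i in range(n):
            scores = [
                sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
                for j in range(n)
            ]
            peak = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(s - peak) for s in scores]
            total = sum(exps)
            weights.append([e / total for e in exps])
    # Weighted sum of value rows.
    return [
        [sum(weights[i][t] * v[t][c] for t in range(n)) for c in range(len(v[0]))]
        for i in range(n)
    ]


def pag_combine(eps, eps_perturbed, pag_scale):
    """Assumed guidance rule: move away from the structure-degraded prediction."""
    return [e + pag_scale * (e - p) for e, p in zip(eps, eps_perturbed)]


tokens = [[1.0, 2.0], [3.0, 4.0]]
normal = self_attention(tokens, tokens, tokens)
perturbed = self_attention(tokens, tokens, tokens, perturb=True)
guided = pag_combine([1.0, 2.0], [0.5, 2.5], pag_scale=3.0)
```

With the identity map the attention output is just the value rows, so tokens no longer exchange information. A pag_scale of 0 leaves the regular prediction unchanged.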

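PAG can be used by specifying pag_applied_layers as a parameter when instantiating a PAG pipeline. It can be a single string or a list of strings. Each string can be a unique layer identifier or a regular expression that identifies one or more layers.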

Full identifier as a normal string: down_blocks.2.attentions.0.transformer_blocks.0.attn1.processor
Full identifier as a RegEx: down_blocks.2.(attentions|motion_modules).0.transformer_blocks.0.attn1.processor
Partial identifier as a RegEx: down_blocks.2, or attn1
List of identifiers (can be a combination of strings and RegEx): ["blocks.1", "blocks.(14|20)", r"down_blocks\.(2|3)"]
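As a rough illustration of passing pag_applied_layers at instantiation time, the sketch below loads one of the PAG pipelines listed on this page. The checkpoint ID, the "mid" layer value, the scales, and the helper names load_pag_pipeline and generate are assumptions chosen for the example. Running it requires torch, diffusers, a model download, and a CUDA device.

```python
# Layers to perturb; "mid" is used here as an assumed, commonly used value.
PAG_APPLIED_LAYERS = ["mid"]

# Assumed checkpoint for illustration; any SDXL-compatible checkpoint should work.
MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"


def load_pag_pipeline(model_id=MODEL_ID, pag_applied_layers=PAG_APPLIED_LAYERS):
    """Instantiate an SDXL PAG pipeline with the chosen layers (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require the libraries.
    import torch
    from diffusers import StableDiffusionXLPAGPipeline

    pipe = StableDiffusionXLPAGPipeline.from_pretrained(
        model_id,
        pag_applied_layers=pag_applied_layers,  # a string or a list of strings
        torch_dtype=torch.float16,
    )
    return pipe.to("cuda")


def generate(pipe, prompt, pag_scale=3.0, guidance_scale=7.0):
    """Run one generation; pag_scale sets the strength of the PAG term."""
    result = pipe(
        prompt,
        num_inference_steps=25,
        guidance_scale=guidance_scale,
        pag_scale=pag_scale,
    )
    return result.images[0]
```

A typical call would be generate(load_pag_pipeline(), "an insect robot preparing a delicious meal"), which returns a PIL image. Setting pag_scale to 0 is expected to disable the PAG term.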

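Since RegEx is supported as a way of matching layer identifiers, it is crucial to use it correctly, otherwise there might be unexpected behaviour. The recommended way to use PAG is by specifying layers as blocks.{layer_index} and blocks.({layer_index_1|layer_index_2|...}). Using it in any other way, while doable, may bypass our basic validation checks and give you unexpected results.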
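To see why a loose pattern can select more layers than intended, the following sketch matches identifiers against a small set of made-up layer names using Python's re.search. The layer names and the match_layers helper are hypothetical, and the sketch only assumes that identifiers are treated as regular expressions searched anywhere in a layer name. It does not reproduce the library's validation checks.

```python
import re

# Hypothetical attention-layer names, modeled on the identifiers shown above.
LAYER_NAMES = [
    "down_blocks.1.attentions.0.transformer_blocks.0.attn1.processor",
    "down_blocks.2.attentions.0.transformer_blocks.0.attn1.processor",
    "down_blocks.2.attentions.0.transformer_blocks.0.attn2.processor",
    "down_blocks.2.motion_modules.0.transformer_blocks.0.attn1.processor",
    "up_blocks.1.attentions.0.transformer_blocks.0.attn1.processor",
    "mid_block.attentions.0.transformer_blocks.0.attn1.processor",
]


def match_layers(patterns, names=LAYER_NAMES):
    """Return the names matched by a single pattern or by any pattern in a list."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return [n for n in names if any(re.search(p, n) for p in patterns)]


# A full RegEx identifier selects exactly the layers it spells out.
full = match_layers(
    "down_blocks.2.(attentions|motion_modules).0.transformer_blocks.0.attn1.processor"
)

# A partial identifier matches anywhere in the name, so "blocks.1" picks up
# both down_blocks.1 and up_blocks.1.
partial = match_layers("blocks.1")

# "attn1" matches every self-attention layer in the list.
all_self_attn = match_layers("attn1")

for label, matched in [("full", full), ("partial", partial), ("attn1", all_self_attn)]:
    print(label, len(matched))
```

Checking a pattern this way against the attention-layer names of the actual model before building the pipeline is a cheap way to confirm that it selects only the layers you expect.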

AnimateDiffPAGPipeline

class diffusers.AnimateDiffPAGPipeline

__call__

encode_prompt

HunyuanDiTPAGPipeline

class diffusers.HunyuanDiTPAGPipeline

__call__

encode_prompt

KolorsPAGPipeline

class diffusers.KolorsPAGPipeline

__call__

encode_prompt

get_guidance_scale_embedding

StableDiffusionPAGInpaintPipeline

class diffusers.StableDiffusionPAGInpaintPipeline

__call__

encode_prompt

get_guidance_scale_embedding

StableDiffusionPAGPipeline

class diffusers.StableDiffusionPAGPipeline

__call__

encode_prompt

get_guidance_scale_embedding

StableDiffusionPAGImg2ImgPipeline

class diffusers.StableDiffusionPAGImg2ImgPipeline

__call__

encode_prompt

get_guidance_scale_embedding

StableDiffusionControlNetPAGPipeline

class diffusers.StableDiffusionControlNetPAGPipeline

encode_prompt

get_guidance_scale_embedding

StableDiffusionControlNetPAGInpaintPipeline

class diffusers.StableDiffusionControlNetPAGInpaintPipeline

__call__

encode_prompt

get_guidance_scale_embedding

StableDiffusionXLPAGPipeline

class diffusers.StableDiffusionXLPAGPipeline

__call__

encode_prompt

get_guidance_scale_embedding

StableDiffusionXLPAGImg2ImgPipeline

class diffusers.StableDiffusionXLPAGImg2ImgPipeline

__call__

encode_prompt

get_guidance_scale_embedding

StableDiffusionXLPAGInpaintPipeline

class diffusers.StableDiffusionXLPAGInpaintPipeline

__call__

encode_prompt

get_guidance_scale_embedding

StableDiffusionXLControlNetPAGPipeline

class diffusers.StableDiffusionXLControlNetPAGPipeline

__call__

encode_prompt

get_guidance_scale_embedding

StableDiffusionXLControlNetPAGImg2ImgPipeline

class diffusers.StableDiffusionXLControlNetPAGImg2ImgPipeline

__call__

encode_prompt

StableDiffusion3PAGPipeline

class diffusers.StableDiffusion3PAGPipeline

__call__

encode_prompt

StableDiffusion3PAGImg2ImgPipeline

class diffusers.StableDiffusion3PAGImg2ImgPipeline

__call__

encode_prompt

PixArtSigmaPAGPipeline

class diffusers.PixArtSigmaPAGPipeline

__call__

encode_prompt
