aMUSEd

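aMUSEd was introduced in aMUSEd: An Open MUSE Reproduction by Suraj Patil, William Berman, Robin Rombach, and Patrick von Platen.

Amused is a lightweight text-to-image model based on the MUSE architecture. Amused is particularly useful in applications that require a lightweight and fast model, such as generating many images at once.

Amused is a VQ-VAE token-based transformer that can generate an image in fewer forward passes than many diffusion models. In contrast with MUSE, it uses the smaller text encoder CLIP-L/14 instead of T5-XXL. Because of its small parameter count and its generation process with few forward passes, Amused can generate many images quickly. This benefit is most visible at larger batch sizes.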
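
A minimal sketch of the batched generation described above. It assumes a CUDA device and that a checkpoint named "amused/amused-256" exists alongside the "amused/amused-512" checkpoint used elsewhere on this page; `expected_image_count` and `generate_batch` are hypothetical helper names introduced only for illustration.

```python
def expected_image_count(prompts, images_per_prompt):
    """Number of images a single batched call should return."""
    return len(prompts) * images_per_prompt


def generate_batch(prompts, images_per_prompt=4, model_id="amused/amused-256"):
    """Generate images for every prompt in one pipeline call."""
    # Imported lazily so the helper above works without torch/diffusers installed.
    import torch
    from diffusers import AmusedPipeline

    # model_id is an assumption; swap in whichever Amused checkpoint you use.
    pipe = AmusedPipeline.from_pretrained(
        model_id, variant="fp16", torch_dtype=torch.float16
    )
    pipe = pipe.to("cuda")

    # A list of prompts plus num_images_per_prompt yields one large batch.
    images = pipe(prompts, num_images_per_prompt=images_per_prompt).images
    assert len(images) == expected_image_count(prompts, images_per_prompt)
    return images
```

Calling `generate_batch(["a red fox", "a snowy owl"])` would then return eight images from a single call, provided the batch fits in GPU memory.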

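The abstract from the paper is:

We present aMUSEd, an open-source, lightweight masked image model (MIM) for text-to-image generation based on MUSE. With 10 percent of MUSE's parameters, aMUSEd is focused on fast image generation. We believe MIM is under-explored compared to latent diffusion, the prevailing approach for text-to-image generation. Compared to latent diffusion, MIM requires fewer inference steps and is more interpretable. Additionally, MIM can be fine-tuned to learn additional styles with only a single image. We hope to encourage further exploration of MIM by demonstrating its effectiveness on large-scale text-to-image generation and releasing reproducible training code. We also release checkpoints for two models which directly produce images at 256x256 and 512x512 resolutions.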

Model	Params
amused-256	603M
amused-512	608M
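
As a rough illustration of what these parameter counts mean in practice, the sketch below estimates the memory needed just to hold the weights. It assumes the counts in the table are the total number of stored weights, and it ignores activations and any framework overhead, so treat the numbers as a lower bound rather than a measurement. `weight_memory_gib` is a hypothetical helper.

```python
BYTES_PER_GIB = 1024 ** 3


def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory needed to store the weights alone, in GiB."""
    return num_params * bytes_per_param / BYTES_PER_GIB


# Parameter counts taken from the table above.
CHECKPOINTS = {"amused-256": 603e6, "amused-512": 608e6}

for name, params in CHECKPOINTS.items():
    fp16 = weight_memory_gib(params, 2)  # float16 uses 2 bytes per weight
    fp32 = weight_memory_gib(params, 4)  # float32 uses 4 bytes per weight
    print(f"{name}: ~{fp16:.2f} GiB in fp16, ~{fp32:.2f} GiB in fp32")
```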

AmusedPipeline

class diffusers.AmusedPipeline

< source >

( vqvae: VQModel tokenizer: CLIPTokenizer text_encoder: CLIPTextModelWithProjection transformer: UVit2DModel scheduler: AmusedScheduler )

__call__

< source >

( prompt: typing.Union[str, typing.List[str], NoneType] = None height: typing.Optional[int] = None width: typing.Optional[int] = None num_inference_steps: int = 12 guidance_scale: float = 10.0 negative_prompt: typing.Union[str, typing.List[str], NoneType] = None num_images_per_prompt: typing.Optional[int] = 1 generator: typing.Optional[torch._C.Generator] = None latents: typing.Optional[torch.IntTensor] = None prompt_embeds: typing.Optional[torch.Tensor] = None encoder_hidden_states: typing.Optional[torch.Tensor] = None negative_prompt_embeds: typing.Optional[torch.Tensor] = None negative_encoder_hidden_states: typing.Optional[torch.Tensor] = None output_type = 'pil' return_dict: bool = True callback: typing.Optional[typing.Callable[[int, int, torch.Tensor], NoneType]] = None callback_steps: int = 1 cross_attention_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None micro_conditioning_aesthetic_score: int = 6 micro_conditioning_crop_coord: typing.Tuple[int, int] = (0, 0) temperature: typing.Union[int, typing.Tuple[int, int], typing.List[int]] = (2, 0) ) → ImagePipelineOutput or tuple

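Parameters

prompt (str or List[str], optional): The prompt or prompts to guide image generation. If not defined, you need to pass prompt_embeds.
height (int, optional, defaults to self.transformer.config.sample_size * self.vae_scale_factor): The height in pixels of the generated image.
width (int, optional, defaults to self.transformer.config.sample_size * self.vae_scale_factor): The width in pixels of the generated image.
num_inference_steps (int, optional, defaults to 12): The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
guidance_scale (float, optional, defaults to 10.0): A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. Guidance scale is enabled when guidance_scale > 1.
negative_prompt (str or List[str], optional): The prompt or prompts to guide what to not include in image generation. If not defined, you need to pass negative_prompt_embeds instead. Ignored when not using guidance (guidance_scale < 1).
num_images_per_prompt (int, optional, defaults to 1): The number of images to generate per prompt.
generator (torch.Generator, optional): A torch.Generator to make generation deterministic.
latents (torch.IntTensor, optional): Pre-generated tokens representing latent vectors in self.vqvae, to be used as inputs for image generation. If not provided, the starting latents will be completely masked.
prompt_embeds (torch.Tensor, optional): Pre-generated text embeddings. Can be used to easily tweak text inputs (prompt weighting). If not provided, text embeddings are generated from the prompt input argument. A single vector from the pooled and projected final hidden states.
encoder_hidden_states (torch.Tensor, optional): Pre-generated penultimate hidden states from the text encoder, providing additional text conditioning.
negative_prompt_embeds (torch.Tensor, optional): Pre-generated negative text embeddings. Can be used to easily tweak text inputs (prompt weighting). If not provided, negative_prompt_embeds are generated from the negative_prompt input argument.
negative_encoder_hidden_states (torch.Tensor, optional): Analogous to encoder_hidden_states for the positive prompt.
output_type (str, optional, defaults to "pil"): The output format of the generated image. Choose between PIL.Image or np.array.
return_dict (bool, optional, defaults to True): Whether or not to return an ImagePipelineOutput instead of a plain tuple.
callback (Callable, optional): A function that is called every callback_steps steps during inference. The function is called with the following arguments: callback(step: int, timestep: int, latents: torch.Tensor).
callback_steps (int, optional, defaults to 1): The frequency at which the callback function is called. If not specified, the callback is called at every step.
cross_attention_kwargs (dict, optional): A kwargs dictionary that, if specified, is passed along to the AttentionProcessor as defined in self.processor.
micro_conditioning_aesthetic_score (int, optional, defaults to 6): The targeted aesthetic score according to the laion aesthetic classifier. See https://laion.ai/blog/laion-aesthetics/ and the micro-conditioning section of https://arxiv.org/abs/2307.01952.
micro_conditioning_crop_coord (Tuple[int], optional, defaults to (0, 0)): The targeted height, width crop coordinates. See the micro-conditioning section of https://arxiv.org/abs/2307.01952.
temperature (Union[int, Tuple[int, int], List[int]], optional, defaults to (2, 0)): Configures the temperature scheduler on self.scheduler; see AmusedScheduler#set_timesteps.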
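
To make the temperature argument more concrete, here is a small sketch of what a (start, end) pair could expand to. It assumes the scheduler spaces the temperatures linearly over the inference steps; that is an assumption about AmusedScheduler#set_timesteps and should be checked against the scheduler itself. Scalar temperatures are deliberately left out, and `linear_temperatures` is a hypothetical helper, not part of diffusers.

```python
def linear_temperatures(temperature=(2, 0), num_inference_steps=12):
    """Expand a (start, end) pair into one temperature per inference step.

    Assumption: linear spacing with both endpoints included.
    """
    start, end = temperature
    if num_inference_steps == 1:
        return [float(start)]
    step = (end - start) / (num_inference_steps - 1)
    return [start + i * step for i in range(num_inference_steps)]


# With the default pair, sampling starts hot and cools to zero by the last step.
print(linear_temperatures((2, 0), 5))
```

Under this assumption, early steps sample tokens with more randomness and later steps become increasingly greedy.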

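Returns

ImagePipelineOutput or tuple

If return_dict is True, ImagePipelineOutput is returned, otherwise a tuple is returned where the first element is a list with the generated images.

The call function to the pipeline for generation.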
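
Because the return type depends on return_dict, calling code sometimes has to handle both shapes. The sketch below shows one way to do that; `images_from_output` is a hypothetical helper, and it relies only on what is stated above: the tuple's first element is the image list, and the output object exposes an images attribute (as used in the examples on this page).

```python
def images_from_output(output):
    """Return the list of images for either return_dict setting."""
    if isinstance(output, tuple):
        # return_dict=False: the first element is the list of images.
        return output[0]
    # return_dict=True: the output object exposes the list as .images.
    return output.images
```

With this helper, `images_from_output(pipe(prompt, return_dict=False))` and `images_from_output(pipe(prompt))` yield the same kind of list.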

Examples

>>> import torch
>>> from diffusers import AmusedPipeline

>>> pipe = AmusedPipeline.from_pretrained("amused/amused-512", variant="fp16", torch_dtype=torch.float16)
>>> pipe = pipe.to("cuda")

>>> prompt = "a photo of an astronaut riding a horse on mars"
>>> image = pipe(prompt).images[0]
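
The callback and callback_steps parameters can be used to observe progress during generation. The following is a sketch only: `StepLogger`, `simulate_callbacks`, and `run_with_logger` are hypothetical names, and the simulation assumes the callback fires whenever the step index is a multiple of callback_steps. The timestep values in the simulation are placeholders.

```python
class StepLogger:
    """Callback matching callback(step, timestep, latents)."""

    def __init__(self):
        self.records = []

    def __call__(self, step, timestep, latents):
        # Keep only lightweight values; latents can be large.
        self.records.append((step, int(timestep)))


def simulate_callbacks(callback, num_inference_steps, callback_steps):
    """Stand-in for the pipeline loop, useful for testing a callback offline."""
    for step in range(num_inference_steps):
        if step % callback_steps == 0:  # assumed firing rule
            placeholder_timestep = num_inference_steps - 1 - step
            callback(step, placeholder_timestep, None)


def run_with_logger(prompt, callback_steps=4):
    """Run the real pipeline with the logger attached (requires a GPU)."""
    import torch
    from diffusers import AmusedPipeline

    pipe = AmusedPipeline.from_pretrained(
        "amused/amused-512", variant="fp16", torch_dtype=torch.float16
    ).to("cuda")
    logger = StepLogger()
    image = pipe(prompt, callback=logger, callback_steps=callback_steps).images[0]
    return image, logger.records
```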

enable_xformers_memory_efficient_attention

< source >

( attention_op: typing.Optional[typing.Callable] = None )

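Parameters

attention_op (Callable, optional): Overrides the default None operator for use as the op argument to the memory_efficient_attention() function of xFormers.

Enable memory efficient attention from xFormers. When this option is enabled, you should observe lower GPU memory usage and a potential speed up during inference. Speed up during training is not guaranteed.

⚠️ When memory efficient attention and sliced attention are both enabled, memory efficient attention takes precedence.

Examples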

>>> import torch
>>> from diffusers import DiffusionPipeline
>>> from xformers.ops import MemoryEfficientAttentionFlashAttentionOp

>>> pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16)
>>> pipe = pipe.to("cuda")
>>> pipe.enable_xformers_memory_efficient_attention(attention_op=MemoryEfficientAttentionFlashAttentionOp)
>>> # Workaround for not accepting attention shape using VAE for Flash Attention
>>> pipe.vae.enable_xformers_memory_efficient_attention(attention_op=None)
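
The example above uses a Stable Diffusion checkpoint. For an Amused pipeline, a guarded call can be convenient when xFormers may not be installed. This is a sketch under assumptions: `try_enable_xformers` is a hypothetical helper, and it catches exceptions broadly because the exact error raised when xFormers or CUDA is unavailable is not specified on this page.

```python
def try_enable_xformers(pipe):
    """Enable xFormers attention if possible and report whether it worked."""
    try:
        pipe.enable_xformers_memory_efficient_attention()
    except Exception as err:  # broad on purpose; the failure type is an unknown here
        print(f"xFormers attention not enabled: {err}")
        return False
    return True


def load_amused_with_xformers():
    """Load the 512 checkpoint and opt in to xFormers when available."""
    import torch
    from diffusers import AmusedPipeline

    pipe = AmusedPipeline.from_pretrained(
        "amused/amused-512", variant="fp16", torch_dtype=torch.float16
    ).to("cuda")
    try_enable_xformers(pipe)
    return pipe
```

If the call fails, the pipeline keeps its default attention implementation and remains usable.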

disable_xformers_memory_efficient_attention

< source >

( )

Disable memory efficient attention from xFormers.

class diffusers.AmusedImg2ImgPipeline

< source >

( vqvae: VQModel tokenizer: CLIPTokenizer text_encoder: CLIPTextModelWithProjection transformer: UVit2DModel scheduler: AmusedScheduler )

__call__

< source >

( prompt: typing.Union[str, typing.List[str], NoneType] = None image: typing.Union[PIL.Image.Image, numpy.ndarray, torch.Tensor, typing.List[PIL.Image.Image], typing.List[numpy.ndarray], typing.List[torch.Tensor]] = None strength: float = 0.5 num_inference_steps: int = 12 guidance_scale: float = 10.0 negative_prompt: typing.Union[str, typing.List[str], NoneType] = None num_images_per_prompt: typing.Optional[int] = 1 generator: typing.Optional[torch._C.Generator] = None prompt_embeds: typing.Optional[torch.Tensor] = None encoder_hidden_states: typing.Optional[torch.Tensor] = None negative_prompt_embeds: typing.Optional[torch.Tensor] = None negative_encoder_hidden_states: typing.Optional[torch.Tensor] = None output_type = 'pil' return_dict: bool = True callback: typing.Optional[typing.Callable[[int, int, torch.Tensor], NoneType]] = None callback_steps: int = 1 cross_attention_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None micro_conditioning_aesthetic_score: int = 6 micro_conditioning_crop_coord: typing.Tuple[int, int] = (0, 0) temperature: typing.Union[int, typing.Tuple[int, int], typing.List[int]] = (2, 0) ) → ImagePipelineOutput or tuple

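Parameters

prompt (str or List[str], optional): The prompt or prompts to guide image generation. If not defined, you need to pass prompt_embeds.
image (torch.Tensor, PIL.Image.Image, np.ndarray, List[torch.Tensor], List[PIL.Image.Image], or List[np.ndarray]): Image, numpy array or tensor representing an image batch to be used as the starting point. For both numpy arrays and pytorch tensors, the expected value range is [0, 1]. If it is a tensor or a list of tensors, the expected shape is (B, C, H, W) or (C, H, W). If it is a numpy array or a list of arrays, the expected shape is (B, H, W, C) or (H, W, C). It can also accept image latents as image, but latents passed directly are not encoded again.
strength (float, optional, defaults to 0.5): Indicates the extent to transform the reference image. Must be between 0 and 1. image is used as a starting point, and more noise is added the higher the strength. The number of denoising steps depends on the amount of noise initially added. When strength is 1, added noise is maximum and the denoising process runs for the full number of iterations specified in num_inference_steps. A value of 1 essentially ignores image.
num_inference_steps (int, optional, defaults to 12): The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
guidance_scale (float, optional, defaults to 10.0): A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. Guidance scale is enabled when guidance_scale > 1.
negative_prompt (str or List[str], optional): The prompt or prompts to guide what to not include in image generation. If not defined, you need to pass negative_prompt_embeds instead. Ignored when not using guidance (guidance_scale < 1).
num_images_per_prompt (int, optional, defaults to 1): The number of images to generate per prompt.
generator (torch.Generator, optional): A torch.Generator to make generation deterministic.
prompt_embeds (torch.Tensor, optional): Pre-generated text embeddings. Can be used to easily tweak text inputs (prompt weighting). If not provided, text embeddings are generated from the prompt input argument. A single vector from the pooled and projected final hidden states.
encoder_hidden_states (torch.Tensor, optional): Pre-generated penultimate hidden states from the text encoder, providing additional text conditioning.
negative_prompt_embeds (torch.Tensor, optional): Pre-generated negative text embeddings. Can be used to easily tweak text inputs (prompt weighting). If not provided, negative_prompt_embeds are generated from the negative_prompt input argument.
negative_encoder_hidden_states (torch.Tensor, optional): Analogous to encoder_hidden_states for the positive prompt.
output_type (str, optional, defaults to "pil"): The output format of the generated image. Choose between PIL.Image or np.array.
return_dict (bool, optional, defaults to True): Whether or not to return an ImagePipelineOutput instead of a plain tuple.
callback (Callable, optional): A function that is called every callback_steps steps during inference. The function is called with the following arguments: callback(step: int, timestep: int, latents: torch.Tensor).
callback_steps (int, optional, defaults to 1): The frequency at which the callback function is called. If not specified, the callback is called at every step.
cross_attention_kwargs (dict, optional): A kwargs dictionary that, if specified, is passed along to the AttentionProcessor as defined in self.processor.
micro_conditioning_aesthetic_score (int, optional, defaults to 6): The targeted aesthetic score according to the laion aesthetic classifier. See https://laion.ai/blog/laion-aesthetics/ and the micro-conditioning section of https://arxiv.org/abs/2307.01952.
micro_conditioning_crop_coord (Tuple[int], optional, defaults to (0, 0)): The targeted height, width crop coordinates. See the micro-conditioning section of https://arxiv.org/abs/2307.01952.
temperature (Union[int, Tuple[int, int], List[int]], optional, defaults to (2, 0)): Configures the temperature scheduler on self.scheduler; see AmusedScheduler#set_timesteps.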
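
The relationship between strength and the number of denoising steps that actually run can be sketched as follows. This assumes the pipeline truncates num_inference_steps * strength to an integer, which is a common convention for image-to-image pipelines but is not confirmed by the parameter description; `effective_steps` is a hypothetical helper.

```python
def effective_steps(num_inference_steps, strength):
    """Estimate how many denoising steps run for a given strength.

    Assumption: the step count is int(num_inference_steps * strength).
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be between 0 and 1, got {strength}")
    return int(num_inference_steps * strength)


# Defaults from the signature above: 12 steps at strength 0.5.
print(effective_steps(12, 0.5))
# strength=1.0 runs the full schedule and essentially ignores the input image.
print(effective_steps(12, 1.0))
```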

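Returns

ImagePipelineOutput or tuple

If return_dict is True, ImagePipelineOutput is returned, otherwise a tuple is returned where the first element is a list with the generated images.

The call function to the pipeline for generation.

Examples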

>>> import torch
>>> from diffusers import AmusedImg2ImgPipeline
>>> from diffusers.utils import load_image

>>> pipe = AmusedImg2ImgPipeline.from_pretrained(
...     "amused/amused-512", variant="fp16", torch_dtype=torch.float16
... )
>>> pipe = pipe.to("cuda")

>>> prompt = "winter mountains"
>>> input_image = (
...     load_image(
...         "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/open_muse/mountains.jpg"
...     )
...     .resize((512, 512))
...     .convert("RGB")
... )
>>> image = pipe(prompt, input_image).images[0]

enable_xformers_memory_efficient_attention

< source >

( attention_op: typing.Optional[typing.Callable] = None )

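Parameters

attention_op (Callable, optional): Overrides the default None operator for use as the op argument to the memory_efficient_attention() function of xFormers.

Enable memory efficient attention from xFormers. When this option is enabled, you should observe lower GPU memory usage and a potential speed up during inference. Speed up during training is not guaranteed.

⚠️ When memory efficient attention and sliced attention are both enabled, memory efficient attention takes precedence.

Examples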

>>> import torch
>>> from diffusers import DiffusionPipeline
>>> from xformers.ops import MemoryEfficientAttentionFlashAttentionOp

>>> pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16)
>>> pipe = pipe.to("cuda")
>>> pipe.enable_xformers_memory_efficient_attention(attention_op=MemoryEfficientAttentionFlashAttentionOp)
>>> # Workaround for not accepting attention shape using VAE for Flash Attention
>>> pipe.vae.enable_xformers_memory_efficient_attention(attention_op=None)

disable_xformers_memory_efficient_attention

< source >

( )

Disable memory efficient attention from xFormers.

class diffusers.AmusedInpaintPipeline

< source >

( vqvae: VQModel tokenizer: CLIPTokenizer text_encoder: CLIPTextModelWithProjection transformer: UVit2DModel scheduler: AmusedScheduler )

__call__

< source >

( prompt: typing.Union[str, typing.List[str], NoneType] = None image: typing.Union[PIL.Image.Image, numpy.ndarray, torch.Tensor, typing.List[PIL.Image.Image], typing.List[numpy.ndarray], typing.List[torch.Tensor]] = None mask_image: typing.Union[PIL.Image.Image, numpy.ndarray, torch.Tensor, typing.List[PIL.Image.Image], typing.List[numpy.ndarray], typing.List[torch.Tensor]] = None strength: float = 1.0 num_inference_steps: int = 12 guidance_scale: float = 10.0 negative_prompt: typing.Union[str, typing.List[str], NoneType] = None num_images_per_prompt: typing.Optional[int] = 1 generator: typing.Optional[torch._C.Generator] = None prompt_embeds: typing.Optional[torch.Tensor] = None encoder_hidden_states: typing.Optional[torch.Tensor] = None negative_prompt_embeds: typing.Optional[torch.Tensor] = None negative_encoder_hidden_states: typing.Optional[torch.Tensor] = None output_type = 'pil' return_dict: bool = True callback: typing.Optional[typing.Callable[[int, int, torch.Tensor], NoneType]] = None callback_steps: int = 1 cross_attention_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None micro_conditioning_aesthetic_score: int = 6 micro_conditioning_crop_coord: typing.Tuple[int, int] = (0, 0) temperature: typing.Union[int, typing.Tuple[int, int], typing.List[int]] = (2, 0) ) → ImagePipelineOutput or tuple

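Parameters

prompt (str or List[str], optional): The prompt or prompts to guide image generation. If not defined, you need to pass prompt_embeds.
image (torch.Tensor, PIL.Image.Image, np.ndarray, List[torch.Tensor], List[PIL.Image.Image], or List[np.ndarray]): Image, numpy array or tensor representing an image batch to be used as the starting point. For both numpy arrays and pytorch tensors, the expected value range is [0, 1]. If it is a tensor or a list of tensors, the expected shape is (B, C, H, W) or (C, H, W). If it is a numpy array or a list of arrays, the expected shape is (B, H, W, C) or (H, W, C). It can also accept image latents as image, but latents passed directly are not encoded again.
mask_image (torch.Tensor, PIL.Image.Image, np.ndarray, List[torch.Tensor], List[PIL.Image.Image], or List[np.ndarray]): Image, numpy array or tensor representing an image batch used to mask image. White pixels in the mask are repainted while black pixels are preserved. If mask_image is a PIL image, it is converted to a single channel (luminance) before use. If it is a numpy array or pytorch tensor, it should contain one color channel (L) instead of 3, so the expected shape for a pytorch tensor is (B, 1, H, W), (B, H, W), (1, H, W), or (H, W), and for a numpy array it is (B, H, W, 1), (B, H, W), (H, W, 1), or (H, W).
strength (float, optional, defaults to 1.0): Indicates the extent to transform the reference image. Must be between 0 and 1. image is used as a starting point, and more noise is added the higher the strength. The number of denoising steps depends on the amount of noise initially added. When strength is 1, added noise is maximum and the denoising process runs for the full number of iterations specified in num_inference_steps. A value of 1 essentially ignores image.
num_inference_steps (int, optional, defaults to 12): The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
guidance_scale (float, optional, defaults to 10.0): A higher guidance scale value encourages the model to generate images closely linked to the text prompt at the expense of lower image quality. Guidance scale is enabled when guidance_scale > 1.
negative_prompt (str or List[str], optional): The prompt or prompts to guide what to not include in image generation. If not defined, you need to pass negative_prompt_embeds instead. Ignored when not using guidance (guidance_scale < 1).
num_images_per_prompt (int, optional, defaults to 1): The number of images to generate per prompt.
generator (torch.Generator, optional): A torch.Generator to make generation deterministic.
prompt_embeds (torch.Tensor, optional): Pre-generated text embeddings. Can be used to easily tweak text inputs (prompt weighting). If not provided, text embeddings are generated from the prompt input argument. A single vector from the pooled and projected final hidden states.
encoder_hidden_states (torch.Tensor, optional): Pre-generated penultimate hidden states from the text encoder, providing additional text conditioning.
negative_prompt_embeds (torch.Tensor, optional): Pre-generated negative text embeddings. Can be used to easily tweak text inputs (prompt weighting). If not provided, negative_prompt_embeds are generated from the negative_prompt input argument.
negative_encoder_hidden_states (torch.Tensor, optional): Analogous to encoder_hidden_states for the positive prompt.
output_type (str, optional, defaults to "pil"): The output format of the generated image. Choose between PIL.Image or np.array.
return_dict (bool, optional, defaults to True): Whether or not to return an ImagePipelineOutput instead of a plain tuple.
callback (Callable, optional): A function that is called every callback_steps steps during inference. The function is called with the following arguments: callback(step: int, timestep: int, latents: torch.Tensor).
callback_steps (int, optional, defaults to 1): The frequency at which the callback function is called. If not specified, the callback is called at every step.
cross_attention_kwargs (dict, optional): A kwargs dictionary that, if specified, is passed along to the AttentionProcessor as defined in self.processor.
micro_conditioning_aesthetic_score (int, optional, defaults to 6): The targeted aesthetic score according to the laion aesthetic classifier. See https://laion.ai/blog/laion-aesthetics/ and the micro-conditioning section of https://arxiv.org/abs/2307.01952.
micro_conditioning_crop_coord (Tuple[int], optional, defaults to (0, 0)): The targeted height, width crop coordinates. See the micro-conditioning section of https://arxiv.org/abs/2307.01952.
temperature (Union[int, Tuple[int, int], List[int]], optional, defaults to (2, 0)): Configures the temperature scheduler on self.scheduler; see AmusedScheduler#set_timesteps.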
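
The shape rules for image and mask_image listed above can be checked before calling the pipeline. The sketch below encodes them as plain functions over shape tuples; `valid_image_shape` and `valid_mask_shape` are hypothetical helpers, and the image check assumes three color channels (RGB), which the text implies but does not state outright.

```python
def valid_image_shape(shape, kind):
    """kind="tensor": (B, C, H, W) or (C, H, W); kind="numpy": (B, H, W, C) or (H, W, C)."""
    if len(shape) not in (3, 4):
        return False
    channels = shape[-3] if kind == "tensor" else shape[-1]
    return channels == 3  # assumption: RGB input


def valid_mask_shape(shape, kind):
    """Accept the single-channel mask layouts listed for mask_image."""
    if len(shape) == 2:  # (H, W)
        return True
    if len(shape) == 3:  # (B, H, W), (1, H, W) or (H, W, 1): all are allowed
        return True
    if len(shape) == 4:  # the channel axis must have size 1
        channels = shape[1] if kind == "tensor" else shape[-1]
        return channels == 1
    return False


print(valid_image_shape((1, 3, 512, 512), "tensor"))
print(valid_mask_shape((1, 3, 512, 512), "tensor"))
```

A three-channel mask fails the check, which mirrors the requirement that masks carry one channel instead of 3.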

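Returns

ImagePipelineOutput or tuple

If return_dict is True, ImagePipelineOutput is returned, otherwise a tuple is returned where the first element is a list with the generated images.

The call function to the pipeline for generation.

Examples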

>>> import torch
>>> from diffusers import AmusedInpaintPipeline
>>> from diffusers.utils import load_image

>>> pipe = AmusedInpaintPipeline.from_pretrained(
...     "amused/amused-512", variant="fp16", torch_dtype=torch.float16
... )
>>> pipe = pipe.to("cuda")

>>> prompt = "fall mountains"
>>> input_image = (
...     load_image(
...         "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/open_muse/mountains_1.jpg"
...     )
...     .resize((512, 512))
...     .convert("RGB")
... )
>>> mask = (
...     load_image(
...         "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/open_muse/mountains_1_mask.png"
...     )
...     .resize((512, 512))
...     .convert("L")
... )
>>> pipe(prompt, input_image, mask).images[0].save("out.png")
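
When no mask file is available, a mask can be built in code. The sketch below follows the convention stated for mask_image (white pixels are repainted, black pixels are preserved) and marks a rectangle for repainting. `rectangle_mask` and `mask_to_pil` are hypothetical helpers, and the box coordinates are illustrative only.

```python
def rectangle_mask(width, height, box):
    """Return rows of 0/255 values; 255 (white) marks pixels to repaint."""
    left, top, right, bottom = box
    return [
        [255 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


def mask_to_pil(rows):
    """Convert the rows into a single-channel ("L") PIL image."""
    from PIL import Image  # imported lazily; requires Pillow

    height, width = len(rows), len(rows[0])
    mask = Image.new("L", (width, height))
    mask.putdata([value for row in rows for value in row])
    return mask


# Repaint the upper-left quadrant of a 512x512 image.
rows = rectangle_mask(512, 512, (0, 0, 256, 256))
```

The result of `mask_to_pil(rows)` can then be passed as the mask argument in place of the downloaded mask in the example above.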

enable_xformers_memory_efficient_attention

< source >

( attention_op: typing.Optional[typing.Callable] = None )

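Parameters

attention_op (Callable, optional): Overrides the default None operator for use as the op argument to the memory_efficient_attention() function of xFormers.

Enable memory efficient attention from xFormers. When this option is enabled, you should observe lower GPU memory usage and a potential speed up during inference. Speed up during training is not guaranteed.

⚠️ When memory efficient attention and sliced attention are both enabled, memory efficient attention takes precedence.

Examples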

>>> import torch
>>> from diffusers import DiffusionPipeline
>>> from xformers.ops import MemoryEfficientAttentionFlashAttentionOp

>>> pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16)
>>> pipe = pipe.to("cuda")
>>> pipe.enable_xformers_memory_efficient_attention(attention_op=MemoryEfficientAttentionFlashAttentionOp)
>>> # Workaround for not accepting attention shape using VAE for Flash Attention
>>> pipe.vae.enable_xformers_memory_efficient_attention(attention_op=None)

disable_xformers_memory_efficient_attention

< source >

( )

Disable memory efficient attention from xFormers.
