Chroma

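Chroma is a text-to-image generation model based on Flux.

The original model checkpoints for Chroma can be found at https://huggingface.co/lodestones/Chroma.

Chroma can use all the same optimizations as Flux.

Inference

The Diffusers version of Chroma is based on the unlocked-v37 version of the original model, which is available in the Chroma repository.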

import torch
from diffusers import ChromaPipeline

pipe = ChromaPipeline.from_pretrained("lodestones/Chroma", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

prompt = [
    "A high-fashion close-up portrait of a blonde woman in clear sunglasses. The image uses a bold teal and red color split for dramatic lighting. The background is a simple teal-green. The photo is sharp and well-composed, and is designed for viewing with anaglyph 3D glasses for optimal effect. It looks professionally done."
]
negative_prompt = ["low quality, ugly, unfinished, out of focus, deformed, disfigure, blurry, smudged, restricted palette, flat colors"]

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    generator=torch.Generator("cpu").manual_seed(433),
    num_inference_steps=40,
    guidance_scale=3.0,
    num_images_per_prompt=1,
).images[0]
image.save("chroma.png")

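Loading from a single file

To use updated model checkpoints that are not in the Diffusers format, you can use the ChromaTransformer2DModel class to load the model from a single file in the original format. This is also useful when loading finetuned or quantized versions of the model released by the community.

The following example demonstrates how to run Chroma from a single file.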

import torch
from diffusers import ChromaTransformer2DModel, ChromaPipeline

model_id = "lodestones/Chroma"
dtype = torch.bfloat16

transformer = ChromaTransformer2DModel.from_single_file("https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v37.safetensors", torch_dtype=dtype)

pipe = ChromaPipeline.from_pretrained(model_id, transformer=transformer, torch_dtype=dtype)
pipe.enable_model_cpu_offload()

prompt = [
    "A high-fashion close-up portrait of a blonde woman in clear sunglasses. The image uses a bold teal and red color split for dramatic lighting. The background is a simple teal-green. The photo is sharp and well-composed, and is designed for viewing with anaglyph 3D glasses for optimal effect. It looks professionally done."
]
negative_prompt = ["low quality, ugly, unfinished, out of focus, deformed, disfigure, blurry, smudged, restricted palette, flat colors"]

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    generator=torch.Generator("cpu").manual_seed(433),
    num_inference_steps=40,
    guidance_scale=3.0,
).images[0]

image.save("chroma-single-file.png")

ChromaPipeline

class diffusers.ChromaPipeline

( scheduler: FlowMatchEulerDiscreteScheduler vae: AutoencoderKL text_encoder: T5EncoderModel tokenizer: T5TokenizerFast transformer: ChromaTransformer2DModel image_encoder: CLIPVisionModelWithProjection = None feature_extractor: CLIPImageProcessor = None )

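Parameters

transformer (ChromaTransformer2DModel) — Conditional Transformer (MMDiT) architecture to denoise the encoded image latents.
scheduler (FlowMatchEulerDiscreteScheduler) — A scheduler to be used in combination with transformer to denoise the encoded image latents.
vae (AutoencoderKL) — Variational Auto-Encoder (VAE) model to encode and decode images to and from latent representations.
text_encoder (T5EncoderModel) — T5, specifically the google/t5-v1_1-xxl variant.
tokenizer (T5TokenizerFast) — Tokenizer of class T5TokenizerFast.

The Chroma pipeline for text-to-image generation.

Reference: https://huggingface.co/lodestones/Chroma/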

__call__

( prompt: typing.Union[str, typing.List[str]] = None negative_prompt: typing.Union[str, typing.List[str]] = None height: typing.Optional[int] = None width: typing.Optional[int] = None num_inference_steps: int = 35 sigmas: typing.Optional[typing.List[float]] = None guidance_scale: float = 5.0 num_images_per_prompt: typing.Optional[int] = 1 generator: typing.Union[torch._C.Generator, typing.List[torch._C.Generator], NoneType] = None latents: typing.Optional[torch.Tensor] = None prompt_embeds: typing.Optional[torch.Tensor] = None ip_adapter_image: typing.Union[PIL.Image.Image, numpy.ndarray, torch.Tensor, typing.List[PIL.Image.Image], typing.List[numpy.ndarray], typing.List[torch.Tensor], NoneType] = None ip_adapter_image_embeds: typing.Optional[typing.List[torch.Tensor]] = None negative_ip_adapter_image: typing.Union[PIL.Image.Image, numpy.ndarray, torch.Tensor, typing.List[PIL.Image.Image], typing.List[numpy.ndarray], typing.List[torch.Tensor], NoneType] = None negative_ip_adapter_image_embeds: typing.Optional[typing.List[torch.Tensor]] = None negative_prompt_embeds: typing.Optional[torch.Tensor] = None prompt_attention_mask: typing.Optional[torch.Tensor] = None negative_prompt_attention_mask: typing.Optional[torch.Tensor] = None output_type: typing.Optional[str] = 'pil' return_dict: bool = True joint_attention_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None callback_on_step_end: typing.Optional[typing.Callable[[int, int, typing.Dict], NoneType]] = None callback_on_step_end_tensor_inputs: typing.List[str] = ['latents'] max_sequence_length: int = 512 ) → ~pipelines.chroma.ChromaPipelineOutput or tuple

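Parameters

prompt (str or List[str], optional) — The prompt or prompts to guide the image generation. If not defined, prompt_embeds must be passed instead.
negative_prompt (str or List[str], optional) — The prompt or prompts not to guide the image generation. If not defined, negative_prompt_embeds must be passed instead. Ignored when guidance is not used (i.e., if guidance_scale is not greater than 1).
height (int, optional, defaults to None) — The height in pixels of the generated image. When not set, this defaults to 1024 for best results.
width (int, optional, defaults to None) — The width in pixels of the generated image. When not set, this defaults to 1024 for best results.
num_inference_steps (int, optional, defaults to 35) — The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
sigmas (List[float], optional) — Custom sigmas to use for the denoising process with schedulers that support a sigmas argument in their set_timesteps method. If not defined, the default behavior when num_inference_steps is passed will be used.
guidance_scale (float, optional, defaults to 5.0) — Guidance scale as defined in Classifier-Free Diffusion Guidance. guidance_scale is defined as w of equation 2 of the Imagen paper. Guidance is enabled by setting guidance_scale > 1. A higher guidance scale encourages the model to generate images closely linked to the text prompt, usually at the expense of lower image quality.
num_images_per_prompt (int, optional, defaults to 1) — The number of images to generate per prompt.
generator (torch.Generator or List[torch.Generator], optional) — One or more torch generators used to make generation deterministic.
latents (torch.Tensor, optional) — Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for image generation. Can be used to tweak the same generation with different prompts. If not provided, a latents tensor is generated by sampling with the supplied random generator.
prompt_embeds (torch.Tensor, optional) — Pre-generated text embeddings. Can be used to easily tweak text inputs, e.g. prompt weighting. If not provided, text embeddings are generated from the prompt input argument.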
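ip_adapter_image (PipelineImageInput, optional) — Optional image input to work with IP Adapters.
ip_adapter_image_embeds (List[torch.Tensor], optional) — Pre-generated image embeddings for IP-Adapter. It should be a list with the same length as the number of IP-Adapters. Each element should be a tensor of shape (batch_size, num_images, emb_dim). If not provided, embeddings are computed from the ip_adapter_image input argument.
negative_ip_adapter_image (PipelineImageInput, optional) — Optional negative image input to work with IP Adapters.
negative_ip_adapter_image_embeds (List[torch.Tensor], optional) — Pre-generated negative image embeddings for IP-Adapter. It should be a list with the same length as the number of IP-Adapters. Each element should be a tensor of shape (batch_size, num_images, emb_dim). If not provided, embeddings are computed from the negative_ip_adapter_image input argument.
negative_prompt_embeds (torch.Tensor, optional) — Pre-generated negative text embeddings. Can be used to easily tweak text inputs, e.g. prompt weighting. If not provided, negative_prompt_embeds are generated from the negative_prompt input argument.
prompt_attention_mask (torch.Tensor, optional) — Attention mask for the prompt embeddings. Used to mask out padding tokens in the prompt sequence. Chroma requires a single padding token to remain unmasked. See https://huggingface.co/lodestones/Chroma#tldr-masking-t5-padding-tokens-enhanced-fidelity-and-increased-stability-during-training
negative_prompt_attention_mask (torch.Tensor, optional) — Attention mask for the negative prompt embeddings. Used to mask out padding tokens in the negative prompt sequence. Chroma requires a single padding token to remain unmasked. See https://huggingface.co/lodestones/Chroma#tldr-masking-t5-padding-tokens-enhanced-fidelity-and-increased-stability-during-training
output_type (str, optional, defaults to "pil") — The output format of the generated image. Choose between PIL.Image.Image and np.array.
return_dict (bool, optional, defaults to True) — Whether or not to return a ~pipelines.chroma.ChromaPipelineOutput instead of a plain tuple.
joint_attention_kwargs (dict, optional) — A kwargs dictionary that, if specified, is passed along to the AttentionProcessor as defined under self.processor in diffusers.models.attention_processor.
callback_on_step_end (Callable, optional) — A function called at the end of each denoising step during inference. The function is called with the following arguments: callback_on_step_end(self: DiffusionPipeline, step: int, timestep: int, callback_kwargs: Dict). callback_kwargs will include a list of all tensors as specified by callback_on_step_end_tensor_inputs.
callback_on_step_end_tensor_inputs (List, optional) — The list of tensor inputs for the callback_on_step_end function. The tensors specified in the list will be passed as the callback_kwargs argument. You can only include variables listed in the ._callback_tensor_inputs attribute of your pipeline class.
max_sequence_length (int, defaults to 512) — Maximum sequence length to use with the prompt.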

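Returns

~pipelines.chroma.ChromaPipelineOutput or tuple

~pipelines.chroma.ChromaPipelineOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is a list with the generated images.

Function invoked when calling the pipeline for generation.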
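
The guidance_scale parameter can be read as the weight w in the usual classifier-free guidance combination of a prompt-conditioned prediction and a negative-prompt prediction. The sketch below shows only that arithmetic on plain Python lists. The function name apply_cfg and the toy numbers are illustrative assumptions, and the pipeline itself works on tensors internally.

```python
from typing import List


def apply_cfg(cond: List[float], uncond: List[float], guidance_scale: float) -> List[float]:
    """Blend predictions as uncond + w * (cond - uncond), with w = guidance_scale."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


# Toy predictions standing in for the model outputs (hypothetical values).
cond = [1.0, 2.0]
uncond = [0.0, 1.0]

no_guidance = apply_cfg(cond, uncond, 1.0)  # w = 1 reproduces the conditional prediction
guided = apply_cfg(cond, uncond, 5.0)       # w > 1 pushes further away from the negative prompt
```

With w = 1 the negative prediction cancels out, which is why negative_prompt has no effect unless guidance_scale is greater than 1.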

Examples

>>> import torch
>>> from diffusers import ChromaPipeline, ChromaTransformer2DModel

>>> model_id = "lodestones/Chroma"
>>> ckpt_path = "https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v37.safetensors"
>>> transformer = ChromaTransformer2DModel.from_single_file(ckpt_path, torch_dtype=torch.bfloat16)
>>> pipe = ChromaPipeline.from_pretrained(
...     model_id,
...     transformer=transformer,
...     torch_dtype=torch.bfloat16,
... )
>>> pipe.enable_model_cpu_offload()
>>> prompt = [
...     "A high-fashion close-up portrait of a blonde woman in clear sunglasses. The image uses a bold teal and red color split for dramatic lighting. The background is a simple teal-green. The photo is sharp and well-composed, and is designed for viewing with anaglyph 3D glasses for optimal effect. It looks professionally done."
... ]
>>> negative_prompt = [
...     "low quality, ugly, unfinished, out of focus, deformed, disfigure, blurry, smudged, restricted palette, flat colors"
... ]
>>> image = pipe(prompt, negative_prompt=negative_prompt).images[0]
>>> image.save("chroma.png")
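
As a rough illustration of the callback_on_step_end parameter, the sketch below defines a callback with the documented argument order (pipeline, step, timestep, callback_kwargs). The names step_log and log_step are hypothetical. Returning callback_kwargs unchanged is an assumption based on how Diffusers pipelines generally consume the callback result, so check it against the version you use.

```python
from typing import Any, Dict, List

# Collects the index of every completed denoising step (illustrative only).
step_log: List[int] = []


def log_step(pipe: Any, step: int, timestep: int, callback_kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Record the finished step and hand the requested tensors back unchanged."""
    step_log.append(step)
    return callback_kwargs
```

It would then be passed as pipe(prompt, callback_on_step_end=log_step, callback_on_step_end_tensor_inputs=["latents"]), in which case callback_kwargs contains the "latents" tensor at each step.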

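disable_vae_slicing

( )

Disable sliced VAE decoding. If enable_vae_slicing was previously enabled, this method will go back to computing decoding in one step.

disable_vae_tiling

( )

Disable tiled VAE decoding. If enable_vae_tiling was previously enabled, this method will go back to computing decoding in one step.

enable_vae_slicing

( )

Enable sliced VAE decoding. When this option is enabled, the VAE splits the input tensor into slices and computes decoding in several steps. This is useful to save some memory and allow larger batch sizes.

enable_vae_tiling

( )

Enable tiled VAE decoding. When this option is enabled, the VAE splits the input tensor into tiles and computes decoding and encoding in several steps. This is useful for saving a large amount of memory and for processing larger images.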
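
The four methods above are simple on/off switches, so they can be wrapped in one small helper. This is a minimal sketch: set_vae_memory_options is a hypothetical name, and pipe is assumed to be any object exposing the enable/disable methods documented here, such as a loaded ChromaPipeline.

```python
from typing import Any


def set_vae_memory_options(pipe: Any, slicing: bool = True, tiling: bool = False) -> None:
    """Toggle sliced and tiled VAE decoding on a pipeline via its documented methods."""
    if slicing:
        pipe.enable_vae_slicing()   # decode the batch slice by slice
    else:
        pipe.disable_vae_slicing()  # back to one-step decoding
    if tiling:
        pipe.enable_vae_tiling()    # decode large images tile by tile
    else:
        pipe.disable_vae_tiling()   # back to one-step decoding
```

Slicing mainly helps with larger batches, while tiling targets larger images, so the two flags are kept independent.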

encode_prompt

( prompt: typing.Union[str, typing.List[str]] negative_prompt: typing.Union[str, typing.List[str]] = None device: typing.Optional[torch.device] = None num_images_per_prompt: int = 1 prompt_embeds: typing.Optional[torch.Tensor] = None negative_prompt_embeds: typing.Optional[torch.Tensor] = None prompt_attention_mask: typing.Optional[torch.Tensor] = None negative_prompt_attention_mask: typing.Optional[torch.Tensor] = None do_classifier_free_guidance: bool = True max_sequence_length: int = 512 lora_scale: typing.Optional[float] = None )

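Parameters

prompt (str or List[str], optional) — The prompt to be encoded.
negative_prompt (str or List[str], optional) — The prompt or prompts not to guide the image generation. If not defined, negative_prompt_embeds must be passed instead. Ignored when guidance is not used (i.e., if guidance_scale is not greater than 1).
device (torch.device) — The torch device.
num_images_per_prompt (int) — The number of images that should be generated per prompt.
prompt_embeds (torch.Tensor, optional) — Pre-generated text embeddings. Can be used to easily tweak text inputs, e.g. prompt weighting. If not provided, text embeddings are generated from the prompt input argument.
lora_scale (float, optional) — A LoRA scale that is applied to all LoRA layers of the text encoder if LoRA layers are loaded.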
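
The prompt_attention_mask and negative_prompt_attention_mask arguments follow the rule stated for __call__: padding tokens are masked, except that a single padding token stays unmasked. The sketch below illustrates that rule on a plain Python list. The helper name keep_one_pad_unmasked is hypothetical, and it assumes a right-padded tokenizer mask (1 for real tokens, 0 for padding); it is not the pipeline's own implementation.

```python
from typing import List


def keep_one_pad_unmasked(attention_mask: List[int]) -> List[int]:
    """Return a mask covering the real tokens plus the first padding token.

    Assumes right padding, i.e. all 1s come before all 0s.
    """
    num_tokens = sum(attention_mask)
    # Positions 0..num_tokens stay visible: the real tokens and one padding token.
    return [1 if i <= num_tokens else 0 for i in range(len(attention_mask))]


tokenizer_mask = [1, 1, 1, 0, 0, 0]  # three real tokens, three padding tokens
chroma_mask = keep_one_pad_unmasked(tokenizer_mask)
```

When precomputed prompt_embeds are passed to the pipeline, a mask built this way would be supplied alongside them, converted to a tensor with a batch dimension.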

ChromaImg2ImgPipeline

class diffusers.ChromaImg2ImgPipeline

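Parameters

transformer (ChromaTransformer2DModel) — Conditional Transformer (MMDiT) architecture to denoise the encoded image latents.
scheduler (FlowMatchEulerDiscreteScheduler) — A scheduler to be used in combination with transformer to denoise the encoded image latents.
vae (AutoencoderKL) — Variational Auto-Encoder (VAE) model to encode and decode images to and from latent representations.
text_encoder (T5EncoderModel) — T5, specifically the google/t5-v1_1-xxl variant.
tokenizer (T5TokenizerFast) — Tokenizer of class T5TokenizerFast.

The Chroma pipeline for image-to-image generation.

Reference: https://huggingface.co/lodestones/Chroma/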

__call__

( prompt: typing.Union[str, typing.List[str]] = None negative_prompt: typing.Union[str, typing.List[str]] = None image: typing.Union[PIL.Image.Image, numpy.ndarray, torch.Tensor, typing.List[PIL.Image.Image], typing.List[numpy.ndarray], typing.List[torch.Tensor]] = None height: typing.Optional[int] = None width: typing.Optional[int] = None num_inference_steps: int = 35 sigmas: typing.Optional[typing.List[float]] = None guidance_scale: float = 5.0 strength: float = 0.9 num_images_per_prompt: typing.Optional[int] = 1 generator: typing.Union[torch._C.Generator, typing.List[torch._C.Generator], NoneType] = None latents: typing.Optional[torch.Tensor] = None prompt_embeds: typing.Optional[torch.Tensor] = None ip_adapter_image: typing.Union[PIL.Image.Image, numpy.ndarray, torch.Tensor, typing.List[PIL.Image.Image], typing.List[numpy.ndarray], typing.List[torch.Tensor], NoneType] = None ip_adapter_image_embeds: typing.Optional[typing.List[torch.Tensor]] = None negative_ip_adapter_image: typing.Union[PIL.Image.Image, numpy.ndarray, torch.Tensor, typing.List[PIL.Image.Image], typing.List[numpy.ndarray], typing.List[torch.Tensor], NoneType] = None negative_ip_adapter_image_embeds: typing.Optional[typing.List[torch.Tensor]] = None negative_prompt_embeds: typing.Optional[torch.Tensor] = None prompt_attention_mask: typing.Optional[torch.Tensor] = None negative_prompt_attention_mask: typing.Optional[torch.Tensor] = None output_type: typing.Optional[str] = 'pil' return_dict: bool = True joint_attention_kwargs: typing.Optional[typing.Dict[str, typing.Any]] = None callback_on_step_end: typing.Optional[typing.Callable[[int, int, typing.Dict], NoneType]] = None callback_on_step_end_tensor_inputs: typing.List[str] = ['latents'] max_sequence_length: int = 512 ) → ~pipelines.chroma.ChromaPipelineOutput or tuple

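Parameters

prompt (str or List[str], optional) — The prompt or prompts to guide the image generation. If not defined, prompt_embeds must be passed instead.
negative_prompt (str or List[str], optional) — The prompt or prompts not to guide the image generation. If not defined, negative_prompt_embeds must be passed instead. Ignored when guidance is not used (i.e., if guidance_scale is not greater than 1).
height (int, optional, defaults to None) — The height in pixels of the generated image. When not set, this defaults to 1024 for best results.
width (int, optional, defaults to None) — The width in pixels of the generated image. When not set, this defaults to 1024 for best results.
num_inference_steps (int, optional, defaults to 35) — The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.
sigmas (List[float], optional) — Custom sigmas to use for the denoising process with schedulers that support a sigmas argument in their set_timesteps method. If not defined, the default behavior when num_inference_steps is passed will be used.
guidance_scale (float, optional, defaults to 5.0) — Guidance scale as defined in Classifier-Free Diffusion Guidance. guidance_scale is defined as w of equation 2 of the Imagen paper. Guidance is enabled by setting guidance_scale > 1. A higher guidance scale encourages the model to generate images closely linked to the text prompt, usually at the expense of lower image quality.
strength (float, optional, defaults to 0.9) — Conceptually, indicates how much to transform the reference image. Must be between 0 and 1. The image is used as a starting point, and more noise is added to it the larger the strength. The number of denoising steps depends on the amount of noise initially added. When strength is 1, the added noise is maximal and the denoising process runs for the full number of iterations specified in num_inference_steps. A value of 1 therefore essentially ignores the image.
num_images_per_prompt (int, optional, defaults to 1) — The number of images to generate per prompt.
generator (torch.Generator or List[torch.Generator], optional) — One or more torch generators used to make generation deterministic.
latents (torch.Tensor, optional) — Pre-generated noisy latents, sampled from a Gaussian distribution, to be used as inputs for image generation. Can be used to tweak the same generation with different prompts. If not provided, a latents tensor is generated by sampling with the supplied random generator.
prompt_embeds (torch.Tensor, optional) — Pre-generated text embeddings. Can be used to easily tweak text inputs, e.g. prompt weighting. If not provided, text embeddings are generated from the prompt input argument.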
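ip_adapter_image (PipelineImageInput, optional) — Optional image input to work with IP Adapters.
ip_adapter_image_embeds (List[torch.Tensor], optional) — Pre-generated image embeddings for IP-Adapter. It should be a list with the same length as the number of IP-Adapters. Each element should be a tensor of shape (batch_size, num_images, emb_dim). If not provided, embeddings are computed from the ip_adapter_image input argument.
negative_ip_adapter_image (PipelineImageInput, optional) — Optional negative image input to work with IP Adapters.
negative_ip_adapter_image_embeds (List[torch.Tensor], optional) — Pre-generated negative image embeddings for IP-Adapter. It should be a list with the same length as the number of IP-Adapters. Each element should be a tensor of shape (batch_size, num_images, emb_dim). If not provided, embeddings are computed from the negative_ip_adapter_image input argument.
negative_prompt_embeds (torch.Tensor, optional) — Pre-generated negative text embeddings. Can be used to easily tweak text inputs, e.g. prompt weighting. If not provided, negative_prompt_embeds are generated from the negative_prompt input argument.
prompt_attention_mask (torch.Tensor, optional) — Attention mask for the prompt embeddings. Used to mask out padding tokens in the prompt sequence. Chroma requires a single padding token to remain unmasked. See https://huggingface.co/lodestones/Chroma#tldr-masking-t5-padding-tokens-enhanced-fidelity-and-increased-stability-during-training
negative_prompt_attention_mask (torch.Tensor, optional) — Attention mask for the negative prompt embeddings. Used to mask out padding tokens in the negative prompt sequence. Chroma requires a single padding token to remain unmasked. See https://huggingface.co/lodestones/Chroma#tldr-masking-t5-padding-tokens-enhanced-fidelity-and-increased-stability-during-training
output_type (str, optional, defaults to "pil") — The output format of the generated image. Choose between PIL.Image.Image and np.array.
return_dict (bool, optional, defaults to True) — Whether or not to return a ~pipelines.chroma.ChromaPipelineOutput instead of a plain tuple.
joint_attention_kwargs (dict, optional) — A kwargs dictionary that, if specified, is passed along to the AttentionProcessor as defined under self.processor in diffusers.models.attention_processor.
callback_on_step_end (Callable, optional) — A function called at the end of each denoising step during inference. The function is called with the following arguments: callback_on_step_end(self: DiffusionPipeline, step: int, timestep: int, callback_kwargs: Dict). callback_kwargs will include a list of all tensors as specified by callback_on_step_end_tensor_inputs.
callback_on_step_end_tensor_inputs (List, optional) — The list of tensor inputs for the callback_on_step_end function. The tensors specified in the list will be passed as the callback_kwargs argument. You can only include variables listed in the ._callback_tensor_inputs attribute of your pipeline class.
max_sequence_length (int, defaults to 512) — Maximum sequence length to use with the prompt.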

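Returns

~pipelines.chroma.ChromaPipelineOutput or tuple

~pipelines.chroma.ChromaPipelineOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is a list with the generated images.

Function invoked when calling the pipeline for generation.

Examples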

>>> import torch
>>> from diffusers import ChromaTransformer2DModel, ChromaImg2ImgPipeline
>>> from diffusers.utils import load_image

>>> model_id = "lodestones/Chroma"
>>> ckpt_path = "https://huggingface.co/lodestones/Chroma/blob/main/chroma-unlocked-v37.safetensors"
>>> # Load the transformer from the single-file checkpoint so it can be passed to the pipeline
>>> transformer = ChromaTransformer2DModel.from_single_file(ckpt_path, torch_dtype=torch.bfloat16)
>>> pipe = ChromaImg2ImgPipeline.from_pretrained(
...     model_id,
...     transformer=transformer,
...     torch_dtype=torch.bfloat16,
... )
>>> pipe.enable_model_cpu_offload()
>>> init_image = load_image(
...     "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
... )
>>> prompt = "a scenic fantasy landscape with a river and mountains in the background, vibrant colors, detailed, high resolution"
>>> negative_prompt = "low quality, ugly, unfinished, out of focus, deformed, disfigure, blurry, smudged, restricted palette, flat colors"
>>> image = pipe(prompt, image=init_image, negative_prompt=negative_prompt).images[0]
>>> image.save("chroma-img2img.png")
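
The strength parameter determines how many of the requested denoising steps actually run. The sketch below works through that arithmetic using the convention commonly found in Diffusers image-to-image pipelines. Whether ChromaImg2ImgPipeline rounds in exactly this way is an assumption, and img2img_steps is a hypothetical helper rather than part of the library.

```python
def img2img_steps(num_inference_steps: int, strength: float) -> int:
    """Estimate how many denoising steps run for a given strength in [0, 1]."""
    # Portion of the schedule covered by the noise added to the input image.
    init_timestep = min(num_inference_steps * strength, num_inference_steps)
    # Steps skipped at the start of the schedule.
    t_start = int(max(num_inference_steps - init_timestep, 0))
    return num_inference_steps - t_start


default_steps = img2img_steps(35, 0.9)  # the documented defaults
full_steps = img2img_steps(40, 1.0)     # strength 1 runs the full schedule
half_steps = img2img_steps(40, 0.5)     # strength 0.5 runs about half of it
```

Lower strength therefore both preserves more of the input image and shortens inference, since fewer steps are executed.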

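disable_vae_slicing

( )

Disable sliced VAE decoding. If enable_vae_slicing was previously enabled, this method will go back to computing decoding in one step.

disable_vae_tiling

( )

Disable tiled VAE decoding. If enable_vae_tiling was previously enabled, this method will go back to computing decoding in one step.

enable_vae_slicing

( )

Enable sliced VAE decoding. When this option is enabled, the VAE splits the input tensor into slices and computes decoding in several steps. This is useful to save some memory and allow larger batch sizes.

enable_vae_tiling

( )

Enable tiled VAE decoding. When this option is enabled, the VAE splits the input tensor into tiles and computes decoding and encoding in several steps. This is useful for saving a large amount of memory and for processing larger images.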

encode_prompt

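Parameters

prompt (str or List[str], optional) — The prompt to be encoded.
negative_prompt (str or List[str], optional) — The prompt or prompts not to guide the image generation. If not defined, negative_prompt_embeds must be passed instead. Ignored when guidance is not used (i.e., if guidance_scale is not greater than 1).
device (torch.device) — The torch device.
num_images_per_prompt (int) — The number of images that should be generated per prompt.
prompt_embeds (torch.Tensor, optional) — Pre-generated text embeddings. Can be used to easily tweak text inputs, e.g. prompt weighting. If not provided, text embeddings are generated from the prompt input argument.
lora_scale (float, optional) — A LoRA scale that is applied to all LoRA layers of the text encoder if LoRA layers are loaded.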
