T2I-Adapter
T2I-Adapter is an adapter that enables controllable generation similar to ControlNet. It works by learning a *mapping* between a control signal (for example, a depth map) and the pretrained model's internal knowledge. The adapter is plugged into the base model to provide extra guidance based on the control signal during generation.
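To make that idea concrete, here is a minimal, hypothetical sketch in plain PyTorch. The tensor shapes and the inject helper are illustrative assumptions, not the actual diffusers internals: the adapter encodes the control image into multi-scale feature maps, and each map is added to the UNet activation at the matching resolution during denoising.

import torch

# Hypothetical adapter output: multi-scale feature maps derived from a
# control image, shaped to match the UNet's intermediate activations.
adapter_features = [
    torch.randn(1, 320, 128, 128),
    torch.randn(1, 640, 64, 64),
    torch.randn(1, 1280, 32, 32),
]

# Illustrative injection: each control feature map is summed into the UNet
# activation of the same shape, steering generation toward the control.
def inject(unet_activation: torch.Tensor, adapter_feature: torch.Tensor) -> torch.Tensor:
    return unet_activation + adapter_feature

guided = [inject(torch.randn_like(f), f) for f in adapter_features]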
Load a T2I-Adapter conditioned on a specific control, such as canny edge, and pass it to the pipeline in from_pretrained().
import torch
from diffusers import T2IAdapter, StableDiffusionXLAdapterPipeline, AutoencoderKL

# load a T2I-Adapter trained on canny edge conditioning
t2i_adapter = T2IAdapter.from_pretrained(
    "TencentARC/t2i-adapter-canny-sdxl-1.0",
    torch_dtype=torch.float16,
)
Generate a canny image with opencv-python.
import cv2
import numpy as np
from PIL import Image
from diffusers.utils import load_image

original_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/non-enhanced-prompt.png"
)
image = np.array(original_image)

# detect edges, then stack the single channel to three so the
# result is a regular RGB image
low_threshold = 100
high_threshold = 200
image = cv2.Canny(image, low_threshold, high_threshold)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)
Pass the canny image to the pipeline to generate an image.
# the fp16-fix VAE avoids numerical issues when running SDXL's VAE in float16
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipeline = StableDiffusionXLAdapterPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    adapter=t2i_adapter,
    vae=vae,
    torch_dtype=torch.float16,
).to("cuda")

prompt = """
A photorealistic overhead image of a cat reclining sideways in a flamingo pool floatie holding a margarita.
The cat is floating leisurely in the pool and completely relaxed and happy.
"""
pipeline(
    prompt,
    image=canny_image,
    num_inference_steps=100,
    guidance_scale=10,
).images[0]
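The call returns a list of PIL images. As a quick follow-up sketch, adapter_conditioning_scale scales the adapter's influence before its features are added, so lowering it gives the base model more freedom relative to the canny edges. The 0.8 value and the output filename below are illustrative assumptions, not values from this guide.

# illustrative: loosen the edge guidance and save the result
image = pipeline(
    prompt,
    image=canny_image,
    num_inference_steps=100,
    guidance_scale=10,
    adapter_conditioning_scale=0.8,  # 1.0 is full adapter strength
).images[0]
image.save("canny_cat.png")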



MultiAdapter
You can compose multiple controls, such as a canny image and a depth map, with the MultiAdapter class. The example below combines a canny image with a depth map.
Load the control images and T2I-Adapters as a list.
import torch
from diffusers.utils import load_image
from diffusers import StableDiffusionXLAdapterPipeline, AutoencoderKL, MultiAdapter, T2IAdapter

canny_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/canny-cat.png"
)
depth_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl_depth_image.png"
)
controls = [canny_image, depth_image]
prompt = ["""
a relaxed rabbit sitting on a striped towel next to a pool with a tropical drink nearby,
bright sunny day, vacation scene, 35mm photograph, film, professional, 4k, highly detailed
"""]

# one adapter per control, in the same order as the control images
adapters = MultiAdapter(
    [
        T2IAdapter.from_pretrained("TencentARC/t2i-adapter-canny-sdxl-1.0", torch_dtype=torch.float16),
        T2IAdapter.from_pretrained("TencentARC/t2i-adapter-depth-midas-sdxl-1.0", torch_dtype=torch.float16),
    ]
)
Pass the adapters, prompt, and control images to StableDiffusionXLAdapterPipeline. Use the adapter_conditioning_scale parameter to set how much weight each control gets.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipeline = StableDiffusionXLAdapterPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    vae=vae,
    adapter=adapters,
).to("cuda")

pipeline(
    prompt,
    image=controls,
    height=1024,
    width=1024,
    # one weight per control, matching the order of `controls`
    adapter_conditioning_scale=[0.7, 0.7],
).images[0]
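The depth map above is downloaded ready-made. If you want to produce your own, one option (an assumption of this sketch, not something this guide prescribes) is the depth-estimation pipeline from transformers with a DPT model such as Intel/dpt-large. The import is aliased to avoid shadowing the diffusers pipeline variable defined earlier.

from diffusers.utils import load_image
from transformers import pipeline as hf_pipeline

# sketch: estimate a depth map from any source image with a DPT model
depth_estimator = hf_pipeline("depth-estimation", model="Intel/dpt-large")
source = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/canny-cat.png"
)
depth_image = depth_estimator(source)["depth"]  # a PIL image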


