Optimum documentation

Generating images with Diffusion models

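You are viewing the main version, which requires installation from source. If you would like a regular pip install, check out the latest stable version (v1.24.0).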

Stable Diffusion

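Stable Diffusion models can also be used when running inference with OpenVINO. When a Stable Diffusion model is exported to the OpenVINO format, it is decomposed into separate components that are combined again during inference: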

  • Text encoder
  • U-Net
  • VAE encoder
  • VAE decoder

Task             Auto Class
text-to-image    OVStableDiffusionPipeline
image-to-image   OVStableDiffusionImg2ImgPipeline
inpaint          OVStableDiffusionInpaintPipeline
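
The task-to-class table above can be turned into a small lookup so that a script picks the pipeline class from a task name. This is only a minimal sketch: `SD_PIPELINES`, `resolve_pipeline_name` and `load_pipeline_class` are hypothetical names introduced here and are not part of optimum-intel. Only the class names come from the table.

```python
import importlib

# Hypothetical lookup built from the table above. The class names are the ones
# listed in this document; the helpers themselves are not part of optimum-intel.
SD_PIPELINES = {
    "text-to-image": "OVStableDiffusionPipeline",
    "image-to-image": "OVStableDiffusionImg2ImgPipeline",
    "inpaint": "OVStableDiffusionInpaintPipeline",
}


def resolve_pipeline_name(task: str) -> str:
    """Return the pipeline class name registered for `task`."""
    try:
        return SD_PIPELINES[task]
    except KeyError:
        supported = ", ".join(sorted(SD_PIPELINES))
        raise ValueError(f"Unsupported task {task!r}; expected one of: {supported}") from None


def load_pipeline_class(task: str):
    """Import optimum.intel lazily and return the class for `task`."""
    module = importlib.import_module("optimum.intel")
    return getattr(module, resolve_pipeline_name(task))
```

Resolving the name separately from the import keeps the lookup usable, and testable, on machines where optimum-intel is not installed.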

Text-to-Image

Here is an example of how to load an OpenVINO Stable Diffusion model and run inference with OpenVINO Runtime:

from optimum.intel import OVStableDiffusionPipeline

model_id = "echarlaix/stable-diffusion-v1-5-openvino"
pipeline = OVStableDiffusionPipeline.from_pretrained(model_id)
prompt = "sailing ship in storm by Rembrandt"
images = pipeline(prompt).images

To load your PyTorch model and convert it to OpenVINO on the fly, set export=True:

model_id = "runwayml/stable-diffusion-v1-5"
pipeline = OVStableDiffusionPipeline.from_pretrained(model_id, export=True)
# Don't forget to save the exported model
pipeline.save_pretrained("openvino-sd-v1-5")
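
Since the export only needs to happen once, a script can check for a previously saved export before asking for `export=True`. The sketch below assumes that `save_pretrained` leaves a `model_index.json` file in the target directory, as diffusers-style pipelines normally do. `choose_source` and `load_or_export` are hypothetical helpers, not optimum-intel functions.

```python
from pathlib import Path
from typing import Tuple


def choose_source(local_dir: str, hub_id: str) -> Tuple[str, bool]:
    """Return (model_id, export) to pass to from_pretrained.

    Assumption: a finished export leaves a model_index.json in `local_dir`.
    """
    if (Path(local_dir) / "model_index.json").is_file():
        return local_dir, False  # reuse the saved OpenVINO export
    return hub_id, True  # convert the PyTorch checkpoint on the fly


def load_or_export(local_dir: str, hub_id: str):
    """Load the saved export if present, otherwise export it and save it."""
    from optimum.intel import OVStableDiffusionPipeline  # imported lazily

    model_id, export = choose_source(local_dir, hub_id)
    pipeline = OVStableDiffusionPipeline.from_pretrained(model_id, export=export)
    if export:
        pipeline.save_pretrained(local_dir)
    return pipeline
```

With the names used above, the call would be `load_or_export("openvino-sd-v1-5", "runwayml/stable-diffusion-v1-5")`. The first run exports and saves, and later runs load the saved directory directly.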

To further speed up inference, the model can be statically reshaped:

# Define the shapes related to the inputs and desired outputs
batch_size, num_images, height, width = 1, 1, 512, 512
# Statically reshape the model
pipeline.reshape(batch_size=batch_size, height=height, width=width, num_images_per_prompt=num_images)
# Compile the model before the first inference
pipeline.compile()

# Run inference
images = pipeline(prompt, height=height, width=width, num_images_per_prompt=num_images).images

If you want to change any of these parameters, such as the output height or width, you need to statically reshape your model again.
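
One way to avoid forgetting that step is to track the shape the pipeline was last reshaped for, and to reshape and recompile only when a request differs. The sketch below is illustrative: `StaticShape`, `needs_reshape` and `ensure_shape` are hypothetical names, and the default `vae_scale_factor` of 8 is an assumption that matches Stable Diffusion v1.5 but may differ for other models. Only `pipeline.reshape(...)` and `pipeline.compile()` come from the example above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StaticShape:
    """Generation parameters a pipeline was statically reshaped for."""

    batch_size: int = 1
    num_images_per_prompt: int = 1
    height: int = 512
    width: int = 512

    def latent_size(self, vae_scale_factor: int = 8) -> Tuple[int, int]:
        """Latent (height, width), assuming the VAE downsamples by `vae_scale_factor`."""
        return self.height // vae_scale_factor, self.width // vae_scale_factor


def needs_reshape(current: StaticShape, requested: StaticShape) -> bool:
    """True when the request differs from the shape the model was reshaped for."""
    return current != requested


def ensure_shape(pipeline, current: StaticShape, requested: StaticShape) -> StaticShape:
    """Reshape and recompile only when the requested shape changed."""
    if needs_reshape(current, requested):
        pipeline.reshape(
            batch_size=requested.batch_size,
            height=requested.height,
            width=requested.width,
            num_images_per_prompt=requested.num_images_per_prompt,
        )
        pipeline.compile()
    return requested
```

The caller keeps the returned `StaticShape` and passes it as `current` on the next request, so repeated requests with the same dimensions skip the reshape and compile steps.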

Text-to-Image with Textual Inversion

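Here is an example of how to load an OpenVINO Stable Diffusion model with pre-trained textual inversion embeddings and run inference with OpenVINO Runtime.

First, you can run the original pipeline without textual inversion: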

from optimum.intel import OVStableDiffusionPipeline
import numpy as np

model_id = "echarlaix/stable-diffusion-v1-5-openvino"
prompt = "A <cat-toy> back-pack"
# Set a random seed for better comparison
np.random.seed(42)

pipeline = OVStableDiffusionPipeline.from_pretrained(model_id, export=False, compile=False)
pipeline.compile()
image1 = pipeline(prompt, num_inference_steps=50).images[0]
image1.save("stable_diffusion_v1_5_without_textual_inversion.png")

Then, you can load the sd-concepts-library/cat-toy textual inversion embedding and run the pipeline again with the same prompt:

# Reset stable diffusion pipeline
pipeline.clear_requests()

# Load textual inversion into stable diffusion pipeline
pipeline.load_textual_inversion("sd-concepts-library/cat-toy", "<cat-toy>")

# Compile the model before the first inference
pipeline.compile()
image2 = pipeline(prompt, num_inference_steps=50).images[0]
image2.save("stable_diffusion_v1_5_with_textual_inversion.png")

The left image shows the output of the original Stable Diffusion v1.5, and the right image shows the output of Stable Diffusion v1.5 with textual inversion.
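
The left/right comparison described above can be rebuilt from the two generated images by pasting them next to each other. This is a small sketch that assumes Pillow is installed. `side_by_side_layout` and `make_comparison` are hypothetical helpers, and the output file name in the usage note is only an example.

```python
def side_by_side_layout(left_size, right_size):
    """Return (canvas_size, left_offset, right_offset) for two (width, height) images."""
    left_w, left_h = left_size
    right_w, right_h = right_size
    canvas_size = (left_w + right_w, max(left_h, right_h))
    return canvas_size, (0, 0), (left_w, 0)


def make_comparison(left, right):
    """Paste two PIL images next to each other on a new RGB canvas."""
    from PIL import Image  # imported lazily so the layout helper has no dependencies

    canvas_size, left_at, right_at = side_by_side_layout(left.size, right.size)
    canvas = Image.new("RGB", canvas_size)
    canvas.paste(left, left_at)
    canvas.paste(right, right_at)
    return canvas
```

With the variables from the snippets above, `make_comparison(image1, image2).save("comparison.png")` would write the combined image.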

Image-to-Image

import requests
from PIL import Image
from io import BytesIO
from optimum.intel import OVStableDiffusionImg2ImgPipeline

model_id = "runwayml/stable-diffusion-v1-5"
pipeline = OVStableDiffusionImg2ImgPipeline.from_pretrained(model_id, export=True)

url = "https://raw.githubusercontent.com/CompVis/stable-diffusion/main/assets/stable-samples/img2img/sketch-mountains-input.jpg"
response = requests.get(url)
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((768, 512))
prompt = "A fantasy landscape, trending on artstation"
image = pipeline(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5).images[0]
image.save("fantasy_landscape.png")
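
The `strength` argument controls how much of the denoising schedule is applied to the input image. As a rough sketch, and assuming the OpenVINO pipeline follows the same rule as the diffusers image-to-image pipelines (the number of steps actually run is `int(num_inference_steps * strength)`, capped at `num_inference_steps`), the hypothetical helper below estimates that count.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Estimate how many denoising steps an img2img call runs.

    Assumption: the scheduler skips the first part of the schedule, so only
    int(num_inference_steps * strength) steps are executed.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")
    if num_inference_steps < 0:
        raise ValueError("num_inference_steps must be non-negative")
    return min(int(num_inference_steps * strength), num_inference_steps)
```

Under that assumption, a call with 50 inference steps and `strength=0.75` as in the example above would run 37 denoising steps. A lower `strength` keeps the result closer to the input image and costs fewer steps.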

Stable Diffusion XL

Task             Auto Class
text-to-image    OVStableDiffusionXLPipeline
image-to-image   OVStableDiffusionXLImg2ImgPipeline

Text-to-Image

Here is an example of how to load an SDXL OpenVINO model from stabilityai/stable-diffusion-xl-base-1.0 and run inference with OpenVINO Runtime:

from optimum.intel import OVStableDiffusionXLPipeline

model_id = "stabilityai/stable-diffusion-xl-base-1.0"
base = OVStableDiffusionXLPipeline.from_pretrained(model_id)
prompt = "train station by Caspar David Friedrich"
image = base(prompt).images[0]
image.save("train_station.png")

Text-to-Image with Textual Inversion

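Here is an example of how to load an SDXL OpenVINO model from stabilityai/stable-diffusion-xl-base-1.0 with pre-trained textual inversion embeddings and run inference with OpenVINO Runtime.

First, you can run the original pipeline without textual inversion: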

from optimum.intel import OVStableDiffusionXLPipeline
import numpy as np

model_id = "stabilityai/stable-diffusion-xl-base-1.0"
prompt = "charturnerv2, multiple views of the same character in the same outfit, a character turnaround wearing a red jacket and black shirt, best quality, intricate details."
# Set a random seed for better comparison
np.random.seed(112)

base = OVStableDiffusionXLPipeline.from_pretrained(model_id, export=False, compile=False)
base.compile()
image1 = base(prompt, num_inference_steps=50).images[0]
image1.save("sdxl_without_textual_inversion.png")

Then, you can load the charturnerv2 textual inversion embedding and run the pipeline again with the same prompt:

# Reset stable diffusion pipeline
base.clear_requests()

# Load textual inversion into stable diffusion pipeline
base.load_textual_inversion("./charturnerv2.pt", "charturnerv2")

# Compile the model before the first inference
base.compile()
image2 = base(prompt, num_inference_steps=50).images[0]
image2.save("sdxl_with_textual_inversion.png")

Image-to-Image

Here is an example of how to load a PyTorch SDXL model, convert it to OpenVINO on the fly, and run image-to-image inference with OpenVINO Runtime:

from optimum.intel import OVStableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

model_id = "stabilityai/stable-diffusion-xl-refiner-1.0"
pipeline = OVStableDiffusionXLImg2ImgPipeline.from_pretrained(model_id, export=True)

url = "https://huggingface.co/datasets/optimum/documentation-images/resolve/main/intel/openvino/sd_xl/castle_friedrich.png"
image = load_image(url).convert("RGB")
prompt = "medieval castle by Caspar David Friedrich"
image = pipeline(prompt, image=image).images[0]
# Don't forget to save your OpenVINO model so that you can load it without exporting it with `export=True`
pipeline.save_pretrained("openvino-sd-xl-refiner-1.0")

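Refining the image output

The image can be refined with a model such as stabilityai/stable-diffusion-xl-refiner-1.0. In this case, you only need to output the latents from the base model and pass them to the refiner.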

from optimum.intel import OVStableDiffusionXLImg2ImgPipeline

model_id = "stabilityai/stable-diffusion-xl-refiner-1.0"
refiner = OVStableDiffusionXLImg2ImgPipeline.from_pretrained(model_id, export=True)

image = base(prompt=prompt, output_type="latent").images[0]
image = refiner(prompt=prompt, image=image[None, :]).images[0]

Latent Consistency Models

Task             Auto Class
text-to-image    OVLatentConsistencyModelPipeline

Text-to-Image

Here is an example of how to load a Latent Consistency Model (LCM) from SimianLuo/LCM_Dreamshaper_v7 and run inference with OpenVINO:

from optimum.intel import OVLatentConsistencyModelPipeline

model_id = "SimianLuo/LCM_Dreamshaper_v7"
pipeline = OVLatentConsistencyModelPipeline.from_pretrained(model_id, export=True)
prompt = "sailing ship in storm by Leonardo da Vinci"
images = pipeline(prompt, num_inference_steps=4, guidance_scale=8.0).images