Diffusers

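Unconditional image generation

An unconditional image generation model is not conditioned on text or images during training. It only generates images that resemble its training data distribution.

This guide explores the train_unconditional.py training script to help you become familiar with it and to show how you can adapt it to your own use case.

Before running the script, make sure you install the library from source: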

git clone https://github.com/huggingface/diffusers
cd diffusers
pip install .

Then navigate to the example folder containing the training script and install the required dependencies:

cd examples/unconditional_image_generation
pip install -r requirements.txt

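🤗 Accelerate is a library that helps you train on multiple GPUs/TPUs or with mixed precision. It automatically configures your training setup based on your hardware and environment. Take a look at the 🤗 Accelerate Quick tour to learn more.

Initialize a 🤗 Accelerate environment: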

accelerate config

To set up a default 🤗 Accelerate environment without choosing any configurations:

accelerate config default

Or, if your environment doesn't support an interactive shell (a notebook, for example), you can use:

from accelerate.utils import write_basic_config

write_basic_config()

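Lastly, if you want to train a model on your own dataset, take a look at the Create a dataset for training guide to learn how to create a dataset that works with the training script.

Script parameters

The following sections highlight the parts of the training script that are important for understanding how to modify it, but they don't cover every aspect of the script in detail. If you're interested in learning more, feel free to read through the script and let us know if you have any questions or concerns.

The training script provides many parameters to help you customize your training run. All of the parameters and their descriptions are found in the parse_args() function. It provides a default value for each parameter, such as the training batch size and learning rate, but you can also set your own values in the training command.

For example, to speed up training with mixed precision using the bf16 format, add the --mixed_precision parameter to the training command: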

accelerate launch train_unconditional.py \
  --mixed_precision="bf16"
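To make the "defaults plus command-line overrides" pattern concrete, here is a minimal sketch of how an argparse-based parse_args() function can behave. The parameter subset and every default value below are illustrative assumptions, not the values used by train_unconditional.py; check the script itself for the real ones.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical, trimmed-down stand-in for the script's parse_args()."""
    parser = argparse.ArgumentParser(description="Sketch of a training CLI")
    # The defaults below are placeholders chosen for illustration only.
    parser.add_argument("--train_batch_size", type=int, default=16)
    parser.add_argument("--learning_rate", type=float, default=1e-4)
    parser.add_argument(
        "--mixed_precision",
        type=str,
        default="no",
        choices=["no", "fp16", "bf16"],
    )
    return parser.parse_args(argv)


# No flags given: every parameter falls back to its default.
defaults = parse_args([])

# Flags given: only the named parameters change, the rest keep their defaults.
overridden = parse_args(["--mixed_precision=bf16", "--learning_rate", "5e-5"])

print(defaults.mixed_precision, overridden.mixed_precision)
```

Any parameter you do not pass on the command line keeps its default, so a training command only needs to name the values you want to change.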

Some basic and important parameters to specify include:

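--dataset_name: the name of the dataset on the Hub or a local path to the dataset to train on
--output_dir: where to save the trained model
--push_to_hub: whether to push the trained model to the Hub
--checkpointing_steps: how often to save a checkpoint while the model is training; this is useful if training is interrupted, because you can continue from that checkpoint by adding --resume_from_checkpoint to your training command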
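As an illustration of how resuming can work, the sketch below picks the most recent checkpoint folder inside an output directory. It assumes checkpoints are saved as folders named checkpoint-<step>; that naming and the helper find_latest_checkpoint are assumptions made for this example, so verify the layout your run actually produces.

```python
import os
import re
import tempfile


def find_latest_checkpoint(output_dir):
    """Return the name of the `checkpoint-<step>` folder with the highest step, or None."""
    pattern = re.compile(r"^checkpoint-(\d+)$")
    candidates = []
    for name in os.listdir(output_dir):
        match = pattern.match(name)
        if match and os.path.isdir(os.path.join(output_dir, name)):
            # Compare steps as integers so that 1500 sorts after 500.
            candidates.append((int(match.group(1)), name))
    if not candidates:
        return None
    return max(candidates)[1]


# Demonstration on a throwaway directory with three fake checkpoints.
with tempfile.TemporaryDirectory() as output_dir:
    for step in (500, 1000, 1500):
        os.makedirs(os.path.join(output_dir, f"checkpoint-{step}"))
    latest = find_latest_checkpoint(output_dir)
    print(latest)
```

Comparing the step numbers as integers matters: a plain string sort would rank checkpoint-500 above checkpoint-1500.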

Bring your dataset, and let the training script handle everything else!

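Training script

The code for preprocessing the dataset and the training loop is found in the main() function. If you need to adapt the training script, this is where you'll make your changes.

The train_unconditional script initializes a UNet2DModel if you don't provide a model configuration. You can configure the UNet here if you'd like: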

model = UNet2DModel(
    sample_size=args.resolution,
    in_channels=3,
    out_channels=3,
    layers_per_block=2,
    block_out_channels=(128, 128, 256, 256, 512, 512),
    down_block_types=(
        "DownBlock2D",
        "DownBlock2D",
        "DownBlock2D",
        "DownBlock2D",
        "AttnDownBlock2D",
        "DownBlock2D",
    ),
    up_block_types=(
        "UpBlock2D",
        "AttnUpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
    ),
)
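When you edit this configuration, it helps to check that the tuples stay consistent with each other. The sketch below is a plain-Python sanity check, not part of the script. It assumes that every down block except the last one halves the spatial resolution, so confirm that against the UNet2DModel documentation before relying on the numbers. The helper describe_down_path is hypothetical.

```python
def describe_down_path(sample_size, block_out_channels, down_block_types):
    """List (block type, channels, input resolution) for each down block.

    Assumption: each down block except the last halves the resolution.
    """
    if len(block_out_channels) != len(down_block_types):
        raise ValueError("need one block type per entry in block_out_channels")
    num_downsamples = len(down_block_types) - 1
    if sample_size % (2 ** num_downsamples) != 0:
        raise ValueError(
            f"sample_size must be divisible by {2 ** num_downsamples}"
        )
    return [
        (block_type, channels, sample_size // 2 ** index)
        for index, (block_type, channels) in enumerate(
            zip(down_block_types, block_out_channels)
        )
    ]


# Same channel widths and block types as the configuration above.
block_out_channels = (128, 128, 256, 256, 512, 512)
down_block_types = (
    "DownBlock2D",
    "DownBlock2D",
    "DownBlock2D",
    "DownBlock2D",
    "AttnDownBlock2D",
    "DownBlock2D",
)

layout = describe_down_path(64, block_out_channels, down_block_types)
for block_type, channels, resolution in layout:
    print(f"{block_type:<16} channels={channels:<4} resolution={resolution}")
```

Under that assumption, a 64-pixel input reaches the attention block at a 4x4 resolution, which is one reason to keep --resolution a multiple of 32 with six blocks.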

Next, the script initializes a scheduler and an optimizer:

# Initialize the scheduler
accepts_prediction_type = "prediction_type" in set(inspect.signature(DDPMScheduler.__init__).parameters.keys())
if accepts_prediction_type:
    noise_scheduler = DDPMScheduler(
        num_train_timesteps=args.ddpm_num_steps,
        beta_schedule=args.ddpm_beta_schedule,
        prediction_type=args.prediction_type,
    )
else:
    noise_scheduler = DDPMScheduler(num_train_timesteps=args.ddpm_num_steps, beta_schedule=args.ddpm_beta_schedule)

# Initialize the optimizer
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=args.learning_rate,
    betas=(args.adam_beta1, args.adam_beta2),
    weight_decay=args.adam_weight_decay,
    eps=args.adam_epsilon,
)
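The scheduler setup uses inspect.signature to detect whether the installed DDPMScheduler accepts a prediction_type argument before passing it. The sketch below shows the same feature-detection pattern on two hypothetical classes (OldScheduler and NewScheduler are stand-ins invented for this example, not library classes).

```python
import inspect


class OldScheduler:
    """Hypothetical scheduler whose constructor predates `prediction_type`."""

    def __init__(self, num_train_timesteps=1000):
        self.num_train_timesteps = num_train_timesteps


class NewScheduler:
    """Hypothetical scheduler whose constructor accepts `prediction_type`."""

    def __init__(self, num_train_timesteps=1000, prediction_type="epsilon"):
        self.num_train_timesteps = num_train_timesteps
        self.prediction_type = prediction_type


def build_scheduler(scheduler_cls, num_train_timesteps, prediction_type):
    """Pass `prediction_type` only if the constructor declares it."""
    params = inspect.signature(scheduler_cls.__init__).parameters
    kwargs = {"num_train_timesteps": num_train_timesteps}
    if "prediction_type" in params:
        kwargs["prediction_type"] = prediction_type
    return scheduler_cls(**kwargs)


old = build_scheduler(OldScheduler, 1000, "sample")
new = build_scheduler(NewScheduler, 1000, "sample")
print(hasattr(old, "prediction_type"), new.prediction_type)
```

Checking the signature avoids a TypeError on versions of a class that do not know the argument, at the cost of silently ignoring the requested value there.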

Then it loads a dataset, and you can specify how to preprocess it:

dataset = load_dataset("imagefolder", data_dir=args.train_data_dir, cache_dir=args.cache_dir, split="train")

augmentations = transforms.Compose(
    [
        transforms.Resize(args.resolution, interpolation=transforms.InterpolationMode.BILINEAR),
        transforms.CenterCrop(args.resolution) if args.center_crop else transforms.RandomCrop(args.resolution),
        transforms.RandomHorizontalFlip() if args.random_flip else transforms.Lambda(lambda x: x),
        transforms.ToTensor(),
        transforms.Normalize([0.5], [0.5]),
    ]
)
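The last two transforms determine the value range the model sees. ToTensor scales 8-bit pixel values to [0, 1], and Normalize([0.5], [0.5]) computes (x - 0.5) / 0.5, which maps that range to [-1, 1]. The arithmetic can be sketched in plain Python; the helper names below are made up for this example.

```python
def to_unit_range(pixel):
    """Scale an 8-bit pixel value (0..255) to [0, 1], as ToTensor does."""
    return pixel / 255


def normalize(value, mean=0.5, std=0.5):
    """Apply (value - mean) / std, the per-channel formula used by Normalize."""
    return (value - mean) / std


def denormalize(value, mean=0.5, std=0.5):
    """Invert normalize(), for example before converting samples back to images."""
    return value * std + mean


for pixel in (0, 255):
    print(pixel, "->", normalize(to_unit_range(pixel)))
```

Black pixels end up at -1 and white pixels at 1, so the data is centered around zero.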

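Finally, the training loop handles everything else, such as adding noise to the images, predicting the noise residual, calculating the loss, saving checkpoints at specified steps, and saving and pushing the model to the Hub. If you want to learn more about how the training loop works, check out the Understanding pipelines, models and schedulers tutorial, which breaks down the basic pattern of the denoising process.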
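The noising and loss steps of such a loop can be sketched without any deep learning library. The example below assumes a linear beta schedule with 1000 steps running from 1e-4 to 0.02 and a model trained to predict the added noise (epsilon prediction). Those settings, the tiny 16-value "image", and the all-zero placeholder prediction are assumptions for illustration, not the script's code.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta), usually written alpha-bar_t."""
    products, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        products.append(running)
    return products


def add_noise(clean, noise, alpha_bar_t):
    """x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise."""
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [signal_scale * x + noise_scale * n for x, n in zip(clean, noise)]


def mse(prediction, target):
    """Mean squared error between two equally long lists."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


rng = random.Random(0)
alphas_cumprod = cumulative_alphas(linear_betas(1000))

clean = [rng.uniform(-1, 1) for _ in range(16)]  # stand-in for a normalized image
noise = [rng.gauss(0, 1) for _ in range(16)]     # Gaussian noise to add
t = rng.randrange(1000)                          # random timestep for this sample
noisy = add_noise(clean, noise, alphas_cumprod[t])

# A real model would predict the noise from (noisy, t); zeros are a placeholder.
predicted_noise = [0.0] * len(noise)
loss = mse(predicted_noise, noise)
print(f"t={t} loss={loss:.4f}")
```

In the actual script, the scheduler and the UNet take the place of add_noise and the placeholder prediction, and the loss is backpropagated through the model.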

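Launch the script

Once you've made all your changes, or if you're happy with the default configuration, you're ready to launch the training script! 🚀

A full training run takes 2 hours on 4xV100 GPUs.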

Single GPU

Multi-GPU
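As a rough sketch of how the single-GPU and multi-GPU launch commands differ, the helper below assembles the argument list from the parameters described earlier. The helper build_launch_command, the dataset path, and the output folder name are placeholders invented for this example. The --multi_gpu option belongs to accelerate launch, but check the 🤗 Accelerate documentation for the options that fit your hardware.

```python
import shlex


def build_launch_command(
    dataset_name,
    output_dir,
    multi_gpu=False,
    mixed_precision=None,
    push_to_hub=False,
):
    """Assemble an `accelerate launch` command as a list of arguments."""
    command = ["accelerate", "launch"]
    if multi_gpu:
        # Launcher options go before the script name.
        command.append("--multi_gpu")
    command.append("train_unconditional.py")
    # Script parameters go after the script name.
    command.append(f"--dataset_name={dataset_name}")
    command.append(f"--output_dir={output_dir}")
    if mixed_precision:
        command.append(f"--mixed_precision={mixed_precision}")
    if push_to_hub:
        command.append("--push_to_hub")
    return command


# Placeholder values: replace them with your own dataset and output folder.
single = build_launch_command("path/to/your/dataset", "my-ddpm-model")
multi = build_launch_command(
    "path/to/your/dataset", "my-ddpm-model", multi_gpu=True, mixed_precision="fp16"
)

print(shlex.join(single))
print(shlex.join(multi))
```

The resulting lists can be printed for copy and paste, as here, or passed to subprocess.run from a notebook or job script.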

The training script creates and saves a checkpoint file in your repository. Now you can load your trained model and use it for inference:

from diffusers import DiffusionPipeline
import torch

pipeline = DiffusionPipeline.from_pretrained("anton-l/ddpm-butterflies-128").to("cuda")
image = pipeline().images[0]
