Diffusers documentation

HiDreamImageTransformer2DModel


A Transformer model for image-like data from HiDream-I1.

The model can be loaded with the following code snippet.

import torch
from diffusers import HiDreamImageTransformer2DModel

transformer = HiDreamImageTransformer2DModel.from_pretrained("HiDream-ai/HiDream-I1-Full", subfolder="transformer", torch_dtype=torch.bfloat16)

Loading GGUF-quantized checkpoints for HiDream-I1

GGUF checkpoints for HiDreamImageTransformer2DModel can be loaded using ~FromOriginalModelMixin.from_single_file.

import torch
from diffusers import GGUFQuantizationConfig, HiDreamImageTransformer2DModel

ckpt_path = "https://huggingface.co/city96/HiDream-I1-Dev-gguf/blob/main/hidream-i1-dev-Q2_K.gguf"
transformer = HiDreamImageTransformer2DModel.from_single_file(
    ckpt_path,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16
)

HiDreamImageTransformer2DModel

class diffusers.HiDreamImageTransformer2DModel

( patch_size: typing.Optional[int] = None, in_channels: int = 64, out_channels: typing.Optional[int] = None, num_layers: int = 16, num_single_layers: int = 32, attention_head_dim: int = 128, num_attention_heads: int = 20, caption_channels: typing.List[int] = None, text_emb_dim: int = 2048, num_routed_experts: int = 4, num_activated_experts: int = 2, axes_dims_rope: typing.Tuple[int, int] = (32, 32), max_resolution: typing.Tuple[int, int] = (128, 128), llama_layers: typing.List[int] = None, force_inference_output: bool = False )
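As a back-of-the-envelope check, the default configuration implies the transformer's inner dimension if one assumes the usual `inner_dim = num_attention_heads * attention_head_dim` convention (an assumption based on common DiT-style implementations, not a documented attribute):

```python
# Default config values from the signature above.
attention_head_dim = 128
num_attention_heads = 20

# Assumption: inner dimension follows the standard heads * head_dim convention.
inner_dim = num_attention_heads * attention_head_dim
print(inner_dim)  # 2560
```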

Transformer2DModelOutput

class diffusers.models.modeling_outputs.Transformer2DModelOutput

( sample: torch.Tensor )

Parameters

  • sample (torch.Tensor of shape (batch_size, num_channels, height, width), or of shape (batch_size, num_vector_embeds - 1, num_latent_pixels) if Transformer2DModel is discrete) — The hidden states output conditioned on the encoder_hidden_states input. If discrete, returns probability distributions for the unnoised latent pixels.

The output of Transformer2DModel.
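The two output shapes described above can be illustrated with plain tuples (the concrete sizes below are hypothetical placeholders, not values from any real checkpoint):

```python
# Continuous case: sample has shape (batch_size, num_channels, height, width).
batch_size, num_channels, height, width = 2, 16, 64, 64
continuous_shape = (batch_size, num_channels, height, width)

# Discrete case: per-pixel probability distributions, with shape
# (batch_size, num_vector_embeds - 1, num_latent_pixels).
num_vector_embeds, num_latent_pixels = 8192, 1024  # hypothetical sizes
discrete_shape = (batch_size, num_vector_embeds - 1, num_latent_pixels)

print(continuous_shape)  # (2, 16, 64, 64)
print(discrete_shape)    # (2, 8191, 1024)
```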
