Transformers

( )

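This is a generic tokenizer class that will be instantiated as one of the tokenizer classes of the library when created with the AutoTokenizer.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).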
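
The factory pattern described above can be sketched in a few lines. This is a minimal illustration, not the library's own code: `AutoThing` and `BertLikeThing` are hypothetical names, and the error type raised here is an assumption chosen for the example.

```python
class BertLikeThing:
    """Hypothetical concrete class that the factory hands back."""

    def __init__(self, name):
        self.name = name


class AutoThing:
    """Hypothetical factory class: it is never instantiated itself."""

    def __init__(self):
        # Direct construction is rejected; callers must use from_pretrained().
        raise EnvironmentError(
            "AutoThing is designed to be instantiated using "
            "AutoThing.from_pretrained(name)."
        )

    @classmethod
    def from_pretrained(cls, name):
        # Return an instance of a concrete class, not of AutoThing.
        return BertLikeThing(name)


thing = AutoThing.from_pretrained("some-checkpoint")
print(type(thing).__name__)  # BertLikeThing
```

The caller always ends up holding a concrete class instance, which is why the auto class itself has no usable constructor.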

from_pretrained

( pretrained_model_name_or_path *inputs **kwargs )

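Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a predefined tokenizer hosted inside a model repo on huggingface.co.
- A path to a directory containing the vocabulary files required by the tokenizer, for instance one saved using the save_pretrained() method, e.g., ./my_model_directory/.
- A path or URL to a single saved vocabulary file, if and only if the tokenizer requires only a single vocabulary file (like Bert or XLNet), e.g., ./my_model_directory/vocab.txt. (Not applicable to all derived classes.)
inputs (additional positional arguments, optional) — Passed along to the tokenizer's `__init__()` method.
config (PretrainedConfig, optional) — The configuration object used to determine the tokenizer class to instantiate.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, `revision` can be any identifier allowed by git.
subfolder (str, optional) — If the relevant files are located inside a subfolder of the model repo on huggingface.co (e.g., for facebook/rag-token-base), specify it here.
use_fast (bool, optional, defaults to True) — Use a fast Rust-based tokenizer if it is supported for the given model. If a fast tokenizer is not available for the given model, a normal Python-based tokenizer is returned instead.
tokenizer_type (str, optional) — The tokenizer type to load.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (additional keyword arguments, optional) — Passed along to the tokenizer's `__init__()` method. Can be used to set special tokens such as `bos_token`, `eos_token`, `unk_token`, `sep_token`, `pad_token`, `cls_token`, `mask_token`, `additional_special_tokens`. See the parameters of `__init__()` for more details.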

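Instantiate one of the tokenizer classes of the library from a pretrained model vocabulary.

The tokenizer class to instantiate is selected based on the `model_type` property of the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible). When that property is missing, the class is selected by falling back to pattern matching on `pretrained_model_name_or_path`: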

albert — AlbertTokenizer or AlbertTokenizerFast (ALBERT model)
align — BertTokenizer or BertTokenizerFast (ALIGN model)
arcee — LlamaTokenizer or LlamaTokenizerFast (Arcee model)
aria — LlamaTokenizer or LlamaTokenizerFast (Aria model)
aya_vision — CohereTokenizerFast (AyaVision model)
bark — BertTokenizer or BertTokenizerFast (Bark model)
bart — BartTokenizer or BartTokenizerFast (BART model)
barthez — BarthezTokenizer or BarthezTokenizerFast (BARThez model)
bartpho — BartphoTokenizer (BARTpho model)
bert — BertTokenizer or BertTokenizerFast (BERT model)
bert-generation — BertGenerationTokenizer (Bert Generation model)
bert-japanese — BertJapaneseTokenizer (BertJapanese model)
bertweet — BertweetTokenizer (BERTweet model)
big_bird — BigBirdTokenizer or BigBirdTokenizerFast (BigBird model)
bigbird_pegasus — PegasusTokenizer or PegasusTokenizerFast (BigBird-Pegasus model)
biogpt — BioGptTokenizer (BioGpt model)
bitnet — PreTrainedTokenizerFast (BitNet model)
blenderbot — BlenderbotTokenizer or BlenderbotTokenizerFast (Blenderbot model)
blenderbot-small — BlenderbotSmallTokenizer (BlenderbotSmall model)
blip — BertTokenizer or BertTokenizerFast (BLIP model)
blip-2 — GPT2Tokenizer or GPT2TokenizerFast (BLIP-2 model)
bloom — BloomTokenizerFast (BLOOM model)
bridgetower — RobertaTokenizer or RobertaTokenizerFast (BridgeTower model)
bros — BertTokenizer or BertTokenizerFast (BROS model)
byt5 — ByT5Tokenizer (ByT5 model)
camembert — CamembertTokenizer or CamembertTokenizerFast (CamemBERT model)
canine — CanineTokenizer (CANINE model)
chameleon — LlamaTokenizer or LlamaTokenizerFast (Chameleon model)
chinese_clip — BertTokenizer or BertTokenizerFast (Chinese-CLIP model)
clap — RobertaTokenizer or RobertaTokenizerFast (CLAP model)
clip — CLIPTokenizer or CLIPTokenizerFast (CLIP model)
clipseg — CLIPTokenizer or CLIPTokenizerFast (CLIPSeg model)
clvp — ClvpTokenizer (CLVP model)
code_llama — CodeLlamaTokenizer or CodeLlamaTokenizerFast (CodeLlama model)
codegen — CodeGenTokenizer or CodeGenTokenizerFast (CodeGen model)
cohere — CohereTokenizerFast (Cohere model)
cohere2 — CohereTokenizerFast (Cohere2 model)
colpali — LlamaTokenizer or LlamaTokenizerFast (ColPali model)
colqwen2 — Qwen2Tokenizer or Qwen2TokenizerFast (ColQwen2 model)
convbert — ConvBertTokenizer or ConvBertTokenizerFast (ConvBERT model)
cpm — CpmTokenizer or CpmTokenizerFast (CPM model)
cpmant — CpmAntTokenizer (CPM-Ant model)
ctrl — CTRLTokenizer (CTRL model)
data2vec-audio — Wav2Vec2CTCTokenizer (Data2VecAudio model)
data2vec-text — RobertaTokenizer or RobertaTokenizerFast (Data2VecText model)
dbrx — GPT2Tokenizer or GPT2TokenizerFast (DBRX model)
deberta — DebertaTokenizer or DebertaTokenizerFast (DeBERTa model)
deberta-v2 — DebertaV2Tokenizer or DebertaV2TokenizerFast (DeBERTa-v2 model)
deepseek_v3 — LlamaTokenizer or LlamaTokenizerFast (DeepSeek-V3 model)
dia — DiaTokenizer (Dia model)
diffllama — LlamaTokenizer or LlamaTokenizerFast (DiffLlama model)
distilbert — DistilBertTokenizer or DistilBertTokenizerFast (DistilBERT model)
dpr — DPRQuestionEncoderTokenizer or DPRQuestionEncoderTokenizerFast (DPR model)
electra — ElectraTokenizer or ElectraTokenizerFast (ELECTRA model)
emu3 — GPT2Tokenizer or GPT2TokenizerFast (Emu3 model)
ernie — BertTokenizer or BertTokenizerFast (ERNIE model)
ernie_m — ErnieMTokenizer (ErnieM model)
esm — EsmTokenizer (ESM model)
falcon — PreTrainedTokenizerFast (Falcon model)
falcon_mamba — GPTNeoXTokenizerFast (FalconMamba model)
fastspeech2_conformer — (FastSpeech2Conformer model)
flaubert — FlaubertTokenizer (FlauBERT model)
fnet — FNetTokenizer or FNetTokenizerFast (FNet model)
fsmt — FSMTTokenizer (FairSeq Machine-Translation model)
funnel — FunnelTokenizer or FunnelTokenizerFast (Funnel Transformer model)
gemma — GemmaTokenizer or GemmaTokenizerFast (Gemma model)
gemma2 — GemmaTokenizer or GemmaTokenizerFast (Gemma2 model)
gemma3 — GemmaTokenizer or GemmaTokenizerFast (Gemma3ForConditionalGeneration model)
gemma3_text — GemmaTokenizer or GemmaTokenizerFast (Gemma3ForCausalLM model)
gemma3n — GemmaTokenizer or GemmaTokenizerFast (Gemma3nForConditionalGeneration model)
gemma3n_text — GemmaTokenizer or GemmaTokenizerFast (Gemma3nForCausalLM model)
git — BertTokenizer or BertTokenizerFast (GIT model)
glm — PreTrainedTokenizerFast (GLM model)
glm4 — PreTrainedTokenizerFast (GLM4 model)
glm4v — PreTrainedTokenizerFast (GLM4V model)
gpt-sw3 — GPTSw3Tokenizer (GPT-Sw3 model)
gpt2 — GPT2Tokenizer or GPT2TokenizerFast (OpenAI GPT-2 model)
gpt_bigcode — GPT2Tokenizer or GPT2TokenizerFast (GPTBigCode model)
gpt_neo — GPT2Tokenizer or GPT2TokenizerFast (GPT Neo model)
gpt_neox — GPTNeoXTokenizerFast (GPT NeoX model)
gpt_neox_japanese — GPTNeoXJapaneseTokenizer (GPT NeoX Japanese model)
gptj — GPT2Tokenizer or GPT2TokenizerFast (GPT-J model)
gptsan-japanese — GPTSanJapaneseTokenizer (GPTSAN-japanese model)
granite — GPT2Tokenizer (Granite model)
granitemoe — GPT2Tokenizer (GraniteMoeMoe model)
granitemoehybrid — GPT2Tokenizer (GraniteMoeHybrid model)
granitemoeshared — GPT2Tokenizer (GraniteMoeSharedMoe model)
grounding-dino — BertTokenizer or BertTokenizerFast (Grounding DINO model)
groupvit — CLIPTokenizer or CLIPTokenizerFast (GroupViT model)
helium — PreTrainedTokenizerFast (Helium model)
herbert — HerbertTokenizer or HerbertTokenizerFast (HerBERT model)
hubert — Wav2Vec2CTCTokenizer (Hubert model)
ibert — RobertaTokenizer or RobertaTokenizerFast (I-BERT model)
idefics — LlamaTokenizerFast (IDEFICS model)
idefics2 — LlamaTokenizer or LlamaTokenizerFast (Idefics2 model)
idefics3 — LlamaTokenizer or LlamaTokenizerFast (Idefics3 model)
instructblip — GPT2Tokenizer or GPT2TokenizerFast (InstructBLIP model)
instructblipvideo — GPT2Tokenizer or GPT2TokenizerFast (InstructBlipVideo model)
internvl — Qwen2Tokenizer or Qwen2TokenizerFast (InternVL model)
jamba — LlamaTokenizer or LlamaTokenizerFast (Jamba model)
janus — LlamaTokenizerFast (Janus model)
jetmoe — LlamaTokenizer or LlamaTokenizerFast (JetMoe model)
jukebox — JukeboxTokenizer (Jukebox model)
kosmos-2 — XLMRobertaTokenizer or XLMRobertaTokenizerFast (KOSMOS-2 model)
layoutlm — LayoutLMTokenizer or LayoutLMTokenizerFast (LayoutLM model)
layoutlmv2 — LayoutLMv2Tokenizer or LayoutLMv2TokenizerFast (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3Tokenizer or LayoutLMv3TokenizerFast (LayoutLMv3 model)
layoutxlm — LayoutXLMTokenizer or LayoutXLMTokenizerFast (LayoutXLM model)
led — LEDTokenizer or LEDTokenizerFast (LED model)
lilt — LayoutLMv3Tokenizer or LayoutLMv3TokenizerFast (LiLT model)
llama — LlamaTokenizer or LlamaTokenizerFast (LLaMA model)
llama4 — LlamaTokenizer or LlamaTokenizerFast (Llama4 model)
llama4_text — LlamaTokenizer or LlamaTokenizerFast (Llama4ForCausalLM model)
llava — LlamaTokenizer or LlamaTokenizerFast (LLaVa model)
llava_next — LlamaTokenizer or LlamaTokenizerFast (LLaVA-NeXT model)
llava_next_video — LlamaTokenizer or LlamaTokenizerFast (LLaVa-NeXT-Video model)
llava_onevision — LlamaTokenizer or LlamaTokenizerFast (LLaVA-Onevision model)
longformer — LongformerTokenizer or LongformerTokenizerFast (Longformer model)
longt5 — T5Tokenizer or T5TokenizerFast (LongT5 model)
luke — LukeTokenizer (LUKE model)
lxmert — LxmertTokenizer or LxmertTokenizerFast (LXMERT model)
m2m_100 — M2M100Tokenizer (M2M100 model)
mamba — GPTNeoXTokenizerFast (Mamba model)
mamba2 — GPTNeoXTokenizerFast (mamba2 model)
marian — MarianTokenizer (Marian model)
mbart — MBartTokenizer or MBartTokenizerFast (mBART model)
mbart50 — MBart50Tokenizer or MBart50TokenizerFast (mBART-50 model)
mega — RobertaTokenizer or RobertaTokenizerFast (MEGA model)
megatron-bert — BertTokenizer or BertTokenizerFast (Megatron-BERT model)
mgp-str — MgpstrTokenizer (MGP-STR model)
minimax — GPT2Tokenizer or GPT2TokenizerFast (MiniMax model)
mistral — LlamaTokenizer or LlamaTokenizerFast (Mistral model)
mixtral — LlamaTokenizer or LlamaTokenizerFast (Mixtral model)
mllama — LlamaTokenizer or LlamaTokenizerFast (Mllama model)
mluke — MLukeTokenizer (mLUKE model)
mobilebert — MobileBertTokenizer or MobileBertTokenizerFast (MobileBERT model)
modernbert — PreTrainedTokenizerFast (ModernBERT model)
moonshine — PreTrainedTokenizerFast (Moonshine model)
moshi — PreTrainedTokenizerFast (Moshi model)
mpnet — MPNetTokenizer or MPNetTokenizerFast (MPNet model)
mpt — GPTNeoXTokenizerFast (MPT model)
mra — RobertaTokenizer or RobertaTokenizerFast (MRA model)
mt5 — MT5Tokenizer or MT5TokenizerFast (MT5 model)
musicgen — T5Tokenizer or T5TokenizerFast (MusicGen model)
musicgen_melody — T5Tokenizer or T5TokenizerFast (MusicGen Melody model)
mvp — MvpTokenizer or MvpTokenizerFast (MVP model)
myt5 — MyT5Tokenizer (myt5 model)
nemotron — PreTrainedTokenizerFast (Nemotron model)
nezha — BertTokenizer or BertTokenizerFast (Nezha model)
nllb — NllbTokenizer or NllbTokenizerFast (NLLB model)
nllb-moe — NllbTokenizer or NllbTokenizerFast (NLLB-MOE model)
nystromformer — AlbertTokenizer or AlbertTokenizerFast (Nyströmformer model)
olmo — GPTNeoXTokenizerFast (OLMo model)
olmo2 — GPTNeoXTokenizerFast (OLMo2 model)
olmoe — GPTNeoXTokenizerFast (OLMoE model)
omdet-turbo — CLIPTokenizer or CLIPTokenizerFast (OmDet-Turbo model)
oneformer — CLIPTokenizer or CLIPTokenizerFast (OneFormer model)
openai-gpt — OpenAIGPTTokenizer or OpenAIGPTTokenizerFast (OpenAI GPT model)
opt — GPT2Tokenizer or GPT2TokenizerFast (OPT model)
owlv2 — CLIPTokenizer or CLIPTokenizerFast (OWLv2 model)
owlvit — CLIPTokenizer or CLIPTokenizerFast (OWL-ViT model)
paligemma — LlamaTokenizer or LlamaTokenizerFast (PaliGemma model)
pegasus — PegasusTokenizer or PegasusTokenizerFast (Pegasus model)
pegasus_x — PegasusTokenizer or PegasusTokenizerFast (PEGASUS-X model)
perceiver — PerceiverTokenizer (Perceiver model)
persimmon — LlamaTokenizer or LlamaTokenizerFast (Persimmon model)
phi — CodeGenTokenizer or CodeGenTokenizerFast (Phi model)
phi3 — LlamaTokenizer or LlamaTokenizerFast (Phi3 model)
phimoe — LlamaTokenizer or LlamaTokenizerFast (Phimoe model)
phobert — PhobertTokenizer (PhoBERT model)
pix2struct — T5Tokenizer or T5TokenizerFast (Pix2Struct model)
pixtral — PreTrainedTokenizerFast (Pixtral model)
plbart — PLBartTokenizer (PLBart model)
prophetnet — ProphetNetTokenizer (ProphetNet model)
qdqbert — BertTokenizer or BertTokenizerFast (QDQBert model)
qwen2 — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2 model)
qwen2_5_omni — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2_5Omni model)
qwen2_5_vl — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2_5_VL model)
qwen2_audio — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2Audio model)
qwen2_moe — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2MoE model)
qwen2_vl — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen2VL model)
qwen3 — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen3 model)
qwen3_moe — Qwen2Tokenizer or Qwen2TokenizerFast (Qwen3MoE model)
rag — RagTokenizer (RAG model)
realm — RealmTokenizer or RealmTokenizerFast (REALM model)
recurrent_gemma — GemmaTokenizer or GemmaTokenizerFast (RecurrentGemma model)
reformer — ReformerTokenizer or ReformerTokenizerFast (Reformer model)
rembert — RemBertTokenizer or RemBertTokenizerFast (RemBERT model)
retribert — RetriBertTokenizer or RetriBertTokenizerFast (RetriBERT model)
roberta — RobertaTokenizer or RobertaTokenizerFast (RoBERTa model)
roberta-prelayernorm — RobertaTokenizer or RobertaTokenizerFast (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertTokenizer (RoCBert model)
roformer — RoFormerTokenizer or RoFormerTokenizerFast (RoFormer model)
rwkv — GPTNeoXTokenizerFast (RWKV model)
seamless_m4t — SeamlessM4TTokenizer or SeamlessM4TTokenizerFast (SeamlessM4T model)
seamless_m4t_v2 — SeamlessM4TTokenizer or SeamlessM4TTokenizerFast (SeamlessM4Tv2 model)
shieldgemma2 — GemmaTokenizer or GemmaTokenizerFast (Shieldgemma2 model)
siglip — SiglipTokenizer (SigLIP model)
siglip2 — GemmaTokenizer or GemmaTokenizerFast (SigLIP2 model)
smollm3 — PreTrainedTokenizerFast (SmolLM3 model)
speech_to_text — Speech2TextTokenizer (Speech2Text model)
speech_to_text_2 — Speech2Text2Tokenizer (Speech2Text2 model)
speecht5 — SpeechT5Tokenizer (SpeechT5 model)
splinter — SplinterTokenizer or SplinterTokenizerFast (Splinter model)
squeezebert — SqueezeBertTokenizer or SqueezeBertTokenizerFast (SqueezeBERT model)
stablelm — GPTNeoXTokenizerFast (StableLm model)
starcoder2 — GPT2Tokenizer or GPT2TokenizerFast (Starcoder2 model)
switch_transformers — T5Tokenizer or T5TokenizerFast (SwitchTransformers model)
t5 — T5Tokenizer or T5TokenizerFast (T5 model)
t5gemma — GemmaTokenizer or GemmaTokenizerFast (T5Gemma model)
tapas — TapasTokenizer (TAPAS model)
tapex — TapexTokenizer (TAPEX model)
transfo-xl — TransfoXLTokenizer (Transformer-XL model)
tvp — BertTokenizer or BertTokenizerFast (TVP model)
udop — UdopTokenizer or UdopTokenizerFast (UDOP model)
umt5 — T5Tokenizer or T5TokenizerFast (UMT5 model)
video_llava — LlamaTokenizer or LlamaTokenizerFast (VideoLlava model)
vilt — BertTokenizer or BertTokenizerFast (ViLT model)
vipllava — LlamaTokenizer or LlamaTokenizerFast (VipLlava model)
visual_bert — BertTokenizer or BertTokenizerFast (VisualBERT model)
vits — VitsTokenizer (VITS model)
wav2vec2 — Wav2Vec2CTCTokenizer (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2CTCTokenizer (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2CTCTokenizer (Wav2Vec2-Conformer model)
wav2vec2_phoneme — Wav2Vec2PhonemeCTCTokenizer (Wav2Vec2Phoneme model)
whisper — WhisperTokenizer or WhisperTokenizerFast (Whisper model)
xclip — CLIPTokenizer or CLIPTokenizerFast (X-CLIP model)
xglm — XGLMTokenizer or XGLMTokenizerFast (XGLM model)
xlm — XLMTokenizer (XLM model)
xlm-prophetnet — XLMProphetNetTokenizer (XLM-ProphetNet model)
xlm-roberta — XLMRobertaTokenizer or XLMRobertaTokenizerFast (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaTokenizer or XLMRobertaTokenizerFast (XLM-RoBERTa-XL model)
xlnet — XLNetTokenizer or XLNetTokenizerFast (XLNet model)
xmod — XLMRobertaTokenizer or XLMRobertaTokenizerFast (X-MOD model)
yoso — AlbertTokenizer or AlbertTokenizerFast (YOSO model)
zamba — LlamaTokenizer or LlamaTokenizerFast (Zamba model)
zamba2 — LlamaTokenizer or LlamaTokenizerFast (Zamba2 model)
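
The selection rule above (look up `model_type`, fall back to pattern matching on the name, then honor `use_fast`) can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: `TOKENIZER_NAMES` and `pick_tokenizer_name` are hypothetical, the table holds only a few rows copied from the list above, and the longest-key-first matching order is a choice made for this sketch.

```python
# Hypothetical registry: model_type -> (slow class name, fast class name).
# None marks a variant that the list above does not offer.
TOKENIZER_NAMES = {
    "bert": ("BertTokenizer", "BertTokenizerFast"),
    "roberta": ("RobertaTokenizer", "RobertaTokenizerFast"),
    "bertweet": ("BertweetTokenizer", None),
    "bloom": (None, "BloomTokenizerFast"),
}


def pick_tokenizer_name(name_or_path, model_type=None, use_fast=True):
    """Return a tokenizer class name for a checkpoint (illustrative only)."""
    if model_type is None:
        # Fallback: pattern-match the checkpoint name against known model
        # types, longest key first so "bertweet" wins over "bert".
        for key in sorted(TOKENIZER_NAMES, key=len, reverse=True):
            if key in name_or_path.lower():
                model_type = key
                break
    if model_type not in TOKENIZER_NAMES:
        raise ValueError(f"Unrecognized model type for {name_or_path!r}")
    slow, fast = TOKENIZER_NAMES[model_type]
    # Prefer the fast class when requested and available, else the slow one.
    if use_fast and fast is not None:
        return fast
    return slow if slow is not None else fast


print(pick_tokenizer_name("google-bert/bert-base-uncased"))         # BertTokenizerFast
print(pick_tokenizer_name("FacebookAI/roberta-base", use_fast=False))  # RobertaTokenizer
print(pick_tokenizer_name("vinai/bertweet-base"))                   # BertweetTokenizer
```

Returning the only available variant when the requested one is missing mirrors the `use_fast` description in the parameter list; how the real library orders its pattern matching is not covered by this sketch.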

Examples

>>> from transformers import AutoTokenizer

>>> # Download vocabulary from huggingface.co and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

>>> # Download vocabulary from huggingface.co (user-uploaded) and cache.
>>> tokenizer = AutoTokenizer.from_pretrained("dbmdz/bert-base-german-cased")

>>> # If vocabulary files are in a directory (e.g. tokenizer was saved using *save_pretrained('./test/bert_saved_model/')*)
>>> # tokenizer = AutoTokenizer.from_pretrained("./test/bert_saved_model/")

>>> # Download vocabulary from huggingface.co and define model-specific arguments
>>> tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-base", add_prefix_space=True)

register

( config_class slow_tokenizer_class = None fast_tokenizer_class = None exist_ok = False )

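Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
slow_tokenizer_class (PreTrainedTokenizer, optional) — The slow tokenizer to register.
fast_tokenizer_class (PreTrainedTokenizerFast, optional) — The fast tokenizer to register.

Register a new tokenizer in this mapping.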
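
To make the `register` signature concrete, here is a small stand-in for such a mapping. It is a sketch, not the library's code: `TokenizerRegistry`, `lookup`, and the `My*` classes are hypothetical, and the behavior shown for `exist_ok` (reject a duplicate registration unless it is set) is an assumption inferred from the parameter name.

```python
class TokenizerRegistry:
    """Hypothetical stand-in for the mapping that register() updates."""

    def __init__(self):
        self._mapping = {}

    def register(self, config_class, slow_tokenizer_class=None,
                 fast_tokenizer_class=None, exist_ok=False):
        # At least one tokenizer class must be supplied.
        if slow_tokenizer_class is None and fast_tokenizer_class is None:
            raise ValueError("Pass a slow tokenizer class, a fast one, or both.")
        # Assumed semantics: duplicates are rejected unless exist_ok is set.
        if config_class in self._mapping and not exist_ok:
            raise ValueError(f"{config_class.__name__} is already registered.")
        self._mapping[config_class] = (slow_tokenizer_class, fast_tokenizer_class)

    def lookup(self, config_class, use_fast=True):
        slow, fast = self._mapping[config_class]
        if use_fast and fast is not None:
            return fast
        return slow if slow is not None else fast


class MyConfig:
    """Placeholder configuration class."""


class MyTokenizer:
    """Placeholder slow tokenizer."""


class MyTokenizerFast:
    """Placeholder fast tokenizer."""


registry = TokenizerRegistry()
registry.register(MyConfig, MyTokenizer, MyTokenizerFast)
print(registry.lookup(MyConfig).__name__)                  # MyTokenizerFast
print(registry.lookup(MyConfig, use_fast=False).__name__)  # MyTokenizer
```

Keying the mapping on the configuration class is what lets a later lookup go from a loaded config to the matching tokenizer.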

AutoFeatureExtractor

class transformers.AutoFeatureExtractor

( )

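This is a generic feature extractor class that will be instantiated as one of the feature extractor classes of the library when created with the AutoFeatureExtractor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).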

from_pretrained

( pretrained_model_name_or_path **kwargs )

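Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained feature extractor hosted inside a model repo on huggingface.co.
- A path to a directory containing a feature extractor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
- A path or URL to a saved feature extractor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the feature extractor files, overriding cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, the token generated when running huggingface-cli login (stored in ~/.huggingface) is used.
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns only the final feature extractor object. If True, it returns a Tuple(feature_extractor, unused_kwargs), where *unused_kwargs* is a dictionary of the key/value pairs whose keys are not feature extractor attributes, i.e., the part of kwargs that was not used to update feature_extractor and is otherwise ignored.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
kwargs (dict[str, Any], optional) — The values in `kwargs` of any keys that are feature extractor attributes are used to override the loaded values. Behavior for key/value pairs whose keys are *not* feature extractor attributes is controlled by the return_unused_kwargs keyword parameter.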
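
The interplay between `kwargs` and `return_unused_kwargs` described above can be sketched with plain dictionaries. This is a simplified model, not the library's code: `apply_overrides` is a hypothetical helper, and the attribute names in `loaded` are example values only.

```python
def apply_overrides(loaded, return_unused_kwargs=False, **kwargs):
    """Override loaded attributes with matching kwargs (illustrative only)."""
    settings = dict(loaded)
    unused = {}
    for key, value in kwargs.items():
        if key in settings:
            settings[key] = value  # known attribute: override the loaded value
        else:
            unused[key] = value    # unknown key: set aside instead of applying
    if return_unused_kwargs:
        return settings, unused
    return settings


# Example values standing in for a loaded preprocessor configuration.
loaded = {"sampling_rate": 16000, "do_normalize": True}
settings, unused = apply_overrides(
    loaded, return_unused_kwargs=True, do_normalize=False, foo="bar"
)
print(settings)  # {'sampling_rate': 16000, 'do_normalize': False}
print(unused)    # {'foo': 'bar'}
```

With `return_unused_kwargs=False` the same call returns only the updated settings and the unmatched `foo` entry is dropped.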

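Instantiate one of the feature extractor classes of the library from a pretrained model vocabulary.

The feature extractor class to instantiate is selected based on the `model_type` property of the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible). When that property is missing, the class is selected by falling back to pattern matching on `pretrained_model_name_or_path`: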

audio-spectrogram-transformer — ASTFeatureExtractor (Audio Spectrogram Transformer model)
beit — BeitFeatureExtractor (BEiT model)
chinese_clip — ChineseCLIPFeatureExtractor (Chinese-CLIP model)
clap — ClapFeatureExtractor (CLAP model)
clip — CLIPFeatureExtractor (CLIP model)
clipseg — ViTFeatureExtractor (CLIPSeg model)
clvp — ClvpFeatureExtractor (CLVP model)
conditional_detr — ConditionalDetrFeatureExtractor (Conditional DETR model)
convnext — ConvNextFeatureExtractor (ConvNeXT model)
cvt — ConvNextFeatureExtractor (CvT model)
dac — DacFeatureExtractor (DAC model)
data2vec-audio — Wav2Vec2FeatureExtractor (Data2VecAudio model)
data2vec-vision — BeitFeatureExtractor (Data2VecVision model)
deformable_detr — DeformableDetrFeatureExtractor (Deformable DETR model)
deit — DeiTFeatureExtractor (DeiT model)
detr — DetrFeatureExtractor (DETR model)
dia — DiaFeatureExtractor (Dia model)
dinat — ViTFeatureExtractor (DiNAT model)
donut-swin — DonutFeatureExtractor (DonutSwin model)
dpt — DPTFeatureExtractor (DPT model)
encodec — EncodecFeatureExtractor (EnCodec model)
flava — FlavaFeatureExtractor (FLAVA model)
gemma3n — Gemma3nAudioFeatureExtractor (Gemma3nForConditionalGeneration model)
glpn — GLPNFeatureExtractor (GLPN model)
granite_speech — GraniteSpeechFeatureExtractor (GraniteSpeech model)
groupvit — CLIPFeatureExtractor (GroupViT model)
hubert — Wav2Vec2FeatureExtractor (Hubert model)
imagegpt — ImageGPTFeatureExtractor (ImageGPT model)
kyutai_speech_to_text — KyutaiSpeechToTextFeatureExtractor (KyutaiSpeechToText model)
layoutlmv2 — LayoutLMv2FeatureExtractor (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3FeatureExtractor (LayoutLMv3 model)
levit — LevitFeatureExtractor (LeViT model)
maskformer — MaskFormerFeatureExtractor (MaskFormer model)
mctct — MCTCTFeatureExtractor (M-CTC-T model)
mimi — EncodecFeatureExtractor (Mimi model)
mobilenet_v1 — MobileNetV1FeatureExtractor (MobileNetV1 model)
mobilenet_v2 — MobileNetV2FeatureExtractor (MobileNetV2 model)
mobilevit — MobileViTFeatureExtractor (MobileViT model)
moonshine — Wav2Vec2FeatureExtractor (Moonshine model)
moshi — EncodecFeatureExtractor (Moshi model)
nat — ViTFeatureExtractor (NAT model)
owlvit — OwlViTFeatureExtractor (OWL-ViT model)
perceiver — PerceiverFeatureExtractor (Perceiver model)
phi4_multimodal — Phi4MultimodalFeatureExtractor (Phi4Multimodal model)
poolformer — PoolFormerFeatureExtractor (PoolFormer model)
pop2piano — Pop2PianoFeatureExtractor (Pop2Piano model)
regnet — ConvNextFeatureExtractor (RegNet model)
resnet — ConvNextFeatureExtractor (ResNet model)
seamless_m4t — SeamlessM4TFeatureExtractor (SeamlessM4T model)
seamless_m4t_v2 — SeamlessM4TFeatureExtractor (SeamlessM4Tv2 model)
segformer — SegformerFeatureExtractor (SegFormer model)
sew — Wav2Vec2FeatureExtractor (SEW model)
sew-d — Wav2Vec2FeatureExtractor (SEW-D model)
speech_to_text — Speech2TextFeatureExtractor (Speech2Text model)
speecht5 — SpeechT5FeatureExtractor (SpeechT5 model)
swiftformer — ViTFeatureExtractor (SwiftFormer model)
swin — ViTFeatureExtractor (Swin Transformer model)
swinv2 — ViTFeatureExtractor (Swin Transformer V2 model)
table-transformer — DetrFeatureExtractor (Table Transformer model)
timesformer — VideoMAEFeatureExtractor (TimeSformer model)
tvlt — TvltFeatureExtractor (TVLT model)
unispeech — Wav2Vec2FeatureExtractor (UniSpeech model)
unispeech-sat — Wav2Vec2FeatureExtractor (UniSpeechSat model)
univnet — UnivNetFeatureExtractor (UnivNet model)
van — ConvNextFeatureExtractor (VAN model)
videomae — VideoMAEFeatureExtractor (VideoMAE model)
vilt — ViltFeatureExtractor (ViLT model)
vit — ViTFeatureExtractor (ViT model)
vit_mae — ViTFeatureExtractor (ViTMAE model)
vit_msn — ViTFeatureExtractor (ViTMSN model)
wav2vec2 — Wav2Vec2FeatureExtractor (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2FeatureExtractor (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2FeatureExtractor (Wav2Vec2-Conformer model)
wavlm — Wav2Vec2FeatureExtractor (WavLM model)
whisper — WhisperFeatureExtractor (Whisper model)
xclip — CLIPFeatureExtractor (X-CLIP model)
yolos — YolosFeatureExtractor (YOLOS model)

Passing token=True is required when you want to use a private model.

Examples

>>> from transformers import AutoFeatureExtractor

>>> # Download feature extractor from huggingface.co and cache.
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If feature extractor files are in a directory (e.g. feature extractor was saved using *save_pretrained('./test/saved_model/')*)
>>> # feature_extractor = AutoFeatureExtractor.from_pretrained("./test/saved_model/")

register

( config_class feature_extractor_class exist_ok = False )

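Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
feature_extractor_class (FeatureExtractionMixin) — The feature extractor to register.

Register a new feature extractor for this class.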

AutoImageProcessor

class transformers.AutoImageProcessor

( )

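This is a generic image processor class that will be instantiated as one of the image processor classes of the library when created with the AutoImageProcessor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).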

from_pretrained

( pretrained_model_name_or_path *inputs **kwargs )

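Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the *model id* of a pretrained image_processor hosted inside a model repo on huggingface.co.
- A path to a *directory* containing an image processor file saved using the save_pretrained() method, e.g., ./my_model_directory/.
- A path or URL to a saved image processor JSON *file*, e.g., ./my_model_directory/preprocessor_config.json.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model image processor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the image processor files, overriding cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, the token generated when running huggingface-cli login (stored in ~/.huggingface) is used.
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
use_fast (bool, optional, defaults to False) — Use a fast torchvision-based image processor if it is supported for the given model. If a fast image processor is not available for the given model, a normal numpy-based image processor is returned instead.
return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns only the final image processor object. If True, it returns a Tuple(image_processor, unused_kwargs), where *unused_kwargs* is a dictionary of the key/value pairs whose keys are not image processor attributes, i.e., the part of kwargs that was not used to update image_processor and is otherwise ignored.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
image_processor_filename (str, optional, defaults to "config.json") — The name of the file in the model directory to use for the image processor config.
kwargs (dict[str, Any], optional) — The values in `kwargs` of any keys that are image processor attributes are used to override the loaded values. Behavior for key/value pairs whose keys are *not* image processor attributes is controlled by the return_unused_kwargs keyword parameter.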

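Instantiate one of the image processor classes of the library from a pretrained model vocabulary.

The image processor class to instantiate is selected based on the `model_type` property of the config object (either passed as an argument or loaded from `pretrained_model_name_or_path` if possible). When that property is missing, the class is selected by falling back to pattern matching on `pretrained_model_name_or_path`: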

align — EfficientNetImageProcessor or EfficientNetImageProcessorFast (ALIGN model)
aria — AriaImageProcessor (Aria model)
beit — BeitImageProcessor or BeitImageProcessorFast (BEiT model)
bit — BitImageProcessor or BitImageProcessorFast (BiT model)
blip — BlipImageProcessor or BlipImageProcessorFast (BLIP model)
blip-2 — BlipImageProcessor or BlipImageProcessorFast (BLIP-2 model)
bridgetower — BridgeTowerImageProcessor or BridgeTowerImageProcessorFast (BridgeTower model)
chameleon — ChameleonImageProcessor (Chameleon model)
chinese_clip — ChineseCLIPImageProcessor or ChineseCLIPImageProcessorFast (Chinese-CLIP model)
clip — CLIPImageProcessor or CLIPImageProcessorFast (CLIP model)
clipseg — ViTImageProcessor or ViTImageProcessorFast (CLIPSeg model)
conditional_detr — ConditionalDetrImageProcessor or ConditionalDetrImageProcessorFast (Conditional DETR model)
convnext — ConvNextImageProcessor or ConvNextImageProcessorFast (ConvNeXT model)
convnextv2 — ConvNextImageProcessor or ConvNextImageProcessorFast (ConvNeXTV2 model)
cvt — ConvNextImageProcessor or ConvNextImageProcessorFast (CvT model)
data2vec-vision — BeitImageProcessor or BeitImageProcessorFast (Data2VecVision model)
deformable_detr — DeformableDetrImageProcessor or DeformableDetrImageProcessorFast (Deformable DETR model)
deit — DeiTImageProcessor or DeiTImageProcessorFast (DeiT model)
depth_anything — DPTImageProcessor or DPTImageProcessorFast (Depth Anything model)
depth_pro — DepthProImageProcessor or DepthProImageProcessorFast (DepthPro model)
deta — DetaImageProcessor (DETA model)
detr — DetrImageProcessor or DetrImageProcessorFast (DETR model)
dinat — ViTImageProcessor or ViTImageProcessorFast (DiNAT model)
dinov2 — BitImageProcessor or BitImageProcessorFast (DINOv2 model)
donut-swin — DonutImageProcessor or DonutImageProcessorFast (DonutSwin model)
dpt — DPTImageProcessor or DPTImageProcessorFast (DPT model)
efficientformer — EfficientFormerImageProcessor (EfficientFormer model)
efficientnet — EfficientNetImageProcessor or EfficientNetImageProcessorFast (EfficientNet model)
flava — FlavaImageProcessor or FlavaImageProcessorFast (FLAVA model)
focalnet — BitImageProcessor or BitImageProcessorFast (FocalNet model)
fuyu — FuyuImageProcessor (Fuyu model)
gemma3 — Gemma3ImageProcessor or Gemma3ImageProcessorFast (Gemma3ForConditionalGeneration model)
gemma3n — SiglipImageProcessor or SiglipImageProcessorFast (Gemma3nForConditionalGeneration model)
git — CLIPImageProcessor or CLIPImageProcessorFast (GIT model)
glm4v — Glm4vImageProcessor or Glm4vImageProcessorFast (GLM4V model)
glpn — GLPNImageProcessor (GLPN model)
got_ocr2 — GotOcr2ImageProcessor or GotOcr2ImageProcessorFast (GOT-OCR2 model)
grounding-dino — GroundingDinoImageProcessor or GroundingDinoImageProcessorFast (Grounding DINO model)
groupvit — CLIPImageProcessor or CLIPImageProcessorFast (GroupViT model)
hiera — BitImageProcessor or BitImageProcessorFast (Hiera model)
idefics — IdeficsImageProcessor (IDEFICS model)
idefics2 — Idefics2ImageProcessor or Idefics2ImageProcessorFast (Idefics2 model)
idefics3 — Idefics3ImageProcessor or Idefics3ImageProcessorFast (Idefics3 model)
ijepa — ViTImageProcessor or ViTImageProcessorFast (I-JEPA model)
imagegpt — ImageGPTImageProcessor (ImageGPT model)
instructblip — BlipImageProcessor or BlipImageProcessorFast (InstructBLIP model)
instructblipvideo — InstructBlipVideoImageProcessor (InstructBlipVideo model)
janus — JanusImageProcessor (Janus model)
kosmos-2 — CLIPImageProcessor or CLIPImageProcessorFast (KOSMOS-2 model)
layoutlmv2 — LayoutLMv2ImageProcessor or LayoutLMv2ImageProcessorFast (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3ImageProcessor or LayoutLMv3ImageProcessorFast (LayoutLMv3 model)
levit — LevitImageProcessor or LevitImageProcessorFast (LeViT model)
lightglue — LightGlueImageProcessor (LightGlue model)
llama4 — Llama4ImageProcessor or Llama4ImageProcessorFast (Llama4 model)
llava — LlavaImageProcessor or LlavaImageProcessorFast (LLaVa model)
llava_next — LlavaNextImageProcessor or LlavaNextImageProcessorFast (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoImageProcessor (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionImageProcessor or LlavaOnevisionImageProcessorFast (LLaVA-Onevision model)
mask2former — Mask2FormerImageProcessor (Mask2Former model)
maskformer — MaskFormerImageProcessor (MaskFormer model)
mgp-str — ViTImageProcessor or ViTImageProcessorFast (MGP-STR model)
mistral3 — PixtralImageProcessor or PixtralImageProcessorFast (Mistral3 model)
mlcd — CLIPImageProcessor or CLIPImageProcessorFast (MLCD model)
mllama — MllamaImageProcessor (Mllama model)
mobilenet_v1 — MobileNetV1ImageProcessor or MobileNetV1ImageProcessorFast (MobileNetV1 model)
mobilenet_v2 — MobileNetV2ImageProcessor or MobileNetV2ImageProcessorFast (MobileNetV2 model)
mobilevit — MobileViTImageProcessor (MobileViT model)
mobilevitv2 — MobileViTImageProcessor (MobileViTV2 model)
nat — ViTImageProcessor or ViTImageProcessorFast (NAT model)
nougat — NougatImageProcessor (Nougat model)
oneformer — OneFormerImageProcessor (OneFormer model)
owlv2 — Owlv2ImageProcessor (OWLv2 model)
owlvit — OwlViTImageProcessor or OwlViTImageProcessorFast (OWL-ViT model)
paligemma — SiglipImageProcessor or SiglipImageProcessorFast (PaliGemma model)
perceiver — PerceiverImageProcessor or PerceiverImageProcessorFast (Perceiver model)
phi4_multimodal — Phi4MultimodalImageProcessorFast (Phi4Multimodal model)
pix2struct — Pix2StructImageProcessor (Pix2Struct model)
pixtral — PixtralImageProcessor or PixtralImageProcessorFast (Pixtral model)
poolformer — PoolFormerImageProcessor or PoolFormerImageProcessorFast (PoolFormer model)
prompt_depth_anything — PromptDepthAnythingImageProcessor (PromptDepthAnything model)
pvt — PvtImageProcessor or PvtImageProcessorFast (PVT model)
pvt_v2 — PvtImageProcessor or PvtImageProcessorFast (PVTv2 model)
qwen2_5_vl — Qwen2VLImageProcessor or Qwen2VLImageProcessorFast (Qwen2_5_VL model)
qwen2_vl — Qwen2VLImageProcessor or Qwen2VLImageProcessorFast (Qwen2VL model)
regnet — ConvNextImageProcessor or ConvNextImageProcessorFast (RegNet model)
resnet — ConvNextImageProcessor or ConvNextImageProcessorFast (ResNet model)
rt_detr — RTDetrImageProcessor or RTDetrImageProcessorFast (RT-DETR model)
sam — SamImageProcessor (SAM model)
sam_hq — SamImageProcessor (SAM-HQ model)
segformer — SegformerImageProcessor (SegFormer model)
seggpt — SegGptImageProcessor (SegGPT model)
shieldgemma2 — Gemma3ImageProcessor or Gemma3ImageProcessorFast (Shieldgemma2 model)
siglip — SiglipImageProcessor or SiglipImageProcessorFast (SigLIP model)
siglip2 — Siglip2ImageProcessor or Siglip2ImageProcessorFast (SigLIP2 model)
smolvlm — SmolVLMImageProcessor or SmolVLMImageProcessorFast (SmolVLM model)
superglue — SuperGlueImageProcessor (SuperGlue model)
swiftformer — ViTImageProcessor or ViTImageProcessorFast (SwiftFormer model)
swin — ViTImageProcessor or ViTImageProcessorFast (Swin Transformer model)
swin2sr — Swin2SRImageProcessor or Swin2SRImageProcessorFast (Swin2SR model)
swinv2 — ViTImageProcessor or ViTImageProcessorFast (Swin Transformer V2 model)
table-transformer — DetrImageProcessor (Table Transformer model)
timesformer — VideoMAEImageProcessor (TimeSformer model)
timm_wrapper — TimmWrapperImageProcessor (TimmWrapperModel model)
tvlt — TvltImageProcessor (TVLT model)
tvp — TvpImageProcessor (TVP model)
udop — LayoutLMv3ImageProcessor or LayoutLMv3ImageProcessorFast (UDOP model)
upernet — SegformerImageProcessor (UPerNet model)
van — ConvNextImageProcessor or ConvNextImageProcessorFast (VAN model)
videomae — VideoMAEImageProcessor (VideoMAE model)
vilt — ViltImageProcessor or ViltImageProcessorFast (ViLT model)
vipllava — CLIPImageProcessor or CLIPImageProcessorFast (VipLlava model)
vit — ViTImageProcessor or ViTImageProcessorFast (ViT model)
vit_hybrid — ViTHybridImageProcessor (ViT Hybrid model)
vit_mae — ViTImageProcessor or ViTImageProcessorFast (ViTMAE model)
vit_msn — ViTImageProcessor or ViTImageProcessorFast (ViTMSN model)
vitmatte — VitMatteImageProcessor or VitMatteImageProcessorFast (ViTMatte model)
xclip — CLIPImageProcessor or CLIPImageProcessorFast (X-CLIP model)
yolos — YolosImageProcessor or YolosImageProcessorFast (YOLOS model)
zoedepth — ZoeDepthImageProcessor or ZoeDepthImageProcessorFast (ZoeDepth model)

Passing token=True is required when you want to use a private model.

Example

>>> from transformers import AutoImageProcessor

>>> # Download image processor from huggingface.co and cache.
>>> image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")

>>> # If image processor files are in a directory (e.g. image processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # image_processor = AutoImageProcessor.from_pretrained("./test/saved_model/")

register

( config_class image_processor_class = None slow_image_processor_class = None fast_image_processor_class = None exist_ok = False )

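Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
image_processor_class (ImageProcessingMixin) — The image processor to register.

Register a new image processor for this class.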
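
The registration step can be pictured as adding an entry to a mapping from configuration classes to processor classes. The following is a minimal sketch of that idea only, not the library's implementation: ToyConfig, ToyImageProcessor and ToyRegistry are hypothetical names, and the assumption that exist_ok=False rejects a duplicate registration is inferred from the parameter name.

```python
class ToyConfig:
    """Hypothetical stand-in for a PretrainedConfig subclass."""
    model_type = "toy"


class ToyImageProcessor:
    """Hypothetical stand-in for an image processor class."""


class ToyRegistry:
    """Maps configuration classes to processor classes (illustration only)."""

    def __init__(self):
        self._mapping = {}

    def register(self, config_class, processor_class, exist_ok=False):
        # Assumed behavior: refuse to overwrite an entry unless exist_ok is set.
        if config_class in self._mapping and not exist_ok:
            raise ValueError(f"{config_class.__name__} is already registered")
        self._mapping[config_class] = processor_class

    def lookup(self, config):
        # Resolve the processor class from the type of the config instance.
        return self._mapping[type(config)]


registry = ToyRegistry()
registry.register(ToyConfig, ToyImageProcessor)
resolved = registry.lookup(ToyConfig())
```

With the real classes, the equivalent call would be AutoImageProcessor.register(...) using the signature shown above.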

AutoVideoProcessor

class transformers.AutoVideoProcessor

( )

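This is a generic video processor class that will be instantiated as one of the video processor classes of the library when created with the AutoVideoProcessor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).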

from_pretrained

( pretrained_model_name_or_path *inputs **kwargs )

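Parameters

pretrained_model_name_or_path (str or os.PathLike) — This can be either:
- a string, the model id of a pretrained video_processor hosted inside a model repo on huggingface.co.
- a path to a directory containing video processor files saved using the save_pretrained() method, e.g., ./my_model_directory/.
- a path or URL to a saved video processor JSON file, e.g., ./my_model_directory/preprocessor_config.json.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model video processor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the video processor files and override the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final video processor object. If True, this function returns a Tuple(video_processor, unused_kwargs), where unused_kwargs is a dictionary of the key/value pairs whose keys are not video processor attributes: i.e., the part of kwargs that has not been used to update video_processor and is otherwise ignored.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (dict[str, Any], optional) — The values in kwargs of any keys that are video processor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not video processor attributes is controlled by the return_unused_kwargs keyword argument.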
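
The interaction between kwargs and return_unused_kwargs described above can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: ToyVideoProcessor and apply_kwargs are hypothetical names, and the attribute check via hasattr is only one plausible way to decide which keys count as processor attributes.

```python
class ToyVideoProcessor:
    """Hypothetical processor with two attributes that kwargs may override."""

    def __init__(self, size=224, do_resize=True):
        self.size = size
        self.do_resize = do_resize


def apply_kwargs(processor, return_unused_kwargs=False, **kwargs):
    """Override matching attributes; collect the keys that match nothing."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(processor, key):
            setattr(processor, key, value)  # key is a processor attribute
        else:
            unused[key] = value  # key is not an attribute, so it is left unused
    if return_unused_kwargs:
        return processor, unused
    return processor


processor, unused = apply_kwargs(
    ToyVideoProcessor(), return_unused_kwargs=True, size=336, foo="bar"
)
```

Here size overrides the loaded value, while foo ends up in the unused dictionary because it is not an attribute of the processor.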

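Instantiate one of the video processor classes of the library from a pretrained model.

The video processor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it is missing, by falling back to pattern matching on pretrained_model_name_or_path: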

glm4v — Glm4vVideoProcessor (GLM4V model)
instructblip — InstructBlipVideoVideoProcessor (InstructBLIP model)
instructblipvideo — InstructBlipVideoVideoProcessor (InstructBlipVideo model)
internvl — InternVLVideoProcessor (InternVL model)
llava_next_video — LlavaNextVideoVideoProcessor (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionVideoProcessor (LLaVA-Onevision model)
qwen2_5_omni — Qwen2VLVideoProcessor (Qwen2_5Omni model)
qwen2_5_vl — Qwen2VLVideoProcessor (Qwen2_5_VL model)
qwen2_vl — Qwen2VLVideoProcessor (Qwen2VL model)
smolvlm — SmolVLMVideoProcessor (SmolVLM model)
video_llava — VideoLlavaVideoProcessor (VideoLlava model)
vjepa2 — VJEPA2VideoProcessor (VJEPA2Model model)
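
The selection rule (use model_type when it is known, otherwise pattern-match the name or path) can be sketched as follows. This is a simplified illustration, not the library's actual lookup: resolve_processor_name is a hypothetical helper, class names are kept as plain strings, only three entries of the mapping are reproduced, and matching longer keys first after normalizing hyphens is an assumption of this sketch.

```python
# A small excerpt of the mapping above, with class names as plain strings.
VIDEO_PROCESSOR_NAMES = {
    "llava_onevision": "LlavaOnevisionVideoProcessor",
    "qwen2_vl": "Qwen2VLVideoProcessor",
    "smolvlm": "SmolVLMVideoProcessor",
}


def resolve_processor_name(name_or_path, model_type=None):
    """Pick a class name from model_type, else by matching the name or path."""
    if model_type is not None:
        return VIDEO_PROCESSOR_NAMES[model_type]
    # Fallback: look for a known key inside the (normalized) name or path.
    normalized = str(name_or_path).lower().replace("-", "_")
    for key in sorted(VIDEO_PROCESSOR_NAMES, key=len, reverse=True):
        if key in normalized:
            return VIDEO_PROCESSOR_NAMES[key]
    raise ValueError(f"Could not infer a video processor for {name_or_path!r}")


by_type = resolve_processor_name("any/path", model_type="qwen2_vl")
by_pattern = resolve_processor_name("org/smolvlm-demo")  # hypothetical repo id
```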

Passing token=True is required when you want to use a private model.

Example

>>> from transformers import AutoVideoProcessor

>>> # Download video processor from huggingface.co and cache.
>>> video_processor = AutoVideoProcessor.from_pretrained("llava-hf/llava-onevision-qwen2-0.5b-ov-hf")

>>> # If video processor files are in a directory (e.g. video processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # video_processor = AutoVideoProcessor.from_pretrained("./test/saved_model/")

register

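( config_class video_processor_class exist_ok = False )

Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
video_processor_class (BaseVideoProcessor) — The video processor to register.

Register a new video processor for this class.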

AutoProcessor

class transformers.AutoProcessor

( )

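This is a generic processor class that will be instantiated as one of the processor classes of the library when created with the AutoProcessor.from_pretrained() class method.

This class cannot be instantiated directly using __init__() (throws an error).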

from_pretrained

( pretrained_model_name_or_path **kwargs )

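Parameters

pretrained_model_name_or_path (str or os.PathLike) — This can be either:
- a string, the model id of a pretrained feature_extractor hosted inside a model repo on huggingface.co.
- a path to a directory containing processor files saved using the save_pretrained() method, e.g., ./my_model_directory/.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model feature extractor should be cached if the standard cache should not be used.
force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the feature extractor files and override the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
token (str or bool, optional) — The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running huggingface-cli login (stored in ~/.huggingface).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
return_unused_kwargs (bool, optional, defaults to False) — If False, this function returns just the final feature extractor object. If True, this function returns a Tuple(feature_extractor, unused_kwargs), where unused_kwargs is a dictionary of the key/value pairs whose keys are not feature extractor attributes: i.e., the part of kwargs that has not been used to update feature_extractor and is otherwise ignored.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
kwargs (dict[str, Any], optional) — The values in kwargs of any keys that are feature extractor attributes will be used to override the loaded values. Behavior concerning key/value pairs whose keys are not feature extractor attributes is controlled by the return_unused_kwargs keyword argument.

Instantiate one of the processor classes of the library from a pretrained model vocabulary.

The processor class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible):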

align — AlignProcessor (ALIGN model)
altclip — AltCLIPProcessor (AltCLIP model)
aria — AriaProcessor (Aria model)
aya_vision — AyaVisionProcessor (AyaVision model)
bark — BarkProcessor (Bark model)
blip — BlipProcessor (BLIP model)
blip-2 — Blip2Processor (BLIP-2 model)
bridgetower — BridgeTowerProcessor (BridgeTower model)
chameleon — ChameleonProcessor (Chameleon model)
chinese_clip — ChineseCLIPProcessor (Chinese-CLIP model)
clap — ClapProcessor (CLAP model)
clip — CLIPProcessor (CLIP model)
clipseg — CLIPSegProcessor (CLIPSeg model)
clvp — ClvpProcessor (CLVP model)
colpali — ColPaliProcessor (ColPali model)
colqwen2 — ColQwen2Processor (ColQwen2 model)
dia — DiaProcessor (Dia model)
emu3 — Emu3Processor (Emu3 model)
flava — FlavaProcessor (FLAVA model)
fuyu — FuyuProcessor (Fuyu model)
gemma3 — Gemma3Processor (Gemma3ForConditionalGeneration model)
gemma3n — Gemma3nProcessor (Gemma3nForConditionalGeneration model)
git — GitProcessor (GIT model)
glm4v — Glm4vProcessor (GLM4V model)
got_ocr2 — GotOcr2Processor (GOT-OCR2 model)
granite_speech — GraniteSpeechProcessor (GraniteSpeech model)
grounding-dino — GroundingDinoProcessor (Grounding DINO model)
groupvit — CLIPProcessor (GroupViT model)
hubert — Wav2Vec2Processor (Hubert model)
idefics — IdeficsProcessor (IDEFICS model)
idefics2 — Idefics2Processor (Idefics2 model)
idefics3 — Idefics3Processor (Idefics3 model)
instructblip — InstructBlipProcessor (InstructBLIP model)
instructblipvideo — InstructBlipVideoProcessor (InstructBlipVideo model)
internvl — InternVLProcessor (InternVL model)
janus — JanusProcessor (Janus model)
kosmos-2 — Kosmos2Processor (KOSMOS-2 model)
kyutai_speech_to_text — KyutaiSpeechToTextProcessor (KyutaiSpeechToText model)
layoutlmv2 — LayoutLMv2Processor (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3Processor (LayoutLMv3 model)
llama4 — Llama4Processor (Llama4 model)
llava — LlavaProcessor (LLaVa model)
llava_next — LlavaNextProcessor (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoProcessor (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionProcessor (LLaVA-Onevision model)
markuplm — MarkupLMProcessor (MarkupLM model)
mctct — MCTCTProcessor (M-CTC-T model)
mgp-str — MgpstrProcessor (MGP-STR model)
mistral3 — PixtralProcessor (Mistral3 model)
mllama — MllamaProcessor (Mllama model)
moonshine — Wav2Vec2Processor (Moonshine model)
oneformer — OneFormerProcessor (OneFormer model)
owlv2 — Owlv2Processor (OWLv2 model)
owlvit — OwlViTProcessor (OWL-ViT model)
paligemma — PaliGemmaProcessor (PaliGemma model)
phi4_multimodal — Phi4MultimodalProcessor (Phi4Multimodal model)
pix2struct — Pix2StructProcessor (Pix2Struct model)
pixtral — PixtralProcessor (Pixtral model)
pop2piano — Pop2PianoProcessor (Pop2Piano model)
qwen2_5_omni — Qwen2_5OmniProcessor (Qwen2_5Omni model)
qwen2_5_vl — Qwen2_5_VLProcessor (Qwen2_5_VL model)
qwen2_audio — Qwen2AudioProcessor (Qwen2Audio model)
qwen2_vl — Qwen2VLProcessor (Qwen2VL model)
sam — SamProcessor (SAM model)
sam_hq — SamHQProcessor (SAM-HQ model)
seamless_m4t — SeamlessM4TProcessor (SeamlessM4T model)
sew — Wav2Vec2Processor (SEW model)
sew-d — Wav2Vec2Processor (SEW-D model)
shieldgemma2 — ShieldGemma2Processor (Shieldgemma2 model)
siglip — SiglipProcessor (SigLIP model)
siglip2 — Siglip2Processor (SigLIP2 model)
smolvlm — SmolVLMProcessor (SmolVLM model)
speech_to_text — Speech2TextProcessor (Speech2Text model)
speech_to_text_2 — Speech2Text2Processor (Speech2Text2 model)
speecht5 — SpeechT5Processor (SpeechT5 model)
trocr — TrOCRProcessor (TrOCR model)
tvlt — TvltProcessor (TVLT model)
tvp — TvpProcessor (TVP model)
udop — UdopProcessor (UDOP model)
unispeech — Wav2Vec2Processor (UniSpeech model)
unispeech-sat — Wav2Vec2Processor (UniSpeechSat model)
video_llava — VideoLlavaProcessor (VideoLlava model)
vilt — ViltProcessor (ViLT model)
vipllava — LlavaProcessor (VipLlava model)
vision-text-dual-encoder — VisionTextDualEncoderProcessor (VisionTextDualEncoder model)
wav2vec2 — Wav2Vec2Processor (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2Processor (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2Processor (Wav2Vec2-Conformer model)
wavlm — Wav2Vec2Processor (WavLM model)
whisper — WhisperProcessor (Whisper model)
xclip — XCLIPProcessor (X-CLIP model)

Passing token=True is required when you want to use a private model.

Example

>>> from transformers import AutoProcessor

>>> # Download processor from huggingface.co and cache.
>>> processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base-960h")

>>> # If processor files are in a directory (e.g. processor was saved using *save_pretrained('./test/saved_model/')*)
>>> # processor = AutoProcessor.from_pretrained("./test/saved_model/")

register

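( config_class processor_class exist_ok = False )

Parameters

config_class (PretrainedConfig) — The configuration corresponding to the model to register.
processor_class (ProcessorMixin) — The processor to register.

Register a new processor for this class.

Generic model classes

The following auto classes are available for instantiating a base model class without a specific head.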

AutoModel

class transformers.AutoModel

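( *args **kwargs )

This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).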

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- ASTConfig configuration class: ASTModel (Audio Spectrogram Transformer model)
- AlbertConfig configuration class: AlbertModel (ALBERT model)
- AlignConfig configuration class: AlignModel (ALIGN model)
- AltCLIPConfig configuration class: AltCLIPModel (AltCLIP model)
- ArceeConfig configuration class: ArceeModel (Arcee model)
- AriaConfig configuration class: AriaModel (Aria model)
- AriaTextConfig configuration class: AriaTextModel (AriaText model)
- AutoformerConfig configuration class: AutoformerModel (Autoformer model)
- AyaVisionConfig configuration class: AyaVisionModel (AyaVision model)
- BambaConfig configuration class: BambaModel (Bamba model)
- BarkConfig configuration class: BarkModel (Bark model)
- BartConfig configuration class: BartModel (BART model)
- BeitConfig configuration class: BeitModel (BEiT model)
- BertConfig configuration class: BertModel (BERT model)
- BertGenerationConfig configuration class: BertGenerationEncoder (Bert Generation model)
- BigBirdConfig configuration class: BigBirdModel (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusModel (BigBird-Pegasus model)
- BioGptConfig configuration class: BioGptModel (BioGpt model)
- BitConfig configuration class: BitModel (BiT model)
- BitNetConfig configuration class: BitNetModel (BitNet model)
- BlenderbotConfig configuration class: BlenderbotModel (Blenderbot model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallModel (BlenderbotSmall model)
- Blip2Config configuration class: Blip2Model (BLIP-2 model)
- Blip2QFormerConfig configuration class: Blip2QFormerModel (BLIP-2 QFormer model)
- BlipConfig configuration class: BlipModel (BLIP model)
- BloomConfig configuration class: BloomModel (BLOOM model)
- BridgeTowerConfig configuration class: BridgeTowerModel (BridgeTower model)
- BrosConfig configuration class: BrosModel (BROS model)
- CLIPConfig configuration class: CLIPModel (CLIP model)
- CLIPSegConfig configuration class: CLIPSegModel (CLIPSeg model)
- CLIPTextConfig configuration class: CLIPTextModel (CLIPTextModel model)
- CLIPVisionConfig configuration class: CLIPVisionModel (CLIPVisionModel model)
- CTRLConfig configuration class: CTRLModel (CTRL model)
- CamembertConfig configuration class: CamembertModel (CamemBERT model)
- CanineConfig configuration class: CanineModel (CANINE model)
- ChameleonConfig configuration class: ChameleonModel (Chameleon model)
- ChineseCLIPConfig configuration class: ChineseCLIPModel (Chinese-CLIP model)
- ChineseCLIPVisionConfig configuration class: ChineseCLIPVisionModel (ChineseCLIPVisionModel model)
- ClapConfig configuration class: ClapModel (CLAP model)
- ClvpConfig configuration class: ClvpModelForConditionalGeneration (CLVP model)
- CodeGenConfig configuration class: CodeGenModel (CodeGen model)
- Cohere2Config configuration class: Cohere2Model (Cohere2 model)
- CohereConfig configuration class: CohereModel (Cohere model)
- ConditionalDetrConfig configuration class: ConditionalDetrModel (Conditional DETR model)
- ConvBertConfig configuration class: ConvBertModel (ConvBERT model)
- ConvNextConfig configuration class: ConvNextModel (ConvNeXT model)
- ConvNextV2Config configuration class: ConvNextV2Model (ConvNeXTV2 model)
- CpmAntConfig configuration class: CpmAntModel (CPM-Ant model)
- CsmConfig configuration class: CsmForConditionalGeneration (CSM model)
- CvtConfig configuration class: CvtModel (CvT model)
- DFineConfig configuration class: DFineModel (D-FINE model)
- DPRConfig configuration class: DPRQuestionEncoder (DPR model)
- DPTConfig configuration class: DPTModel (DPT model)
- DabDetrConfig configuration class: DabDetrModel (DAB-DETR model)
- DacConfig configuration class: DacModel (DAC model)
- Data2VecAudioConfig configuration class: Data2VecAudioModel (Data2VecAudio model)
- Data2VecTextConfig configuration class: Data2VecTextModel (Data2VecText model)
- Data2VecVisionConfig configuration class: Data2VecVisionModel (Data2VecVision model)
- DbrxConfig configuration class: DbrxModel (DBRX model)
- DebertaConfig configuration class: DebertaModel (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2Model (DeBERTa-v2 model)
- DecisionTransformerConfig configuration class: DecisionTransformerModel (Decision Transformer model)
- DeepseekV3Config configuration class: DeepseekV3Model (DeepSeek-V3 model)
- DeformableDetrConfig configuration class: DeformableDetrModel (Deformable DETR model)
- DeiTConfig configuration class: DeiTModel (DeiT model)
- DepthProConfig configuration class: DepthProModel (DepthPro model)
- DetaConfig configuration class: DetaModel (DETA model)
- DetrConfig configuration class: DetrModel (DETR model)
- DiaConfig configuration class: DiaModel (Dia model)
- DiffLlamaConfig configuration class: DiffLlamaModel (DiffLlama model)
- DinatConfig configuration class: DinatModel (DiNAT model)
- Dinov2Config configuration class: Dinov2Model (DINOv2 model)
- Dinov2WithRegistersConfig configuration class: Dinov2WithRegistersModel (DINOv2 with Registers model)
- DistilBertConfig configuration class: DistilBertModel (DistilBERT model)
- DonutSwinConfig configuration class: DonutSwinModel (DonutSwin model)
- Dots1Config configuration class: Dots1Model (dots1 model)
- EfficientFormerConfig configuration class: EfficientFormerModel (EfficientFormer model)
- EfficientNetConfig configuration class: EfficientNetModel (EfficientNet model)
- ElectraConfig configuration class: ElectraModel (ELECTRA model)
- Emu3Config configuration class: Emu3Model (Emu3 model)
- EncodecConfig configuration class: EncodecModel (EnCodec model)
- ErnieConfig configuration class: ErnieModel (ERNIE model)
- ErnieMConfig configuration class: ErnieMModel (ErnieM model)
- EsmConfig configuration class: EsmModel (ESM model)
- FNetConfig configuration class: FNetModel (FNet model)
- FSMTConfig configuration class: FSMTModel (FairSeq Machine-Translation model)
- FalconConfig configuration class: FalconModel (Falcon model)
- FalconH1Config configuration class: FalconH1Model (FalconH1 model)
- FalconMambaConfig configuration class: FalconMambaModel (FalconMamba model)
- FastSpeech2ConformerConfig configuration class: FastSpeech2ConformerModel (FastSpeech2Conformer model)
- FlaubertConfig configuration class: FlaubertModel (FlauBERT model)
- FlavaConfig configuration class: FlavaModel (FLAVA model)
- FocalNetConfig configuration class: FocalNetModel (FocalNet model)
- FunnelConfig configuration class: FunnelModel or FunnelBaseModel (Funnel Transformer model)
- FuyuConfig configuration class: FuyuModel (Fuyu model)
- GLPNConfig configuration class: GLPNModel (GLPN model)
- GPT2Config configuration class: GPT2Model (OpenAI GPT-2 model)
- GPTBigCodeConfig configuration class: GPTBigCodeModel (GPTBigCode model)
- GPTJConfig configuration class: GPTJModel (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoModel (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXModel (GPT NeoX model)
- GPTNeoXJapaneseConfig configuration class: GPTNeoXJapaneseModel (GPT NeoX Japanese model)
- GPTSanJapaneseConfig configuration class: GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
- Gemma2Config configuration class: Gemma2Model (Gemma2 model)
- Gemma3Config configuration class: Gemma3Model (Gemma3ForConditionalGeneration model)
- Gemma3TextConfig configuration class: Gemma3TextModel (Gemma3ForCausalLM model)
- Gemma3nAudioConfig configuration class: Gemma3nAudioEncoder (Gemma3nAudioEncoder model)
- Gemma3nConfig configuration class: Gemma3nModel (Gemma3nForConditionalGeneration model)
- Gemma3nTextConfig configuration class: Gemma3nTextModel (Gemma3nForCausalLM model)
- Gemma3nVisionConfig configuration class: TimmWrapperModel (TimmWrapperModel model)
- GemmaConfig configuration class: GemmaModel (Gemma model)
- GitConfig configuration class: GitModel (GIT model)
- Glm4Config configuration class: Glm4Model (GLM4 model)
- Glm4vConfig configuration class: Glm4vModel (GLM4V model)
- Glm4vTextConfig configuration class: Glm4vTextModel (GLM4V model)
- GlmConfig configuration class: GlmModel (GLM model)
- GotOcr2Config configuration class: GotOcr2Model (GOT-OCR2 model)
- GraniteConfig configuration class: GraniteModel (Granite model)
- GraniteMoeConfig configuration class: GraniteMoeModel (GraniteMoeMoe model)
- GraniteMoeHybridConfig configuration class: GraniteMoeHybridModel (GraniteMoeHybrid model)
- GraniteMoeSharedConfig configuration class: GraniteMoeSharedModel (GraniteMoeSharedMoe model)
- GraphormerConfig configuration class: GraphormerModel (Graphormer model)
- GroundingDinoConfig configuration class: GroundingDinoModel (Grounding DINO model)
- GroupViTConfig configuration class: GroupViTModel (GroupViT model)
- HGNetV2Config configuration class: HGNetV2Backbone (HGNet-V2 model)
- HeliumConfig configuration class: HeliumModel (Helium model)
- HieraConfig configuration class: HieraModel (Hiera model)
- HubertConfig configuration class: HubertModel (Hubert model)
- IBertConfig configuration class: IBertModel (I-BERT model)
- IJepaConfig configuration class: IJepaModel (I-JEPA model)
- Idefics2Config configuration class: Idefics2Model (Idefics2 model)
- Idefics3Config configuration class: Idefics3Model (Idefics3 model)
- Idefics3VisionConfig configuration class: Idefics3VisionTransformer (Idefics3VisionTransformer model)
- IdeficsConfig configuration class: IdeficsModel (IDEFICS model)
- ImageGPTConfig configuration class: ImageGPTModel (ImageGPT model)
- InformerConfig configuration class: InformerModel (Informer model)
- InstructBlipConfig configuration class: InstructBlipModel (InstructBLIP model)
- InstructBlipVideoConfig configuration class: InstructBlipVideoModel (InstructBlipVideo model)
- InternVLConfig configuration class: InternVLModel (InternVL model)
- InternVLVisionConfig configuration class: InternVLVisionModel (InternVLVision model)
- JambaConfig configuration class: JambaModel (Jamba model)
- JanusConfig configuration class: JanusModel (Janus model)
- JetMoeConfig configuration class: JetMoeModel (JetMoe model)
- JukeboxConfig configuration class: JukeboxModel (Jukebox model)
- Kosmos2Config configuration class: Kosmos2Model (KOSMOS-2 model)
- KyutaiSpeechToTextConfig configuration class: KyutaiSpeechToTextModel (KyutaiSpeechToText model)
- LEDConfig configuration class: LEDModel (LED model)
- LayoutLMConfig configuration class: LayoutLMModel (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2Model (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3Model (LayoutLMv3 model)
- LevitConfig configuration class: LevitModel (LeViT model)
- LightGlueConfig configuration class: LightGlueForKeypointMatching (LightGlue model)
- LiltConfig configuration class: LiltModel (LiLT model)
- Llama4Config configuration class: Llama4ForConditionalGeneration (Llama4 model)
- Llama4TextConfig configuration class: Llama4TextModel (Llama4ForCausalLM model)
- LlamaConfig configuration class: LlamaModel (LLaMA model)
- LlavaConfig configuration class: LlavaModel (LLaVa model)
- LlavaNextConfig configuration class: LlavaNextModel (LLaVA-NeXT model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoModel (LLaVa-NeXT-Video model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionModel (LLaVA-Onevision model)
- LongT5Config configuration class: LongT5Model (LongT5 model)
- LongformerConfig configuration class: LongformerModel (Longformer model)
- LukeConfig configuration class: LukeModel (LUKE model)
- LxmertConfig configuration class: LxmertModel (LXMERT model)
- M2M100Config configuration class: M2M100Model (M2M100 model)
- MBartConfig configuration class: MBartModel (mBART model)
- MCTCTConfig configuration class: MCTCTModel (M-CTC-T model)
- MLCDVisionConfig configuration class: MLCDVisionModel (MLCD model)
- MPNetConfig configuration class: MPNetModel (MPNet model)
- MT5Config configuration class: MT5Model (MT5 model)
- Mamba2Config configuration class: Mamba2Model (mamba2 model)
- MambaConfig configuration class: MambaModel (Mamba model)
- MarianConfig configuration class: MarianModel (Marian model)
- MarkupLMConfig configuration class: MarkupLMModel (MarkupLM model)
- Mask2FormerConfig configuration class: Mask2FormerModel (Mask2Former model)
- MaskFormerConfig configuration class: MaskFormerModel (MaskFormer model)
- MaskFormerSwinConfig configuration class: MaskFormerSwinModel (MaskFormerSwin model)
- MegaConfig configuration class: MegaModel (MEGA model)
- MegatronBertConfig configuration class: MegatronBertModel (Megatron-BERT model)
- MgpstrConfig configuration class: MgpstrForSceneTextRecognition (MGP-STR model)
- MimiConfig configuration class: MimiModel (Mimi model)
- MiniMaxConfig configuration class: MiniMaxModel (MiniMax model)
- Mistral3Config configuration class: Mistral3Model (Mistral3 model)
- MistralConfig configuration class: MistralModel (Mistral model)
- MixtralConfig configuration class: MixtralModel (Mixtral model)
- MllamaConfig configuration class: MllamaModel (Mllama model)
- MobileBertConfig configuration class: MobileBertModel (MobileBERT model)
- MobileNetV1Config configuration class: MobileNetV1Model (MobileNetV1 model)
- MobileNetV2Config configuration class: MobileNetV2Model (MobileNetV2 model)
- MobileViTConfig configuration class: MobileViTModel (MobileViT model)
- MobileViTV2Config configuration class: MobileViTV2Model (MobileViTV2 model)
- ModernBertConfig configuration class: ModernBertModel (ModernBERT model)
- MoonshineConfig configuration class: MoonshineModel (Moonshine model)
- MoshiConfig configuration class: MoshiModel (Moshi model)
- MptConfig configuration class: MptModel (MPT model)
- MraConfig configuration class: MraModel (MRA model)
- MusicgenConfig configuration class: MusicgenModel (MusicGen model)
- MusicgenMelodyConfig configuration class: MusicgenMelodyModel (MusicGen Melody model)
- MvpConfig configuration class: MvpModel (MVP model)
- NatConfig configuration class: NatModel (NAT model)
- NemotronConfig configuration class: NemotronModel (Nemotron model)
- NezhaConfig configuration class: NezhaModel (Nezha model)
- NllbMoeConfig configuration class: NllbMoeModel (NLLB-MOE model)
- NystromformerConfig configuration class: NystromformerModel (Nyströmformer model)
- OPTConfig configuration class: OPTModel (OPT model)
- Olmo2Config configuration class: Olmo2Model (OLMo2 model)
- OlmoConfig configuration class: OlmoModel (OLMo model)
- OlmoeConfig configuration class: OlmoeModel (OLMoE model)
- OmDetTurboConfig configuration class: OmDetTurboForObjectDetection (OmDet-Turbo model)
- OneFormerConfig configuration class: OneFormerModel (OneFormer model)
- OpenAIGPTConfig configuration class: OpenAIGPTModel (OpenAI GPT model)
- OpenLlamaConfig configuration class: OpenLlamaModel (OpenLlama model)
- OwlViTConfig configuration class: OwlViTModel (OWL-ViT model)
- Owlv2Config configuration class: Owlv2Model (OWLv2 model)
- PLBartConfig configuration class: PLBartModel (PLBart model)
- PaliGemmaConfig configuration class: PaliGemmaModel (PaliGemma model)
- PatchTSMixerConfig configuration class: PatchTSMixerModel (PatchTSMixer model)
- PatchTSTConfig configuration class: PatchTSTModel (PatchTST model)
- PegasusConfig configuration class: PegasusModel (Pegasus model)
- PegasusXConfig configuration class: PegasusXModel (PEGASUS-X model)
- PerceiverConfig configuration class: PerceiverModel (Perceiver model)
- PersimmonConfig configuration class: PersimmonModel (Persimmon model)
- Phi3Config configuration class: Phi3Model (Phi3 model)
- Phi4MultimodalConfig configuration class: Phi4MultimodalModel (Phi4Multimodal model)
- PhiConfig configuration class: PhiModel (Phi model)
- PhimoeConfig configuration class: PhimoeModel (Phimoe model)
- PixtralVisionConfig configuration class: PixtralVisionModel (Pixtral model)
- PoolFormerConfig configuration class: PoolFormerModel (PoolFormer model)
- ProphetNetConfig configuration class: ProphetNetModel (ProphetNet model)
- PvtConfig configuration class: PvtModel (PVT model)
- PvtV2Config configuration class: PvtV2Model (PVTv2 model)
- QDQBertConfig configuration class: QDQBertModel (QDQBert model)
- Qwen2AudioEncoderConfig configuration class: Qwen2AudioEncoder (Qwen2AudioEncoder model)
- Qwen2Config configuration class: Qwen2Model (Qwen2 model)
- Qwen2MoeConfig configuration class: Qwen2MoeModel (Qwen2MoE model)
- Qwen2VLConfig configuration class: Qwen2VLModel (Qwen2VL model)
- Qwen2VLTextConfig configuration class: Qwen2VLTextModel (Qwen2VL model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLModel (Qwen2_5_VL model)
- Qwen2_5_VLTextConfig configuration class: Qwen2_5_VLTextModel (Qwen2_5_VL model)
- Qwen3Config configuration class: Qwen3Model (Qwen3 model)
- Qwen3MoeConfig configuration class: Qwen3MoeModel (Qwen3MoE model)
- RTDetrConfig configuration class: RTDetrModel (RT-DETR model)
- RTDetrV2Config configuration class: RTDetrV2Model (RT-DETRv2 model)
- RecurrentGemmaConfig configuration class: RecurrentGemmaModel (RecurrentGemma model)
- ReformerConfig configuration class: ReformerModel (Reformer model)
- RegNetConfig configuration class: RegNetModel (RegNet model)
- RemBertConfig configuration class: RemBertModel (RemBERT model)
- ResNetConfig configuration class: ResNetModel (ResNet model)
- RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
- RoCBertConfig configuration class: RoCBertModel (RoCBert model)
- RoFormerConfig configuration class: RoFormerModel (RoFormer model)
- RobertaConfig configuration class: RobertaModel (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
- RwkvConfig configuration class: RwkvModel (RWKV model)
- SEWConfig configuration class: SEWModel (SEW model)
- SEWDConfig configuration class: SEWDModel (SEW-D model)
- SamConfig configuration class: SamModel (SAM model)
- SamHQConfig configuration class: SamHQModel (SAM-HQ model)
- SamHQVisionConfig configuration class: SamHQVisionModel (SamHQVisionModel model)
- SamVisionConfig configuration class: SamVisionModel (SamVisionModel model)
- SeamlessM4TConfig configuration class: SeamlessM4TModel (SeamlessM4T model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2Model (SeamlessM4Tv2 model)
- SegGptConfig configuration class: SegGptModel (SegGPT model)
- SegformerConfig configuration class: SegformerModel (SegFormer model)
- Siglip2Config configuration class: Siglip2Model (SigLIP2 model)
- SiglipConfig configuration class: SiglipModel (SigLIP model)
- SiglipVisionConfig configuration class: SiglipVisionModel (SiglipVisionModel model)
- SmolLM3Config configuration class: SmolLM3Model (SmolLM3 model)
- SmolVLMConfig configuration class: SmolVLMModel (SmolVLM model)
- SmolVLMVisionConfig configuration class: SmolVLMVisionTransformer (SmolVLMVisionTransformer model)
- Speech2TextConfig configuration class: Speech2TextModel (Speech2Text model)
- SpeechT5Config configuration class: SpeechT5Model (SpeechT5 model)
- SplinterConfig configuration class: SplinterModel (Splinter model)
- SqueezeBertConfig configuration class: SqueezeBertModel (SqueezeBERT model)
- StableLmConfig configuration class: StableLmModel (StableLm model)
- Starcoder2Config configuration class: Starcoder2Model (Starcoder2 model)
- SuperGlueConfig configuration class: SuperGlueForKeypointMatching (SuperGlue model)
- SwiftFormerConfig configuration class: SwiftFormerModel (SwiftFormer model)
- Swin2SRConfig configuration class: Swin2SRModel (Swin2SR model)
- SwinConfig configuration class: SwinModel (Swin Transformer model)
- Swinv2Config configuration class: Swinv2Model (Swin Transformer V2 model)
- SwitchTransformersConfig configuration class: SwitchTransformersModel (SwitchTransformers model)
- T5Config configuration class: T5Model (T5 model)
- T5GemmaConfig configuration class: T5GemmaModel (T5Gemma model)
- TableTransformerConfig configuration class: TableTransformerModel (Table Transformer model)
- TapasConfig configuration class: TapasModel (TAPAS model)
- TextNetConfig configuration class: TextNetModel (TextNet model)
- TimeSeriesTransformerConfig configuration class: TimeSeriesTransformerModel (Time Series Transformer model)
- TimesFmConfig configuration class: TimesFmModel (TimesFm model)
- TimesformerConfig configuration class: TimesformerModel (TimeSformer model)
- TimmBackboneConfig configuration class: TimmBackbone (TimmBackbone model)
- TimmWrapperConfig configuration class: TimmWrapperModel (TimmWrapperModel model)
- TrajectoryTransformerConfig configuration class: TrajectoryTransformerModel (Trajectory Transformer model)
- TransfoXLConfig configuration class: TransfoXLModel (Transformer-XL model)
- TvltConfig configuration class: TvltModel (TVLT model)
- TvpConfig configuration class: TvpModel (TVP model)
- UMT5Config configuration class: UMT5Model (UMT5 model)
- UdopConfig configuration class: UdopModel (UDOP model)
- UniSpeechConfig configuration class: UniSpeechModel (UniSpeech model)
- UniSpeechSatConfig configuration class: UniSpeechSatModel (UniSpeechSat model)
- UnivNetConfig configuration class: UnivNetModel (UnivNet model)
- VJEPA2Config configuration class: VJEPA2Model (VJEPA2Model model)
- VanConfig configuration class: VanModel (VAN model)
- ViTConfig configuration class: ViTModel (ViT model)
- ViTHybridConfig configuration class: ViTHybridModel (ViT Hybrid model)
- ViTMAEConfig configuration class: ViTMAEModel (ViTMAE model)
- ViTMSNConfig configuration class: ViTMSNModel (ViTMSN model)
- VideoLlavaConfig configuration class: VideoLlavaModel (VideoLlava model)
- VideoMAEConfig configuration class: VideoMAEModel (VideoMAE model)
- ViltConfig configuration class: ViltModel (ViLT model)
- VipLlavaConfig configuration class: VipLlavaModel (VipLlava model)
- VisionTextDualEncoderConfig configuration class: VisionTextDualEncoderModel (VisionTextDualEncoder model)
- VisualBertConfig configuration class: VisualBertModel (VisualBERT model)
- VitDetConfig configuration class: VitDetModel (VitDet model)
- VitsConfig configuration class: VitsModel (VITS model)
- VivitConfig configuration class: VivitModel (ViViT model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertModel (Wav2Vec2-BERT model)
- Wav2Vec2Config configuration class: Wav2Vec2Model (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerModel (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMModel (WavLM model)
- WhisperConfig configuration class: WhisperModel (Whisper model)
- XCLIPConfig configuration class: XCLIPModel (X-CLIP model)
- XGLMConfig configuration class: XGLMModel (XGLM model)
- XLMConfig configuration class: XLMModel (XLM model)
- XLMProphetNetConfig configuration class: XLMProphetNetModel (XLM-ProphetNet model)
- XLMRobertaConfig configuration class: XLMRobertaModel (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLModel (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetModel (XLNet model)
- XmodConfig configuration class: XmodModel (X-MOD model)
- YolosConfig configuration class: YolosModel (YOLOS model)
- YosoConfig configuration class: YosoModel (YOSO model)
- Zamba2Config configuration class: Zamba2Model (Zamba2 model)
- ZambaConfig configuration class: ZambaModel (Zamba model)
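attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.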
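
The default rule stated for attn_implementation (SDPA when available on torch>=2.1.1, otherwise "eager") can be written as a small decision function. This is only a sketch of the documented rule: parse_version and default_attn_implementation are hypothetical helpers, the torch version is passed in as a string so the sketch needs no torch install, and how the library itself detects SDPA availability is not shown here.

```python
from itertools import takewhile


def parse_version(text):
    """Turn a version string such as '2.3.0+cu121' into a tuple like (2, 3, 0)."""
    parts = []
    for piece in text.split("+")[0].split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def default_attn_implementation(torch_version, sdpa_available=True):
    """Return 'sdpa' when it is available on torch>=2.1.1, else 'eager'."""
    if sdpa_available and parse_version(torch_version) >= (2, 1, 1):
        return "sdpa"
    return "eager"


choice_new = default_attn_implementation("2.3.0+cu121")
choice_old = default_attn_implementation("2.1.0")
```

Passing attn_implementation explicitly to from_config() bypasses this default.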

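Instantiates one of the base model classes of the library from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example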

>>> from transformers import AutoConfig, AutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModel.from_config(config)

from_pretrained

( *model_args **kwargs )

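Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint into a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force (re-)downloading the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.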
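
The two kwargs cases above can be sketched with toy classes. This is an illustration of the described routing only, not the library's loading code: ToyConfig, ToyModel and toy_from_pretrained are hypothetical names, and no weights or files are involved.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, hidden_size=8, output_attentions=False):
        self.hidden_size = hidden_size
        self.output_attentions = output_attentions


class ToyModel:
    """Hypothetical model that records what reached its __init__."""

    def __init__(self, config, **init_kwargs):
        self.config = config
        self.init_kwargs = init_kwargs


def toy_from_pretrained(config=None, **kwargs):
    if config is not None:
        # Case 1: config supplied, so every kwarg goes to the model's __init__.
        return ToyModel(config, **kwargs)
    # Case 2: config is loaded (here: created with defaults), then kwargs that
    # name a configuration attribute override it; the rest go to the model.
    config = ToyConfig()
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            model_kwargs[key] = value
    return ToyModel(config, **model_kwargs)


auto_cfg = toy_from_pretrained(output_attentions=True, extra=1)
given_cfg = toy_from_pretrained(config=ToyConfig(), output_attentions=True)
```

In the first call output_attentions updates the configuration and extra reaches the model; in the second call the supplied configuration is left untouched and output_attentions is forwarded to the model's __init__ instead.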

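Instantiate one of the base model classes of the library from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it is missing, by falling back to pattern matching on pretrained_model_name_or_path: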

albert — AlbertModel (ALBERT 模型)
align — AlignModel (ALIGN 模型)
altclip — AltCLIPModel (AltCLIP 模型)
arcee — ArceeModel (Arcee 模型)
aria — AriaModel (Aria 模型)
aria_text — AriaTextModel (AriaText 模型)
audio-spectrogram-transformer — ASTModel (Audio Spectrogram Transformer 模型)
autoformer — AutoformerModel (Autoformer model)
aya_vision — AyaVisionModel (AyaVision model)
bamba — BambaModel (Bamba model)
bark — BarkModel (Bark model)
bart — BartModel (BART model)
beit — BeitModel (BEiT model)
bert — BertModel (BERT model)
bert-generation — BertGenerationEncoder (Bert Generation model)
big_bird — BigBirdModel (BigBird model)
bigbird_pegasus — BigBirdPegasusModel (BigBird-Pegasus model)
biogpt — BioGptModel (BioGpt model)
bit — BitModel (BiT model)
bitnet — BitNetModel (BitNet model)
blenderbot — BlenderbotModel (Blenderbot model)
blenderbot-small — BlenderbotSmallModel (BlenderbotSmall model)
blip — BlipModel (BLIP model)
blip-2 — Blip2Model (BLIP-2 model)
blip_2_qformer — Blip2QFormerModel (BLIP-2 QFormer model)
bloom — BloomModel (BLOOM model)
bridgetower — BridgeTowerModel (BridgeTower model)
bros — BrosModel (BROS model)
camembert — CamembertModel (CamemBERT model)
canine — CanineModel (CANINE model)
chameleon — ChameleonModel (Chameleon model)
chinese_clip — ChineseCLIPModel (Chinese-CLIP model)
chinese_clip_vision_model — ChineseCLIPVisionModel (ChineseCLIPVisionModel model)
clap — ClapModel (CLAP model)
clip — CLIPModel (CLIP model)
clip_text_model — CLIPTextModel (CLIPTextModel model)
clip_vision_model — CLIPVisionModel (CLIPVisionModel model)
clipseg — CLIPSegModel (CLIPSeg model)
clvp — ClvpModelForConditionalGeneration (CLVP model)
code_llama — LlamaModel (CodeLlama model)
codegen — CodeGenModel (CodeGen model)
cohere — CohereModel (Cohere model)
cohere2 — Cohere2Model (Cohere2 model)
conditional_detr — ConditionalDetrModel (Conditional DETR model)
convbert — ConvBertModel (ConvBERT model)
convnext — ConvNextModel (ConvNeXT model)
convnextv2 — ConvNextV2Model (ConvNeXTV2 model)
cpmant — CpmAntModel (CPM-Ant model)
csm — CsmForConditionalGeneration (CSM model)
ctrl — CTRLModel (CTRL model)
cvt — CvtModel (CvT model)
d_fine — DFineModel (D-FINE model)
dab-detr — DabDetrModel (DAB-DETR model)
dac — DacModel (DAC model)
data2vec-audio — Data2VecAudioModel (Data2VecAudio model)
data2vec-text — Data2VecTextModel (Data2VecText model)
data2vec-vision — Data2VecVisionModel (Data2VecVision model)
dbrx — DbrxModel (DBRX model)
deberta — DebertaModel (DeBERTa model)
deberta-v2 — DebertaV2Model (DeBERTa-v2 model)
decision_transformer — DecisionTransformerModel (Decision Transformer model)
deepseek_v3 — DeepseekV3Model (DeepSeek-V3 model)
deformable_detr — DeformableDetrModel (Deformable DETR model)
deit — DeiTModel (DeiT model)
depth_pro — DepthProModel (DepthPro model)
deta — DetaModel (DETA model)
detr — DetrModel (DETR model)
dia — DiaModel (Dia model)
diffllama — DiffLlamaModel (DiffLlama model)
dinat — DinatModel (DiNAT model)
dinov2 — Dinov2Model (DINOv2 model)
dinov2_with_registers — Dinov2WithRegistersModel (DINOv2 with Registers model)
distilbert — DistilBertModel (DistilBERT model)
donut-swin — DonutSwinModel (DonutSwin model)
dots1 — Dots1Model (dots1 model)
dpr — DPRQuestionEncoder (DPR model)
dpt — DPTModel (DPT model)
efficientformer — EfficientFormerModel (EfficientFormer model)
efficientnet — EfficientNetModel (EfficientNet model)
electra — ElectraModel (ELECTRA model)
emu3 — Emu3Model (Emu3 model)
encodec — EncodecModel (EnCodec model)
ernie — ErnieModel (ERNIE model)
ernie_m — ErnieMModel (ErnieM model)
esm — EsmModel (ESM model)
falcon — FalconModel (Falcon model)
falcon_h1 — FalconH1Model (FalconH1 model)
falcon_mamba — FalconMambaModel (FalconMamba model)
fastspeech2_conformer — FastSpeech2ConformerModel (FastSpeech2Conformer model)
flaubert — FlaubertModel (FlauBERT model)
flava — FlavaModel (FLAVA model)
fnet — FNetModel (FNet model)
focalnet — FocalNetModel (FocalNet model)
fsmt — FSMTModel (FairSeq Machine-Translation model)
funnel — FunnelModel or FunnelBaseModel (Funnel Transformer model)
fuyu — FuyuModel (Fuyu model)
gemma — GemmaModel (Gemma model)
gemma2 — Gemma2Model (Gemma2 model)
gemma3 — Gemma3Model (Gemma3ForConditionalGeneration model)
gemma3_text — Gemma3TextModel (Gemma3ForCausalLM model)
gemma3n — Gemma3nModel (Gemma3nForConditionalGeneration model)
gemma3n_audio — Gemma3nAudioEncoder (Gemma3nAudioEncoder model)
gemma3n_text — Gemma3nTextModel (Gemma3nForCausalLM model)
gemma3n_vision — TimmWrapperModel (TimmWrapperModel model)
git — GitModel (GIT model)
glm — GlmModel (GLM model)
glm4 — Glm4Model (GLM4 model)
glm4v — Glm4vModel (GLM4V model)
glm4v_text — Glm4vTextModel (GLM4V model)
glpn — GLPNModel (GLPN model)
got_ocr2 — GotOcr2Model (GOT-OCR2 model)
gpt-sw3 — GPT2Model (GPT-Sw3 model)
gpt2 — GPT2Model (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeModel (GPTBigCode model)
gpt_neo — GPTNeoModel (GPT Neo model)
gpt_neox — GPTNeoXModel (GPT NeoX model)
gpt_neox_japanese — GPTNeoXJapaneseModel (GPT NeoX Japanese model)
gptj — GPTJModel (GPT-J model)
gptsan-japanese — GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
granite — GraniteModel (Granite model)
granitemoe — GraniteMoeModel (GraniteMoeMoe model)
granitemoehybrid — GraniteMoeHybridModel (GraniteMoeHybrid model)
granitemoeshared — GraniteMoeSharedModel (GraniteMoeSharedMoe model)
graphormer — GraphormerModel (Graphormer model)
grounding-dino — GroundingDinoModel (Grounding DINO model)
groupvit — GroupViTModel (GroupViT model)
helium — HeliumModel (Helium model)
hgnet_v2 — HGNetV2Backbone (HGNet-V2 model)
hiera — HieraModel (Hiera model)
hubert — HubertModel (Hubert model)
ibert — IBertModel (I-BERT model)
idefics — IdeficsModel (IDEFICS model)
idefics2 — Idefics2Model (Idefics2 model)
idefics3 — Idefics3Model (Idefics3 model)
idefics3_vision — Idefics3VisionTransformer (Idefics3VisionTransformer model)
ijepa — IJepaModel (I-JEPA model)
imagegpt — ImageGPTModel (ImageGPT model)
informer — InformerModel (Informer model)
instructblip — InstructBlipModel (InstructBLIP model)
instructblipvideo — InstructBlipVideoModel (InstructBlipVideo model)
internvl — InternVLModel (InternVL model)
internvl_vision — InternVLVisionModel (InternVLVision model)
jamba — JambaModel (Jamba model)
janus — JanusModel (Janus model)
jetmoe — JetMoeModel (JetMoe model)
jukebox — JukeboxModel (Jukebox model)
kosmos-2 — Kosmos2Model (KOSMOS-2 model)
kyutai_speech_to_text — KyutaiSpeechToTextModel (KyutaiSpeechToText model)
layoutlm — LayoutLMModel (LayoutLM model)
layoutlmv2 — LayoutLMv2Model (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3Model (LayoutLMv3 model)
led — LEDModel (LED model)
levit — LevitModel (LeViT model)
lightglue — LightGlueForKeypointMatching (LightGlue model)
lilt — LiltModel (LiLT model)
llama — LlamaModel (LLaMA model)
llama4 — Llama4ForConditionalGeneration (Llama4 model)
llama4_text — Llama4TextModel (Llama4ForCausalLM model)
llava — LlavaModel (LLaVa model)
llava_next — LlavaNextModel (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoModel (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionModel (LLaVA-Onevision model)
longformer — LongformerModel (Longformer model)
longt5 — LongT5Model (LongT5 model)
luke — LukeModel (LUKE model)
lxmert — LxmertModel (LXMERT model)
m2m_100 — M2M100Model (M2M100 model)
mamba — MambaModel (Mamba model)
mamba2 — Mamba2Model (mamba2 model)
marian — MarianModel (Marian model)
markuplm — MarkupLMModel (MarkupLM model)
mask2former — Mask2FormerModel (Mask2Former model)
maskformer — MaskFormerModel (MaskFormer model)
maskformer-swin — MaskFormerSwinModel (MaskFormerSwin model)
mbart — MBartModel (mBART model)
mctct — MCTCTModel (M-CTC-T model)
mega — MegaModel (MEGA model)
megatron-bert — MegatronBertModel (Megatron-BERT model)
mgp-str — MgpstrForSceneTextRecognition (MGP-STR model)
mimi — MimiModel (Mimi model)
minimax — MiniMaxModel (MiniMax model)
mistral — MistralModel (Mistral model)
mistral3 — Mistral3Model (Mistral3 model)
mixtral — MixtralModel (Mixtral model)
mlcd — MLCDVisionModel (MLCD model)
mllama — MllamaModel (Mllama model)
mobilebert — MobileBertModel (MobileBERT model)
mobilenet_v1 — MobileNetV1Model (MobileNetV1 model)
mobilenet_v2 — MobileNetV2Model (MobileNetV2 model)
mobilevit — MobileViTModel (MobileViT model)
mobilevitv2 — MobileViTV2Model (MobileViTV2 model)
modernbert — ModernBertModel (ModernBERT model)
moonshine — MoonshineModel (Moonshine model)
moshi — MoshiModel (Moshi model)
mpnet — MPNetModel (MPNet model)
mpt — MptModel (MPT model)
mra — MraModel (MRA model)
mt5 — MT5Model (MT5 model)
musicgen — MusicgenModel (MusicGen model)
musicgen_melody — MusicgenMelodyModel (MusicGen Melody model)
mvp — MvpModel (MVP model)
nat — NatModel (NAT model)
nemotron — NemotronModel (Nemotron model)
nezha — NezhaModel (Nezha model)
nllb-moe — NllbMoeModel (NLLB-MOE model)
nystromformer — NystromformerModel (Nyströmformer model)
olmo — OlmoModel (OLMo model)
olmo2 — Olmo2Model (OLMo2 model)
olmoe — OlmoeModel (OLMoE model)
omdet-turbo — OmDetTurboForObjectDetection (OmDet-Turbo model)
oneformer — OneFormerModel (OneFormer model)
open-llama — OpenLlamaModel (OpenLlama model)
openai-gpt — OpenAIGPTModel (OpenAI GPT model)
opt — OPTModel (OPT model)
owlv2 — Owlv2Model (OWLv2 model)
owlvit — OwlViTModel (OWL-ViT model)
paligemma — PaliGemmaModel (PaliGemma model)
patchtsmixer — PatchTSMixerModel (PatchTSMixer model)
patchtst — PatchTSTModel (PatchTST model)
pegasus — PegasusModel (Pegasus model)
pegasus_x — PegasusXModel (PEGASUS-X model)
perceiver — PerceiverModel (Perceiver model)
persimmon — PersimmonModel (Persimmon model)
phi — PhiModel (Phi model)
phi3 — Phi3Model (Phi3 model)
phi4_multimodal — Phi4MultimodalModel (Phi4Multimodal model)
phimoe — PhimoeModel (Phimoe model)
pixtral — PixtralVisionModel (Pixtral model)
plbart — PLBartModel (PLBart model)
poolformer — PoolFormerModel (PoolFormer model)
prophetnet — ProphetNetModel (ProphetNet model)
pvt — PvtModel (PVT model)
pvt_v2 — PvtV2Model (PVTv2 model)
qdqbert — QDQBertModel (QDQBert model)
qwen2 — Qwen2Model (Qwen2 model)
qwen2_5_vl — Qwen2_5_VLModel (Qwen2_5_VL model)
qwen2_5_vl_text — Qwen2_5_VLTextModel (Qwen2_5_VL model)
qwen2_audio_encoder — Qwen2AudioEncoder (Qwen2AudioEncoder model)
qwen2_moe — Qwen2MoeModel (Qwen2MoE model)
qwen2_vl — Qwen2VLModel (Qwen2VL model)
qwen2_vl_text — Qwen2VLTextModel (Qwen2VL model)
qwen3 — Qwen3Model (Qwen3 model)
qwen3_moe — Qwen3MoeModel (Qwen3MoE model)
recurrent_gemma — RecurrentGemmaModel (RecurrentGemma model)
reformer — ReformerModel (Reformer model)
regnet — RegNetModel (RegNet model)
rembert — RemBertModel (RemBERT model)
resnet — ResNetModel (ResNet model)
retribert — RetriBertModel (RetriBERT model)
roberta — RobertaModel (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertModel (RoCBert model)
roformer — RoFormerModel (RoFormer model)
rt_detr — RTDetrModel (RT-DETR model)
rt_detr_v2 — RTDetrV2Model (RT-DETRv2 model)
rwkv — RwkvModel (RWKV model)
sam — SamModel (SAM model)
sam_hq — SamHQModel (SAM-HQ model)
sam_hq_vision_model — SamHQVisionModel (SamHQVisionModel model)
sam_vision_model — SamVisionModel (SamVisionModel model)
seamless_m4t — SeamlessM4TModel (SeamlessM4T model)
seamless_m4t_v2 — SeamlessM4Tv2Model (SeamlessM4Tv2 model)
segformer — SegformerModel (SegFormer model)
seggpt — SegGptModel (SegGPT model)
sew — SEWModel (SEW model)
sew-d — SEWDModel (SEW-D model)
siglip — SiglipModel (SigLIP model)
siglip2 — Siglip2Model (SigLIP2 model)
siglip_vision_model — SiglipVisionModel (SiglipVisionModel model)
smollm3 — SmolLM3Model (SmolLM3 model)
smolvlm — SmolVLMModel (SmolVLM model)
smolvlm_vision — SmolVLMVisionTransformer (SmolVLMVisionTransformer model)
speech_to_text — Speech2TextModel (Speech2Text model)
speecht5 — SpeechT5Model (SpeechT5 model)
splinter — SplinterModel (Splinter model)
squeezebert — SqueezeBertModel (SqueezeBERT model)
stablelm — StableLmModel (StableLm model)
starcoder2 — Starcoder2Model (Starcoder2 model)
superglue — SuperGlueForKeypointMatching (SuperGlue model)
swiftformer — SwiftFormerModel (SwiftFormer model)
swin — SwinModel (Swin Transformer model)
swin2sr — Swin2SRModel (Swin2SR model)
swinv2 — Swinv2Model (Swin Transformer V2 model)
switch_transformers — SwitchTransformersModel (SwitchTransformers model)
t5 — T5Model (T5 model)
t5gemma — T5GemmaModel (T5Gemma model)
table-transformer — TableTransformerModel (Table Transformer model)
tapas — TapasModel (TAPAS model)
textnet — TextNetModel (TextNet model)
time_series_transformer — TimeSeriesTransformerModel (Time Series Transformer model)
timesfm — TimesFmModel (TimesFm model)
timesformer — TimesformerModel (TimeSformer model)
timm_backbone — TimmBackbone (TimmBackbone model)
timm_wrapper — TimmWrapperModel (TimmWrapperModel model)
trajectory_transformer — TrajectoryTransformerModel (Trajectory Transformer model)
transfo-xl — TransfoXLModel (Transformer-XL model)
tvlt — TvltModel (TVLT model)
tvp — TvpModel (TVP model)
udop — UdopModel (UDOP model)
umt5 — UMT5Model (UMT5 model)
unispeech — UniSpeechModel (UniSpeech model)
unispeech-sat — UniSpeechSatModel (UniSpeechSat model)
univnet — UnivNetModel (UnivNet model)
van — VanModel (VAN model)
video_llava — VideoLlavaModel (VideoLlava model)
videomae — VideoMAEModel (VideoMAE model)
vilt — ViltModel (ViLT model)
vipllava — VipLlavaModel (VipLlava model)
vision-text-dual-encoder — VisionTextDualEncoderModel (VisionTextDualEncoder model)
visual_bert — VisualBertModel (VisualBERT model)
vit — ViTModel (ViT model)
vit_hybrid — ViTHybridModel (ViT Hybrid model)
vit_mae — ViTMAEModel (ViTMAE model)
vit_msn — ViTMSNModel (ViTMSN model)
vitdet — VitDetModel (VitDet model)
vits — VitsModel (VITS model)
vivit — VivitModel (ViViT model)
vjepa2 — VJEPA2Model (VJEPA2Model model)
wav2vec2 — Wav2Vec2Model (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2BertModel (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2ConformerModel (Wav2Vec2-Conformer model)
wavlm — WavLMModel (WavLM model)
whisper — WhisperModel (Whisper model)
xclip — XCLIPModel (X-CLIP model)
xglm — XGLMModel (XGLM model)
xlm — XLMModel (XLM model)
xlm-roberta — XLMRobertaModel (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLModel (XLM-RoBERTa-XL model)
xlnet — XLNetModel (XLNet model)
xmod — XmodModel (X-MOD model)
yolos — YolosModel (YOLOS model)
yoso — YosoModel (YOSO model)
zamba — ZambaModel (Zamba model)
zamba2 — Zamba2Model (Zamba2 model)
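
The list above can be read as a lookup table from a `model_type` string to a model class. The following is a minimal sketch of that lookup, not the library's implementation. It assumes the selection works as described for the other auto classes on this page: the `model_type` of the configuration is used first, with a fallback to pattern matching on the name or path. The table `MODEL_TYPE_TO_CLASS_NAME` and the function `resolve_class_name` are hypothetical names introduced for illustration, and only a few entries of the list are copied.

```python
# Hypothetical lookup table holding a few entries copied from the list above.
MODEL_TYPE_TO_CLASS_NAME = {
    "bert": "BertModel",
    "roberta": "RobertaModel",
    "gpt2": "GPT2Model",
    "gpt-sw3": "GPT2Model",  # two model types may map to the same class
    "t5": "T5Model",
}


def resolve_class_name(model_type=None, name_or_path=""):
    """Return a class name for a model type, or fall back to name matching."""
    if model_type is not None:
        if model_type in MODEL_TYPE_TO_CLASS_NAME:
            return MODEL_TYPE_TO_CLASS_NAME[model_type]
        raise ValueError(f"Unrecognized model type: {model_type!r}")
    # Fallback (assumed behavior): look for a known key inside the name.
    # Longer keys are tried first so "roberta" wins over its substring "bert".
    lowered = name_or_path.lower()
    for key in sorted(MODEL_TYPE_TO_CLASS_NAME, key=len, reverse=True):
        if key in lowered:
            return MODEL_TYPE_TO_CLASS_NAME[key]
    raise ValueError(f"Could not infer a model class from {name_or_path!r}")
```

Trying longer keys first is one simple way to keep overlapping names apart; the real library may resolve such overlaps differently.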

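By default, the model is put in evaluation mode with model.eval() (for example, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().

Examples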

>>> from transformers import AutoConfig, AutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModel.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModel

class transformers.TFAutoModel

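( *args **kwargs )

This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).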
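
The pattern described above (a class that refuses direct construction and only hands out instances through class methods) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the library's code: `TinyConfig`, `TinyModel`, and `SketchAutoModel` are hypothetical names, and the choice of `OSError` as the raised exception is an assumption.

```python
class TinyConfig:
    """Hypothetical configuration class."""

    def __init__(self, hidden_size=8):
        self.hidden_size = hidden_size


class TinyModel:
    """Hypothetical concrete model built from a TinyConfig."""

    def __init__(self, config):
        self.config = config


class SketchAutoModel:
    """Hypothetical factory: objects are created only through class methods."""

    # Maps a configuration class to the concrete model class it selects.
    _registry = {TinyConfig: TinyModel}

    def __init__(self, *args, **kwargs):
        # Direct instantiation is rejected; callers must use from_config().
        raise OSError(
            "SketchAutoModel is meant to be created with "
            "SketchAutoModel.from_config(config)."
        )

    @classmethod
    def from_config(cls, config):
        # Pick the concrete class from the type of the configuration object.
        model_class = cls._registry.get(type(config))
        if model_class is None:
            raise ValueError(f"Unsupported config class: {type(config).__name__}")
        return model_class(config)
```

Note that the object returned by `from_config` is an instance of the concrete class (`TinyModel` here), never of the factory itself.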

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertModel (ALBERT model)
- BartConfig configuration class: TFBartModel (BART model)
- BertConfig configuration class: TFBertModel (BERT model)
- BlenderbotConfig configuration class: TFBlenderbotModel (Blenderbot model)
- BlenderbotSmallConfig configuration class: TFBlenderbotSmallModel (BlenderbotSmall model)
- BlipConfig configuration class: TFBlipModel (BLIP model)
- CLIPConfig configuration class: TFCLIPModel (CLIP model)
- CTRLConfig configuration class: TFCTRLModel (CTRL model)
- CamembertConfig configuration class: TFCamembertModel (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertModel (ConvBERT model)
- ConvNextConfig configuration class: TFConvNextModel (ConvNeXT model)
- ConvNextV2Config configuration class: TFConvNextV2Model (ConvNeXTV2 model)
- CvtConfig configuration class: TFCvtModel (CvT model)
- DPRConfig configuration class: TFDPRQuestionEncoder (DPR model)
- Data2VecVisionConfig configuration class: TFData2VecVisionModel (Data2VecVision model)
- DebertaConfig configuration class: TFDebertaModel (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2Model (DeBERTa-v2 model)
- DeiTConfig configuration class: TFDeiTModel (DeiT model)
- DistilBertConfig configuration class: TFDistilBertModel (DistilBERT model)
- EfficientFormerConfig configuration class: TFEfficientFormerModel (EfficientFormer model)
- ElectraConfig configuration class: TFElectraModel (ELECTRA model)
- EsmConfig configuration class: TFEsmModel (ESM model)
- FlaubertConfig configuration class: TFFlaubertModel (FlauBERT model)
- FunnelConfig configuration class: TFFunnelModel or TFFunnelBaseModel (Funnel Transformer model)
- GPT2Config configuration class: TFGPT2Model (OpenAI GPT-2 model)
- GPTJConfig configuration class: TFGPTJModel (GPT-J model)
- GroupViTConfig configuration class: TFGroupViTModel (GroupViT model)
- HubertConfig configuration class: TFHubertModel (Hubert model)
- IdeficsConfig configuration class: TFIdeficsModel (IDEFICS model)
- LEDConfig configuration class: TFLEDModel (LED model)
- LayoutLMConfig configuration class: TFLayoutLMModel (LayoutLM model)
- LayoutLMv3Config configuration class: TFLayoutLMv3Model (LayoutLMv3 model)
- LongformerConfig configuration class: TFLongformerModel (Longformer model)
- LxmertConfig configuration class: TFLxmertModel (LXMERT model)
- MBartConfig configuration class: TFMBartModel (mBART model)
- MPNetConfig configuration class: TFMPNetModel (MPNet model)
- MT5Config configuration class: TFMT5Model (MT5 model)
- MarianConfig configuration class: TFMarianModel (Marian model)
- MistralConfig configuration class: TFMistralModel (Mistral model)
- MobileBertConfig configuration class: TFMobileBertModel (MobileBERT model)
- MobileViTConfig configuration class: TFMobileViTModel (MobileViT model)
- OPTConfig configuration class: TFOPTModel (OPT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTModel (OpenAI GPT model)
- PegasusConfig configuration class: TFPegasusModel (Pegasus model)
- RegNetConfig configuration class: TFRegNetModel (RegNet model)
- RemBertConfig configuration class: TFRemBertModel (RemBERT model)
- ResNetConfig configuration class: TFResNetModel (ResNet model)
- RoFormerConfig configuration class: TFRoFormerModel (RoFormer model)
- RobertaConfig configuration class: TFRobertaModel (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
- SamConfig configuration class: TFSamModel (SAM model)
- SamVisionConfig configuration class: TFSamVisionModel (SamVisionModel model)
- SegformerConfig configuration class: TFSegformerModel (SegFormer model)
- Speech2TextConfig configuration class: TFSpeech2TextModel (Speech2Text model)
- SwiftFormerConfig configuration class: TFSwiftFormerModel (SwiftFormer model)
- SwinConfig configuration class: TFSwinModel (Swin Transformer model)
- T5Config configuration class: TFT5Model (T5 model)
- TapasConfig configuration class: TFTapasModel (TAPAS model)
- TransfoXLConfig configuration class: TFTransfoXLModel (Transformer-XL model)
- ViTConfig configuration class: TFViTModel (ViT model)
- ViTMAEConfig configuration class: TFViTMAEModel (ViTMAE model)
- VisionTextDualEncoderConfig configuration class: TFVisionTextDualEncoderModel (VisionTextDualEncoder model)
- Wav2Vec2Config configuration class: TFWav2Vec2Model (Wav2Vec2 model)
- WhisperConfig configuration class: TFWhisperModel (Whisper model)
- XGLMConfig configuration class: TFXGLMModel (XGLM model)
- XLMConfig configuration class: TFXLMModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaModel (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetModel (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

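Instantiates one of the base model classes of the library from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples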

>>> from transformers import AutoConfig, TFAutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModel.from_config(config)

from_pretrained

( *model_args **kwargs )

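Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.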
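
The two kwargs cases above can be sketched in a few lines of plain Python. This is a simplified illustration of the routing rule only, not the library's implementation: `SketchConfig`, `split_kwargs`, and `sketch_load` are hypothetical names, and the real loading code handles many more details.

```python
class SketchConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, output_attentions=False, hidden_size=8):
        self.output_attentions = output_attentions
        self.hidden_size = hidden_size


def split_kwargs(config, kwargs):
    """Override matching config attributes and return the leftover kwargs."""
    leftover = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # key names a configuration attribute
        else:
            leftover[key] = value  # key is forwarded to the model's __init__
    return leftover


def sketch_load(config=None, **kwargs):
    """Return (config, model_init_kwargs) following the two cases above."""
    if config is not None:
        # Case 1: a config was supplied, so every kwarg goes to the model.
        return config, dict(kwargs)
    # Case 2: the config is loaded automatically (a default one here),
    # then updated from the kwargs that match its attributes.
    config = SketchConfig()
    return config, split_kwargs(config, kwargs)
```

For example, `sketch_load(output_attentions=True, foo=1)` updates the configuration attribute and leaves only `foo` for the model, while passing an explicit `config` leaves that configuration untouched.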

Instantiate one of the base model classes of the library from a pretrained model.

albert — TFAlbertModel (ALBERT model)
bart — TFBartModel (BART model)
bert — TFBertModel (BERT model)
blenderbot — TFBlenderbotModel (Blenderbot model)
blenderbot-small — TFBlenderbotSmallModel (BlenderbotSmall model)
blip — TFBlipModel (BLIP model)
camembert — TFCamembertModel (CamemBERT model)
clip — TFCLIPModel (CLIP model)
convbert — TFConvBertModel (ConvBERT model)
convnext — TFConvNextModel (ConvNeXT model)
convnextv2 — TFConvNextV2Model (ConvNeXTV2 model)
ctrl — TFCTRLModel (CTRL model)
cvt — TFCvtModel (CvT model)
data2vec-vision — TFData2VecVisionModel (Data2VecVision model)
deberta — TFDebertaModel (DeBERTa model)
deberta-v2 — TFDebertaV2Model (DeBERTa-v2 model)
deit — TFDeiTModel (DeiT model)
distilbert — TFDistilBertModel (DistilBERT model)
dpr — TFDPRQuestionEncoder (DPR model)
efficientformer — TFEfficientFormerModel (EfficientFormer model)
electra — TFElectraModel (ELECTRA model)
esm — TFEsmModel (ESM model)
flaubert — TFFlaubertModel (FlauBERT model)
funnel — TFFunnelModel or TFFunnelBaseModel (Funnel Transformer model)
gpt-sw3 — TFGPT2Model (GPT-Sw3 model)
gpt2 — TFGPT2Model (OpenAI GPT-2 model)
gptj — TFGPTJModel (GPT-J model)
groupvit — TFGroupViTModel (GroupViT model)
hubert — TFHubertModel (Hubert model)
idefics — TFIdeficsModel (IDEFICS model)
layoutlm — TFLayoutLMModel (LayoutLM model)
layoutlmv3 — TFLayoutLMv3Model (LayoutLMv3 model)
led — TFLEDModel (LED model)
longformer — TFLongformerModel (Longformer model)
lxmert — TFLxmertModel (LXMERT model)
marian — TFMarianModel (Marian model)
mbart — TFMBartModel (mBART model)
mistral — TFMistralModel (Mistral model)
mobilebert — TFMobileBertModel (MobileBERT model)
mobilevit — TFMobileViTModel (MobileViT model)
mpnet — TFMPNetModel (MPNet model)
mt5 — TFMT5Model (MT5 model)
openai-gpt — TFOpenAIGPTModel (OpenAI GPT model)
opt — TFOPTModel (OPT model)
pegasus — TFPegasusModel (Pegasus model)
regnet — TFRegNetModel (RegNet model)
rembert — TFRemBertModel (RemBERT model)
resnet — TFResNetModel (ResNet model)
roberta — TFRobertaModel (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerModel (RoFormer model)
sam — TFSamModel (SAM model)
sam_vision_model — TFSamVisionModel (SamVisionModel model)
segformer — TFSegformerModel (SegFormer model)
speech_to_text — TFSpeech2TextModel (Speech2Text model)
swiftformer — TFSwiftFormerModel (SwiftFormer model)
swin — TFSwinModel (Swin Transformer model)
t5 — TFT5Model (T5 model)
tapas — TFTapasModel (TAPAS model)
transfo-xl — TFTransfoXLModel (Transformer-XL model)
vision-text-dual-encoder — TFVisionTextDualEncoderModel (VisionTextDualEncoder model)
vit — TFViTModel (ViT model)
vit_mae — TFViTMAEModel (ViTMAE model)
wav2vec2 — TFWav2Vec2Model (Wav2Vec2 model)
whisper — TFWhisperModel (Whisper model)
xglm — TFXGLMModel (XGLM model)
xlm — TFXLMModel (XLM model)
xlm-roberta — TFXLMRobertaModel (XLM-RoBERTa model)
xlnet — TFXLNetModel (XLNet model)

Examples

>>> from transformers import AutoConfig, TFAutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModel.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModel

class transformers.FlaxAutoModel

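( *args **kwargs )

This is a generic model class that will be instantiated as one of the base model classes of the library when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).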

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertModel (ALBERT model)
- BartConfig configuration class: FlaxBartModel (BART model)
- BeitConfig configuration class: FlaxBeitModel (BEiT model)
- BertConfig configuration class: FlaxBertModel (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdModel (BigBird model)
- BlenderbotConfig configuration class: FlaxBlenderbotModel (Blenderbot model)
- BlenderbotSmallConfig configuration class: FlaxBlenderbotSmallModel (BlenderbotSmall model)
- BloomConfig configuration class: FlaxBloomModel (BLOOM model)
- CLIPConfig configuration class: FlaxCLIPModel (CLIP model)
- Dinov2Config configuration class: FlaxDinov2Model (DINOv2 model)
- DistilBertConfig configuration class: FlaxDistilBertModel (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraModel (ELECTRA model)
- GPT2Config configuration class: FlaxGPT2Model (OpenAI GPT-2 model)
- GPTJConfig configuration class: FlaxGPTJModel (GPT-J model)
- GPTNeoConfig configuration class: FlaxGPTNeoModel (GPT Neo model)
- GemmaConfig configuration class: FlaxGemmaModel (Gemma model)
- LlamaConfig configuration class: FlaxLlamaModel (LLaMA model)
- LongT5Config configuration class: FlaxLongT5Model (LongT5 model)
- MBartConfig configuration class: FlaxMBartModel (mBART model)
- MT5Config configuration class: FlaxMT5Model (MT5 model)
- MarianConfig configuration class: FlaxMarianModel (Marian model)
- MistralConfig configuration class: FlaxMistralModel (Mistral model)
- OPTConfig configuration class: FlaxOPTModel (OPT model)
- PegasusConfig configuration class: FlaxPegasusModel (Pegasus model)
- RegNetConfig configuration class: FlaxRegNetModel (RegNet model)
- ResNetConfig configuration class: FlaxResNetModel (ResNet model)
- RoFormerConfig configuration class: FlaxRoFormerModel (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaModel (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
- T5Config configuration class: FlaxT5Model (T5 model)
- ViTConfig configuration class: FlaxViTModel (ViT model)
- VisionTextDualEncoderConfig configuration class: FlaxVisionTextDualEncoderModel (VisionTextDualEncoder model)
- Wav2Vec2Config configuration class: FlaxWav2Vec2Model (Wav2Vec2 model)
- WhisperConfig configuration class: FlaxWhisperModel (Whisper model)
- XGLMConfig configuration class: FlaxXGLMModel (XGLM model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaModel (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

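Instantiates one of the base model classes of the library from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples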

>>> from transformers import AutoConfig, FlaxAutoModel

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModel.from_config(config)

from_pretrained

( *model_args **kwargs )

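Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since a git-based system is used for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.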
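
To make the output_loading_info parameter above more concrete, the sketch below shows one way "missing" and "unexpected" keys can be derived by comparing the parameter names a model expects with the names found in a checkpoint. It is an illustration only: the function `sketch_loading_info` and the dictionary keys `missing_keys`, `unexpected_keys`, and `error_msgs` are assumptions made for this example, and the example parameter names are invented.

```python
def sketch_loading_info(expected_keys, checkpoint_keys):
    """Compare expected parameter names with those found in a checkpoint."""
    expected = set(expected_keys)
    found = set(checkpoint_keys)
    return {
        # Parameters the model needs but the checkpoint does not provide.
        "missing_keys": sorted(expected - found),
        # Parameters present in the checkpoint that the model does not use.
        "unexpected_keys": sorted(found - expected),
        # This sketch never records errors; the list is always empty here.
        "error_msgs": [],
    }


# Hypothetical parameter names, chosen only for the demonstration.
info = sketch_loading_info(
    expected_keys=["encoder.weight", "encoder.bias", "pooler.weight"],
    checkpoint_keys=["encoder.weight", "encoder.bias", "lm_head.weight"],
)
```

A non-empty `missing_keys` list typically means those parameters keep their fresh initialization, which is worth checking before relying on the loaded model.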

Instantiate one of the base model classes of the library from a pretrained model.

albert — FlaxAlbertModel (ALBERT model)
bart — FlaxBartModel (BART model)
beit — FlaxBeitModel (BEiT model)
bert — FlaxBertModel (BERT model)
big_bird — FlaxBigBirdModel (BigBird model)
blenderbot — FlaxBlenderbotModel (Blenderbot model)
blenderbot-small — FlaxBlenderbotSmallModel (BlenderbotSmall model)
bloom — FlaxBloomModel (BLOOM model)
clip — FlaxCLIPModel (CLIP model)
dinov2 — FlaxDinov2Model (DINOv2 model)
distilbert — FlaxDistilBertModel (DistilBERT model)
electra — FlaxElectraModel (ELECTRA model)
gemma — FlaxGemmaModel (Gemma model)
gpt-sw3 — FlaxGPT2Model (GPT-Sw3 model)
gpt2 — FlaxGPT2Model (OpenAI GPT-2 model)
gpt_neo — FlaxGPTNeoModel (GPT Neo model)
gptj — FlaxGPTJModel (GPT-J model)
llama — FlaxLlamaModel (LLaMA model)
longt5 — FlaxLongT5Model (LongT5 model)
marian — FlaxMarianModel (Marian model)
mbart — FlaxMBartModel (mBART model)
mistral — FlaxMistralModel (Mistral model)
mt5 — FlaxMT5Model (MT5 model)
opt — FlaxOPTModel (OPT model)
pegasus — FlaxPegasusModel (Pegasus model)
regnet — FlaxRegNetModel (RegNet model)
resnet — FlaxResNetModel (ResNet model)
roberta — FlaxRobertaModel (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormModel (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerModel (RoFormer model)
t5 — FlaxT5Model (T5 model)
vision-text-dual-encoder — FlaxVisionTextDualEncoderModel (VisionTextDualEncoder model)
vit — FlaxViTModel (ViT model)
wav2vec2 — FlaxWav2Vec2Model (Wav2Vec2 model)
whisper — FlaxWhisperModel (Whisper model)
xglm — FlaxXGLMModel (XGLM model)
xlm-roberta — FlaxXLMRobertaModel (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModel

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModel.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModel.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

Generic pretraining classes

The following auto classes are available for instantiating a model with a pretraining head.

AutoModelForPreTraining

class transformers.AutoModelForPreTraining

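( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).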

from_config

（ **kwargs ）

参数

config (PretrainedConfig) — 待实例化的模型类是根据配置类选择的：
- AlbertConfig 配置类：AlbertForPreTraining (ALBERT 模型)
- BartConfig 配置类：BartForConditionalGeneration (BART 模型)
- BertConfig 配置类：BertForPreTraining (BERT 模型)
- BigBirdConfig 配置类：BigBirdForPreTraining (BigBird 模型)
- BloomConfig 配置类：BloomForCausalLM (BLOOM 模型)
- CTRLConfig 配置类：CTRLLMHeadModel (CTRL 模型)
- CamembertConfig 配置类：CamembertForMaskedLM (CamemBERT 模型)
- ColPaliConfig 配置类：ColPaliForRetrieval (ColPali 模型)
- ColQwen2Config 配置类：ColQwen2ForRetrieval (ColQwen2 模型)
- Data2VecTextConfig 配置类：Data2VecTextForMaskedLM (Data2VecText 模型)
- DebertaConfig 配置类：DebertaForMaskedLM (DeBERTa 模型)
- DebertaV2Config 配置类：DebertaV2ForMaskedLM (DeBERTa-v2 模型)
- DistilBertConfig 配置类：DistilBertForMaskedLM (DistilBERT 模型)
- ElectraConfig 配置类：ElectraForPreTraining (ELECTRA 模型)
- ErnieConfig 配置类：ErnieForPreTraining (ERNIE 模型)
- FNetConfig 配置类：FNetForPreTraining (FNet 模型)
- FSMTConfig 配置类：FSMTForConditionalGeneration (FairSeq 机器翻译模型)
- FalconMambaConfig 配置类：FalconMambaForCausalLM (FalconMamba 模型)
- FlaubertConfig 配置类：FlaubertWithLMHeadModel (FlauBERT 模型)
- FlavaConfig 配置类：FlavaForPreTraining (FLAVA 模型)
- FunnelConfig 配置类：FunnelForPreTraining (Funnel Transformer 模型)
- GPT2Config 配置类：GPT2LMHeadModel (OpenAI GPT-2 模型)
- GPTBigCodeConfig 配置类：GPTBigCodeForCausalLM (GPTBigCode 模型)
- GPTSanJapaneseConfig configuration class: GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
- HieraConfig configuration class: HieraForPreTraining (Hiera model)
- IBertConfig configuration class: IBertForMaskedLM (I-BERT model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2 model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3 model)
- IdeficsConfig configuration class: IdeficsForVisionText2Text (IDEFICS model)
- JanusConfig configuration class: JanusForConditionalGeneration (Janus model)
- LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LLaVa model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LLaVA-NeXT model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
- LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
- LukeConfig configuration class: LukeForMaskedLM (LUKE model)
- LxmertConfig configuration class: LxmertForPreTraining (LXMERT model)
- MPNetConfig configuration class: MPNetForMaskedLM (MPNet model)
- Mamba2Config configuration class: Mamba2ForCausalLM (mamba2 model)
- MambaConfig configuration class: MambaForCausalLM (Mamba model)
- MegaConfig configuration class: MegaForMaskedLM (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForPreTraining (Megatron-BERT model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3 model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (Mllama model)
- MobileBertConfig configuration class: MobileBertForPreTraining (MobileBERT model)
- MptConfig configuration class: MptForCausalLM (MPT model)
- MraConfig configuration class: MraForMaskedLM (MRA model)
- MvpConfig configuration class: MvpForConditionalGeneration (MVP model)
- NezhaConfig configuration class: NezhaForPreTraining (Nezha model)
- NllbMoeConfig configuration class: NllbMoeForConditionalGeneration (NLLB-MOE model)
- OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemma model)
- Qwen2AudioConfig configuration class: Qwen2AudioForConditionalGeneration (Qwen2Audio model)
- RetriBertConfig configuration class: RetriBertModel (RetriBERT model)
- RoCBertConfig configuration class: RoCBertForPreTraining (RoCBert model)
- RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- RwkvConfig configuration class: RwkvForCausalLM (RWKV model)
- SplinterConfig configuration class: SplinterForPreTraining (Splinter model)
- SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
- SwitchTransformersConfig configuration class: SwitchTransformersForConditionalGeneration (SwitchTransformers model)
- T5Config configuration class: T5ForConditionalGeneration (T5 model)
- T5GemmaConfig configuration class: T5GemmaForConditionalGeneration (T5Gemma model)
- TapasConfig configuration class: TapasForMaskedLM (TAPAS model)
- TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
- TvltConfig configuration class: TvltForPreTraining (TVLT model)
- UniSpeechConfig configuration class: UniSpeechForPreTraining (UniSpeech model)
- UniSpeechSatConfig configuration class: UniSpeechSatForPreTraining (UniSpeechSat model)
- ViTMAEConfig configuration class: ViTMAEForPreTraining (ViTMAE model)
- VideoLlavaConfig configuration class: VideoLlavaForConditionalGeneration (VideoLlava model)
- VideoMAEConfig configuration class: VideoMAEForPreTraining (VideoMAE model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlava model)
- VisualBertConfig configuration class: VisualBertForPreTraining (VisualBERT model)
- Wav2Vec2Config configuration class: Wav2Vec2ForPreTraining (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForPreTraining (Wav2Vec2-Conformer model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
- XmodConfig configuration class: XmodForMaskedLM (X-MOD model)
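attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples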

>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForPreTraining.from_config(config)

from_pretrained

( *model_args **kwargs )

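Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g. not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a Git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by Git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a Git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by Git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.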
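
The second kwargs case above (no config supplied) can be pictured with a small stand-alone sketch. It is a simplified illustration, not the library's implementation: `ToyConfig` and `split_kwargs` are hypothetical names introduced here, and the real loading path does considerably more work.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration class (not part of Transformers)."""

    def __init__(self, hidden_size=768, output_attentions=False):
        self.hidden_size = hidden_size
        self.output_attentions = output_attentions


def split_kwargs(config, kwargs):
    """Apply kwargs that name config attributes; return the rest for the model."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # The key matches a configuration attribute: override it.
            setattr(config, key, value)
        else:
            # Unknown to the config: would be forwarded to the model's __init__.
            model_kwargs[key] = value
    return config, model_kwargs


config, model_kwargs = split_kwargs(
    ToyConfig(), {"output_attentions": True, "custom_flag": 1}
)
print(config.output_attentions)  # True
print(model_kwargs)  # {'custom_flag': 1}
```

When a config object is passed explicitly, this splitting step is skipped and every keyword argument goes to the model, which is why configuration updates should be made on the config beforehand in that case.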

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

albert — AlbertForPreTraining (ALBERT model)
bart — BartForConditionalGeneration (BART model)
bert — BertForPreTraining (BERT model)
big_bird — BigBirdForPreTraining (BigBird model)
bloom — BloomForCausalLM (BLOOM model)
camembert — CamembertForMaskedLM (CamemBERT model)
colpali — ColPaliForRetrieval (ColPali model)
colqwen2 — ColQwen2ForRetrieval (ColQwen2 model)
ctrl — CTRLLMHeadModel (CTRL model)
data2vec-text — Data2VecTextForMaskedLM (Data2VecText model)
deberta — DebertaForMaskedLM (DeBERTa model)
deberta-v2 — DebertaV2ForMaskedLM (DeBERTa-v2 model)
distilbert — DistilBertForMaskedLM (DistilBERT model)
electra — ElectraForPreTraining (ELECTRA model)
ernie — ErnieForPreTraining (ERNIE model)
falcon_mamba — FalconMambaForCausalLM (FalconMamba model)
flaubert — FlaubertWithLMHeadModel (FlauBERT model)
flava — FlavaForPreTraining (FLAVA model)
fnet — FNetForPreTraining (FNet model)
fsmt — FSMTForConditionalGeneration (FairSeq Machine-Translation model)
funnel — FunnelForPreTraining (Funnel Transformer model)
gemma3 — Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
gpt-sw3 — GPT2LMHeadModel (GPT-Sw3 model)
gpt2 — GPT2LMHeadModel (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeForCausalLM (GPTBigCode model)
gptsan-japanese — GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
hiera — HieraForPreTraining (Hiera model)
ibert — IBertForMaskedLM (I-BERT model)
idefics — IdeficsForVisionText2Text (IDEFICS model)
idefics2 — Idefics2ForConditionalGeneration (Idefics2 model)
idefics3 — Idefics3ForConditionalGeneration (Idefics3 model)
janus — JanusForConditionalGeneration (Janus model)
layoutlm — LayoutLMForMaskedLM (LayoutLM model)
llava — LlavaForConditionalGeneration (LLaVa model)
llava_next — LlavaNextForConditionalGeneration (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
longformer — LongformerForMaskedLM (Longformer model)
luke — LukeForMaskedLM (LUKE model)
lxmert — LxmertForPreTraining (LXMERT model)
mamba — MambaForCausalLM (Mamba model)
mamba2 — Mamba2ForCausalLM (mamba2 model)
mega — MegaForMaskedLM (MEGA model)
megatron-bert — MegatronBertForPreTraining (Megatron-BERT model)
mistral3 — Mistral3ForConditionalGeneration (Mistral3 model)
mllama — MllamaForConditionalGeneration (Mllama model)
mobilebert — MobileBertForPreTraining (MobileBERT model)
mpnet — MPNetForMaskedLM (MPNet model)
mpt — MptForCausalLM (MPT model)
mra — MraForMaskedLM (MRA model)
mvp — MvpForConditionalGeneration (MVP model)
nezha — NezhaForPreTraining (Nezha model)
nllb-moe — NllbMoeForConditionalGeneration (NLLB-MOE model)
openai-gpt — OpenAIGPTLMHeadModel (OpenAI GPT model)
paligemma — PaliGemmaForConditionalGeneration (PaliGemma model)
qwen2_audio — Qwen2AudioForConditionalGeneration (Qwen2Audio model)
retribert — RetriBertModel (RetriBERT model)
roberta — RobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForPreTraining (RoCBert model)
rwkv — RwkvForCausalLM (RWKV model)
splinter — SplinterForPreTraining (Splinter model)
squeezebert — SqueezeBertForMaskedLM (SqueezeBERT model)
switch_transformers — SwitchTransformersForConditionalGeneration (SwitchTransformers model)
t5 — T5ForConditionalGeneration (T5 model)
t5gemma — T5GemmaForConditionalGeneration (T5Gemma model)
tapas — TapasForMaskedLM (TAPAS model)
transfo-xl — TransfoXLLMHeadModel (Transformer-XL model)
tvlt — TvltForPreTraining (TVLT model)
unispeech — UniSpeechForPreTraining (UniSpeech model)
unispeech-sat — UniSpeechSatForPreTraining (UniSpeechSat model)
video_llava — VideoLlavaForConditionalGeneration (VideoLlava model)
videomae — VideoMAEForPreTraining (VideoMAE model)
vipllava — VipLlavaForConditionalGeneration (VipLlava model)
visual_bert — VisualBertForPreTraining (VisualBERT model)
vit_mae — ViTMAEForPreTraining (ViTMAE model)
wav2vec2 — Wav2Vec2ForPreTraining (Wav2Vec2 model)
wav2vec2-conformer — Wav2Vec2ConformerForPreTraining (Wav2Vec2-Conformer model)
xlm — XLMWithLMHeadModel (XLM model)
xlm-roberta — XLMRobertaForMaskedLM (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
xlnet — XLNetLMHeadModel (XLNet model)
xmod — XmodForMaskedLM (X-MOD model)
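
The table above acts as a lookup from a model_type string to a class. The sketch below illustrates that idea with a few entries stored as plain strings so it runs without Transformers installed. `resolve_class_name` is a hypothetical helper, and the name-based fallback (longest key first) is an assumption modeled on the pattern-matching fallback this page describes for AutoTokenizer, not the library's actual code.

```python
# A few entries from the table above, kept as class names (strings).
PRETRAINING_CLASS_NAMES = {
    "albert": "AlbertForPreTraining",
    "bert": "BertForPreTraining",
    "roberta": "RobertaForMaskedLM",
    "t5": "T5ForConditionalGeneration",
    "xlm-roberta": "XLMRobertaForMaskedLM",
}


def resolve_class_name(model_type=None, name_or_path=""):
    """Return the class name for a model_type, else guess from the name."""
    if model_type in PRETRAINING_CLASS_NAMES:
        return PRETRAINING_CLASS_NAMES[model_type]
    # Fallback: try longer keys first so "xlm-roberta" wins over "roberta"
    # and "albert" wins over "bert".
    for key in sorted(PRETRAINING_CLASS_NAMES, key=len, reverse=True):
        if key in name_or_path:
            return PRETRAINING_CLASS_NAMES[key]
    raise ValueError(f"Cannot determine a model class for {name_or_path!r}")


print(resolve_class_name("bert"))  # BertForPreTraining
print(resolve_class_name(None, "FacebookAI/xlm-roberta-base"))  # XLMRobertaForMaskedLM
```

Ordering the fallback by key length matters because several keys are substrings of others; a naive first-match scan could map an ALBERT checkpoint name to the BERT class.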

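The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples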

>>> from transformers import AutoConfig, AutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForPreTraining.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForPreTraining

class transformers.TFAutoModelForPreTraining

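( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).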
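
The pattern described here, a class whose constructor refuses to run and whose class methods return an instance of some other concrete class, can be sketched in plain Python. Everything below is a hypothetical toy (`ToyAutoModel`, `ToyBertConfig`, `ToyBertForPreTraining`); in particular, raising `OSError` is an assumption, since this page only says that an error is thrown.

```python
class ToyBertConfig:
    """Hypothetical configuration class."""


class ToyBertForPreTraining:
    """Hypothetical concrete model class built from a config."""

    def __init__(self, config):
        self.config = config


class ToyAutoModel:
    # Maps a configuration class to the concrete class to build.
    _mapping = {ToyBertConfig: ToyBertForPreTraining}

    def __init__(self, *args, **kwargs):
        # Direct construction is disallowed; callers must use a factory method.
        raise OSError(
            "ToyAutoModel is meant to be created with ToyAutoModel.from_config(config)."
        )

    @classmethod
    def from_config(cls, config):
        model_class = cls._mapping.get(type(config))
        if model_class is None:
            raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
        # Returns an instance of the concrete class, never of ToyAutoModel.
        return model_class(config)


model = ToyAutoModel.from_config(ToyBertConfig())
print(type(model).__name__)  # ToyBertForPreTraining
```

Note that the object returned by the factory is an instance of the concrete class, so `isinstance(model, ToyAutoModel)` is False in this sketch.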

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForPreTraining (ALBERT model)
- BartConfig configuration class: TFBartForConditionalGeneration (BART model)
- BertConfig configuration class: TFBertForPreTraining (BERT model)
- CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
- DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: TFElectraForPreTraining (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForPreTraining (Funnel Transformer model)
- GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
- IdeficsConfig configuration class: TFIdeficsForVisionText2Text (IDEFICS model)
- LayoutLMConfig configuration class: TFLayoutLMForMaskedLM (LayoutLM model)
- LxmertConfig configuration class: TFLxmertForPreTraining (LXMERT model)
- MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForPreTraining (MobileBERT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
- RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
- TapasConfig configuration class: TFTapasForMaskedLM (TAPAS model)
- TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
- ViTMAEConfig configuration class: TFViTMAEForPreTraining (ViTMAE model)
- XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)
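attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples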

>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForPreTraining.from_config(config)

from_pretrained

( *model_args **kwargs )

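Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g. ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g. not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a Git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by Git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a Git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by Git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.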
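
To make the "missing keys" and "unexpected keys" reported through output_loading_info more concrete, here is a framework-independent sketch that compares the parameter names a model expects with the names found in a checkpoint. The helper `loading_info` and the dictionary key names are illustrative assumptions, not the library's return format.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Compare parameter names the model expects with those in a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint
        # (such weights would be left at their fresh initialization).
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
    }


info = loading_info(
    expected_keys=["encoder.weight", "encoder.bias", "head.weight"],
    checkpoint_keys=["encoder.weight", "encoder.bias", "pooler.weight"],
)
print(info["missing_keys"])  # ['head.weight']
print(info["unexpected_keys"])  # ['pooler.weight']
```

A non-empty list of missing keys is common when a checkpoint trained for one task head is loaded into a class with a different head.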

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

albert — TFAlbertForPreTraining (ALBERT model)
bart — TFBartForConditionalGeneration (BART model)
bert — TFBertForPreTraining (BERT model)
camembert — TFCamembertForMaskedLM (CamemBERT model)
ctrl — TFCTRLLMHeadModel (CTRL model)
distilbert — TFDistilBertForMaskedLM (DistilBERT model)
electra — TFElectraForPreTraining (ELECTRA model)
flaubert — TFFlaubertWithLMHeadModel (FlauBERT model)
funnel — TFFunnelForPreTraining (Funnel Transformer model)
gpt-sw3 — TFGPT2LMHeadModel (GPT-Sw3 model)
gpt2 — TFGPT2LMHeadModel (OpenAI GPT-2 model)
idefics — TFIdeficsForVisionText2Text (IDEFICS model)
layoutlm — TFLayoutLMForMaskedLM (LayoutLM model)
lxmert — TFLxmertForPreTraining (LXMERT model)
mobilebert — TFMobileBertForPreTraining (MobileBERT model)
mpnet — TFMPNetForMaskedLM (MPNet model)
openai-gpt — TFOpenAIGPTLMHeadModel (OpenAI GPT model)
roberta — TFRobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
t5 — TFT5ForConditionalGeneration (T5 model)
tapas — TFTapasForMaskedLM (TAPAS model)
transfo-xl — TFTransfoXLLMHeadModel (Transformer-XL model)
vit_mae — TFViTMAEForPreTraining (ViTMAE model)
xlm — TFXLMWithLMHeadModel (XLM model)
xlm-roberta — TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
xlnet — TFXLNetLMHeadModel (XLNet model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForPreTraining

class transformers.FlaxAutoModelForPreTraining

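( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a pretraining head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).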

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForPreTraining (ALBERT model)
- BartConfig configuration class: FlaxBartForConditionalGeneration (BART model)
- BertConfig configuration class: FlaxBertForPreTraining (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForPreTraining (BigBird model)
- ElectraConfig configuration class: FlaxElectraForPreTraining (ELECTRA model)
- LongT5Config configuration class: FlaxLongT5ForConditionalGeneration (LongT5 model)
- MBartConfig configuration class: FlaxMBartForConditionalGeneration (mBART model)
- MT5Config configuration class: FlaxMT5ForConditionalGeneration (MT5 model)
- RoFormerConfig configuration class: FlaxRoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- T5Config configuration class: FlaxT5ForConditionalGeneration (T5 model)
- Wav2Vec2Config configuration class: FlaxWav2Vec2ForPreTraining (Wav2Vec2 model)
- WhisperConfig configuration class: FlaxWhisperForConditionalGeneration (Whisper model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)
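attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a pretraining head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples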

>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForPreTraining.from_config(config)

from_pretrained

( *model_args **kwargs )

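Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g. ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g. not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a Git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by Git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a Git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by Git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.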
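
The proxies example above mixes a protocol key ('http') with an endpoint key ('http://hostname'). One plausible reading, sketched below, is that the more specific endpoint key takes precedence over the protocol key. This precedence is an assumption borrowed from common HTTP client conventions, and `select_proxy` is a hypothetical helper written for this illustration.

```python
from urllib.parse import urlparse


def select_proxy(url, proxies):
    """Pick the proxy for a URL, preferring an endpoint key over a protocol key."""
    parts = urlparse(url)
    candidates = (
        f"{parts.scheme}://{parts.hostname}",  # endpoint-specific key
        parts.scheme,  # protocol-wide key
    )
    for key in candidates:
        if key in proxies:
            return proxies[key]
    return None  # no proxy configured for this URL


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}
print(select_proxy("http://hostname/model.bin", proxies))  # foo.bar:4012
print(select_proxy("http://other.example/model.bin", proxies))  # foo.bar:3128
print(select_proxy("https://huggingface.co/model.bin", proxies))  # None
```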

Instantiate one of the model classes of the library (with a pretraining head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

albert — FlaxAlbertForPreTraining (ALBERT model)
bart — FlaxBartForConditionalGeneration (BART model)
bert — FlaxBertForPreTraining (BERT model)
big_bird — FlaxBigBirdForPreTraining (BigBird model)
electra — FlaxElectraForPreTraining (ELECTRA model)
longt5 — FlaxLongT5ForConditionalGeneration (LongT5 model)
mbart — FlaxMBartForConditionalGeneration (mBART model)
mt5 — FlaxMT5ForConditionalGeneration (MT5 model)
roberta — FlaxRobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForMaskedLM (RoFormer model)
t5 — FlaxT5ForConditionalGeneration (T5 model)
wav2vec2 — FlaxWav2Vec2ForPreTraining (Wav2Vec2 model)
whisper — FlaxWhisperForConditionalGeneration (Whisper model)
xlm-roberta — FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForPreTraining

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForPreTraining.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForPreTraining.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

Natural Language Processing

The following auto classes are available for the following natural language processing tasks.

AutoModelForCausalLM

class transformers.AutoModelForCausalLM

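( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).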

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- ArceeConfig configuration class: ArceeForCausalLM (Arcee model)
- AriaTextConfig configuration class: AriaTextForCausalLM (AriaText model)
- BambaConfig configuration class: BambaForCausalLM (Bamba model)
- BartConfig configuration class: BartForCausalLM (BART model)
- BertConfig configuration class: BertLMHeadModel (BERT model)
- BertGenerationConfig configuration class: BertGenerationDecoder (Bert Generation model)
- BigBirdConfig configuration class: BigBirdForCausalLM (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForCausalLM (BigBird-Pegasus model)
- BioGptConfig configuration class: BioGptForCausalLM (BioGpt model)
- BitNetConfig configuration class: BitNetForCausalLM (BitNet model)
- BlenderbotConfig configuration class: BlenderbotForCausalLM (Blenderbot model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallForCausalLM (BlenderbotSmall model)
- BloomConfig configuration class: BloomForCausalLM (BLOOM model)
- CTRLConfig configuration class: CTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: CamembertForCausalLM (CamemBERT model)
- CodeGenConfig configuration class: CodeGenForCausalLM (CodeGen model)
- Cohere2Config configuration class: Cohere2ForCausalLM (Cohere2 model)
- CohereConfig configuration class: CohereForCausalLM (Cohere model)
- CpmAntConfig configuration class: CpmAntForCausalLM (CPM-Ant model)
- Data2VecTextConfig configuration class: Data2VecTextForCausalLM (Data2VecText model)
- DbrxConfig configuration class: DbrxForCausalLM (DBRX model)
- DeepseekV3Config configuration class: DeepseekV3ForCausalLM (DeepSeek-V3 model)
- DiffLlamaConfig configuration class: DiffLlamaForCausalLM (DiffLlama model)
- Dots1Config configuration class: Dots1ForCausalLM (dots1 model)
- ElectraConfig configuration class: ElectraForCausalLM (ELECTRA model)
- Emu3Config configuration class: Emu3ForCausalLM (Emu3 model)
- ErnieConfig configuration class: ErnieForCausalLM (ERNIE model)
- FalconConfig configuration class: FalconForCausalLM (Falcon model)
- FalconH1Config configuration class: FalconH1ForCausalLM (FalconH1 model)
- FalconMambaConfig configuration class: FalconMambaForCausalLM (FalconMamba model)
- FuyuConfig configuration class: FuyuForCausalLM (Fuyu model)
- GPT2Config configuration class: GPT2LMHeadModel (OpenAI GPT-2 model)
- GPTBigCodeConfig configuration class: GPTBigCodeForCausalLM (GPTBigCode model)
- GPTJConfig configuration class: GPTJForCausalLM (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoForCausalLM (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXForCausalLM (GPT NeoX model)
- GPTNeoXJapaneseConfig configuration class: GPTNeoXJapaneseForCausalLM (GPT NeoX Japanese model)
- Gemma2Config configuration class: Gemma2ForCausalLM (Gemma2 model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
- Gemma3TextConfig configuration class: Gemma3ForCausalLM (Gemma3ForCausalLM model)
- Gemma3nConfig configuration class: Gemma3nForConditionalGeneration (Gemma3nForConditionalGeneration model)
- Gemma3nTextConfig configuration class: Gemma3nForCausalLM (Gemma3nForCausalLM model)
- GemmaConfig configuration class: GemmaForCausalLM (Gemma model)
- GitConfig configuration class: GitForCausalLM (GIT model)
- Glm4Config configuration class: Glm4ForCausalLM (GLM4 model)
- GlmConfig configuration class: GlmForCausalLM (GLM model)
- GotOcr2Config configuration class: GotOcr2ForConditionalGeneration (GOT-OCR2 model)
- GraniteConfig configuration class: GraniteForCausalLM (Granite model)
- GraniteMoeConfig configuration class: GraniteMoeForCausalLM (GraniteMoeMoe model)
- GraniteMoeHybridConfig configuration class: GraniteMoeHybridForCausalLM (GraniteMoeHybrid model)
- GraniteMoeSharedConfig configuration class: GraniteMoeSharedForCausalLM (GraniteMoeSharedMoe model)
- HeliumConfig configuration class: HeliumForCausalLM (Helium model)
- JambaConfig configuration class: JambaForCausalLM (Jamba model)
- JetMoeConfig configuration class: JetMoeForCausalLM (JetMoe model)
- Llama4Config configuration class: Llama4ForCausalLM (Llama4 model)
- Llama4TextConfig configuration class: Llama4ForCausalLM (Llama4ForCausalLM model)
- LlamaConfig configuration class: LlamaForCausalLM (LLaMA model)
- MBartConfig configuration class: MBartForCausalLM (mBART model)
- Mamba2Config configuration class: Mamba2ForCausalLM (mamba2 model)
- MambaConfig configuration class: MambaForCausalLM (Mamba model)
- MarianConfig configuration class: MarianForCausalLM (Marian model)
- MegaConfig configuration class: MegaForCausalLM (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForCausalLM (Megatron-BERT model)
- MiniMaxConfig configuration class: MiniMaxForCausalLM (MiniMax model)
- MistralConfig configuration class: MistralForCausalLM (Mistral model)
- MixtralConfig configuration class: MixtralForCausalLM (Mixtral model)
- MllamaConfig configuration class: MllamaForCausalLM (Mllama model)
- MoshiConfig configuration class: MoshiForCausalLM (Moshi model)
- MptConfig configuration class: MptForCausalLM (MPT model)
- MusicgenConfig configuration class: MusicgenForCausalLM (MusicGen model)
- MusicgenMelodyConfig configuration class: MusicgenMelodyForCausalLM (MusicGen Melody model)
- MvpConfig configuration class: MvpForCausalLM (MVP model)
- NemotronConfig configuration class: NemotronForCausalLM (Nemotron model)
- OPTConfig configuration class: OPTForCausalLM (OPT model)
- Olmo2Config configuration class: Olmo2ForCausalLM (OLMo2 model)
- OlmoConfig configuration class: OlmoForCausalLM (OLMo model)
- OlmoeConfig configuration class: OlmoeForCausalLM (OLMoE model)
- OpenAIGPTConfig configuration class: OpenAIGPTLMHeadModel (OpenAI GPT model)
- OpenLlamaConfig configuration class: OpenLlamaForCausalLM (OpenLlama model)
- PLBartConfig configuration class: PLBartForCausalLM (PLBart model)
- PegasusConfig configuration class: PegasusForCausalLM (Pegasus model)
- PersimmonConfig configuration class: PersimmonForCausalLM (Persimmon model)
- Phi3Config configuration class: Phi3ForCausalLM (Phi3 model)
- Phi4MultimodalConfig configuration class: Phi4MultimodalForCausalLM (Phi4Multimodal model)
- PhiConfig configuration class: PhiForCausalLM (Phi model)
- PhimoeConfig configuration class: PhimoeForCausalLM (Phimoe model)
- ProphetNetConfig configuration class: ProphetNetForCausalLM (ProphetNet model)
- QDQBertConfig configuration class: QDQBertLMHeadModel (QDQBert model)
- Qwen2Config configuration class: Qwen2ForCausalLM (Qwen2 model)
- Qwen2MoeConfig configuration class: Qwen2MoeForCausalLM (Qwen2MoE model)
- Qwen3Config configuration class: Qwen3ForCausalLM (Qwen3 model)
- Qwen3MoeConfig configuration class: Qwen3MoeForCausalLM (Qwen3MoE model)
- RecurrentGemmaConfig configuration class: RecurrentGemmaForCausalLM (RecurrentGemma model)
- ReformerConfig configuration class: ReformerModelWithLMHead (Reformer model)
- RemBertConfig configuration class: RemBertForCausalLM (RemBERT model)
- RoCBertConfig configuration class: RoCBertForCausalLM (RoCBert model)
- RoFormerConfig configuration class: RoFormerForCausalLM (RoFormer model)
- RobertaConfig configuration class: RobertaForCausalLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
- RwkvConfig configuration class: RwkvForCausalLM (RWKV model)
- SmolLM3Config configuration class: SmolLM3ForCausalLM (SmolLM3 model)
- Speech2Text2Config configuration class: Speech2Text2ForCausalLM (Speech2Text2 model)
- StableLmConfig configuration class: StableLmForCausalLM (StableLm model)
- Starcoder2Config configuration class: Starcoder2ForCausalLM (Starcoder2 model)
- TrOCRConfig configuration class: TrOCRForCausalLM (TrOCR model)
- TransfoXLConfig configuration class: TransfoXLLMHeadModel (Transformer-XL model)
- WhisperConfig configuration class: WhisperForCausalLM (Whisper model)
- XGLMConfig configuration class: XGLMForCausalLM (XGLM model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
- XLMProphetNetConfig configuration class: XLMProphetNetForCausalLM (XLM-ProphetNet model)
- XLMRobertaConfig configuration class: XLMRobertaForCausalLM (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForCausalLM (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetLMHeadModel (XLNet model)
- XmodConfig configuration class: XmodForCausalLM (X-MOD model)
- Zamba2Config configuration class: Zamba2ForCausalLM (Zamba2 model)
- ZambaConfig configuration class: ZambaForCausalLM (Zamba model)
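attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples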

>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForCausalLM.from_config(config)

from_pretrained

( *model_args **kwargs )

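Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g. not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a Git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by Git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a Git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by Git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.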

Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

arcee — ArceeForCausalLM (Arcee model)
aria_text — AriaTextForCausalLM (AriaText model)
bamba — BambaForCausalLM (Bamba model)
bart — BartForCausalLM (BART model)
bert — BertLMHeadModel (BERT model)
bert-generation — BertGenerationDecoder (Bert Generation model)
big_bird — BigBirdForCausalLM (BigBird model)
bigbird_pegasus — BigBirdPegasusForCausalLM (BigBird-Pegasus model)
biogpt — BioGptForCausalLM (BioGpt model)
bitnet — BitNetForCausalLM (BitNet model)
blenderbot — BlenderbotForCausalLM (Blenderbot model)
blenderbot-small — BlenderbotSmallForCausalLM (BlenderbotSmall model)
bloom — BloomForCausalLM (BLOOM model)
camembert — CamembertForCausalLM (CamemBERT model)
code_llama — LlamaForCausalLM (CodeLlama model)
codegen — CodeGenForCausalLM (CodeGen model)
cohere — CohereForCausalLM (Cohere model)
cohere2 — Cohere2ForCausalLM (Cohere2 model)
cpmant — CpmAntForCausalLM (CPM-Ant model)
ctrl — CTRLLMHeadModel (CTRL model)
data2vec-text — Data2VecTextForCausalLM (Data2VecText model)
dbrx — DbrxForCausalLM (DBRX model)
deepseek_v3 — DeepseekV3ForCausalLM (DeepSeek-V3 model)
diffllama — DiffLlamaForCausalLM (DiffLlama model)
dots1 — Dots1ForCausalLM (dots1 model)
electra — ElectraForCausalLM (ELECTRA model)
emu3 — Emu3ForCausalLM (Emu3 model)
ernie — ErnieForCausalLM (ERNIE model)
falcon — FalconForCausalLM (Falcon model)
falcon_h1 — FalconH1ForCausalLM (FalconH1 model)
falcon_mamba — FalconMambaForCausalLM (FalconMamba model)
fuyu — FuyuForCausalLM (Fuyu model)
gemma — GemmaForCausalLM (Gemma model)
gemma2 — Gemma2ForCausalLM (Gemma2 model)
gemma3 — Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
gemma3_text — Gemma3ForCausalLM (Gemma3ForCausalLM model)
gemma3n — Gemma3nForConditionalGeneration (Gemma3nForConditionalGeneration model)
gemma3n_text — Gemma3nForCausalLM (Gemma3nForCausalLM model)
git — GitForCausalLM (GIT model)
glm — GlmForCausalLM (GLM model)
glm4 — Glm4ForCausalLM (GLM4 model)
got_ocr2 — GotOcr2ForConditionalGeneration (GOT-OCR2 model)
gpt-sw3 — GPT2LMHeadModel (GPT-Sw3 model)
gpt2 — GPT2LMHeadModel (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeForCausalLM (GPTBigCode model)
gpt_neo — GPTNeoForCausalLM (GPT Neo model)
gpt_neox — GPTNeoXForCausalLM (GPT NeoX model)
gpt_neox_japanese — GPTNeoXJapaneseForCausalLM (GPT NeoX Japanese model)
gptj — GPTJForCausalLM (GPT-J model)
granite — GraniteForCausalLM (Granite model)
granitemoe — GraniteMoeForCausalLM (GraniteMoeMoe model)
granitemoehybrid — GraniteMoeHybridForCausalLM (GraniteMoeHybrid model)
granitemoeshared — GraniteMoeSharedForCausalLM (GraniteMoeSharedMoe model)
helium — HeliumForCausalLM (Helium model)
jamba — JambaForCausalLM (Jamba model)
jetmoe — JetMoeForCausalLM (JetMoe model)
llama — LlamaForCausalLM (LLaMA model)
llama4 — Llama4ForCausalLM (Llama4 model)
llama4_text — Llama4ForCausalLM (Llama4ForCausalLM model)
mamba — MambaForCausalLM (Mamba model)
mamba2 — Mamba2ForCausalLM (mamba2 model)
marian — MarianForCausalLM (Marian model)
mbart — MBartForCausalLM (mBART model)
mega — MegaForCausalLM (MEGA model)
megatron-bert — MegatronBertForCausalLM (Megatron-BERT model)
minimax — MiniMaxForCausalLM (MiniMax model)
mistral — MistralForCausalLM (Mistral model)
mixtral — MixtralForCausalLM (Mixtral model)
mllama — MllamaForCausalLM (Mllama model)
moshi — MoshiForCausalLM (Moshi model)
mpt — MptForCausalLM (MPT model)
musicgen — MusicgenForCausalLM (MusicGen model)
musicgen_melody — MusicgenMelodyForCausalLM (MusicGen Melody model)
mvp — MvpForCausalLM (MVP model)
nemotron — NemotronForCausalLM (Nemotron model)
olmo — OlmoForCausalLM (OLMo model)
olmo2 — Olmo2ForCausalLM (OLMo2 model)
olmoe — OlmoeForCausalLM (OLMoE model)
open-llama — OpenLlamaForCausalLM (OpenLlama model)
openai-gpt — OpenAIGPTLMHeadModel (OpenAI GPT model)
opt — OPTForCausalLM (OPT model)
pegasus — PegasusForCausalLM (Pegasus model)
persimmon — PersimmonForCausalLM (Persimmon model)
phi — PhiForCausalLM (Phi model)
phi3 — Phi3ForCausalLM (Phi3 model)
phi4_multimodal — Phi4MultimodalForCausalLM (Phi4Multimodal model)
phimoe — PhimoeForCausalLM (Phimoe model)
plbart — PLBartForCausalLM (PLBart model)
prophetnet — ProphetNetForCausalLM (ProphetNet model)
qdqbert — QDQBertLMHeadModel (QDQBert model)
qwen2 — Qwen2ForCausalLM (Qwen2 model)
qwen2_moe — Qwen2MoeForCausalLM (Qwen2MoE model)
qwen3 — Qwen3ForCausalLM (Qwen3 model)
qwen3_moe — Qwen3MoeForCausalLM (Qwen3MoE model)
recurrent_gemma — RecurrentGemmaForCausalLM (RecurrentGemma model)
reformer — ReformerModelWithLMHead (Reformer model)
rembert — RemBertForCausalLM (RemBERT model)
roberta — RobertaForCausalLM (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForCausalLM (RoCBert model)
roformer — RoFormerForCausalLM (RoFormer model)
rwkv — RwkvForCausalLM (RWKV model)
smollm3 — SmolLM3ForCausalLM (SmolLM3 model)
speech_to_text_2 — Speech2Text2ForCausalLM (Speech2Text2 model)
stablelm — StableLmForCausalLM (StableLm model)
starcoder2 — Starcoder2ForCausalLM (Starcoder2 model)
transfo-xl — TransfoXLLMHeadModel (Transformer-XL model)
trocr — TrOCRForCausalLM (TrOCR model)
whisper — WhisperForCausalLM (Whisper model)
xglm — XGLMForCausalLM (XGLM model)
xlm — XLMWithLMHeadModel (XLM model)
xlm-prophetnet — XLMProphetNetForCausalLM (XLM-ProphetNet model)
xlm-roberta — XLMRobertaForCausalLM (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForCausalLM (XLM-RoBERTa-XL model)
xlnet — XLNetLMHeadModel (XLNet model)
xmod — XmodForCausalLM (X-MOD model)
zamba — ZambaForCausalLM (Zamba model)
zamba2 — Zamba2ForCausalLM (Zamba2 model)
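
The list above is essentially a lookup table keyed by model type. The sketch below illustrates that idea with a handful of entries copied from the list. `resolve_class_name` is a hypothetical helper, and the name-based fallback (longest key first) is an assumption made for the illustration, not the library's exact matching rule.

```python
# A few entries copied from the list above: model_type -> model class name.
CAUSAL_LM_CLASS_NAMES = {
    "bert": "BertLMHeadModel",
    "code_llama": "LlamaForCausalLM",
    "gpt-sw3": "GPT2LMHeadModel",
    "gpt2": "GPT2LMHeadModel",
    "llama": "LlamaForCausalLM",
}


def resolve_class_name(config: dict, name_or_path: str) -> str:
    """Pick a class name from model_type, else by matching the checkpoint name."""
    model_type = config.get("model_type")
    if model_type is not None:
        if model_type not in CAUSAL_LM_CLASS_NAMES:
            raise ValueError(f"Unsupported model_type: {model_type!r}")
        return CAUSAL_LM_CLASS_NAMES[model_type]
    # Fallback: look for a known key inside the name, longest key first so
    # that "code_llama" is tried before "llama".
    for key in sorted(CAUSAL_LM_CLASS_NAMES, key=len, reverse=True):
        if key in name_or_path:
            return CAUSAL_LM_CLASS_NAMES[key]
    raise ValueError(f"Could not infer a model class for {name_or_path!r}")


print(resolve_class_name({"model_type": "gpt2"}, "any-name"))  # GPT2LMHeadModel
print(resolve_class_name({}, "my-org/code_llama-7b"))  # LlamaForCausalLM
```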

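By default, the model is put in evaluation mode with model.eval() (so, for instance, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().

Examples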

>>> from transformers import AutoConfig, AutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCausalLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForCausalLM

class transformers.TFAutoModelForCausalLM

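( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).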
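
As a rough picture of this pattern, the sketch below shows a factory class whose `__init__` always raises and whose class method returns an instance of a different, concrete class chosen from the configuration's type. All `Toy*` names are hypothetical, and the exception type is an assumption chosen for the illustration.

```python
class ToyBertConfig:
    """Hypothetical configuration class."""


class ToyBertLMHeadModel:
    """Hypothetical concrete model class."""

    def __init__(self, config):
        self.config = config


class ToyAutoModelForCausalLM:
    # Maps a configuration class to the concrete model class it selects.
    _model_mapping = {ToyBertConfig: ToyBertLMHeadModel}

    def __init__(self, *args, **kwargs):
        # The factory itself is never a usable object.
        raise OSError(
            f"{type(self).__name__} is designed to be instantiated with "
            "from_pretrained() or from_config()."
        )

    @classmethod
    def from_config(cls, config):
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            raise ValueError(
                f"Unrecognized configuration class {type(config).__name__}"
            )
        return model_class(config)


model = ToyAutoModelForCausalLM.from_config(ToyBertConfig())
print(type(model).__name__)  # ToyBertLMHeadModel

try:
    ToyAutoModelForCausalLM()
except OSError as err:
    print("direct instantiation failed:", err)
```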

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: TFBertLMHeadModel (BERT model)
- CTRLConfig configuration class: TFCTRLLMHeadModel (CTRL model)
- CamembertConfig configuration class: TFCamembertForCausalLM (CamemBERT model)
- GPT2Config configuration class: TFGPT2LMHeadModel (OpenAI GPT-2 model)
- GPTJConfig configuration class: TFGPTJForCausalLM (GPT-J model)
- MistralConfig configuration class: TFMistralForCausalLM (Mistral model)
- OPTConfig configuration class: TFOPTForCausalLM (OPT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTLMHeadModel (OpenAI GPT model)
- RemBertConfig configuration class: TFRemBertForCausalLM (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForCausalLM (RoFormer model)
- RobertaConfig configuration class: TFRobertaForCausalLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
- TransfoXLConfig configuration class: TFTransfoXLLMHeadModel (Transformer-XL model)
- XGLMConfig configuration class: TFXGLMForCausalLM (XGLM model)
- XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForCausalLM (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetLMHeadModel (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA is used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

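Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples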

>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForCausalLM.from_config(config)

from_pretrained

( *model_args **kwargs )

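Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs are passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been made).
- If a configuration is not provided, kwargs are first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.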
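
The interplay between the config parameter and a local directory can be sketched as follows. This is a simplified illustration restricted to local directories: `resolve_config` is a hypothetical helper, and plain dictionaries stand in for configuration objects.

```python
import json
import tempfile
from pathlib import Path


def resolve_config(pretrained_path, config=None):
    """Return the explicit config if given, else load config.json from the directory."""
    if config is not None:
        return config  # an explicit config wins; nothing is loaded automatically
    config_file = Path(pretrained_path) / "config.json"
    if config_file.is_file():
        return json.loads(config_file.read_text(encoding="utf-8"))
    raise FileNotFoundError(
        f"No config.json found in {str(pretrained_path)!r} and no config was given"
    )


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text(
        json.dumps({"model_type": "bert"}), encoding="utf-8"
    )
    auto_loaded = resolve_config(tmp)
    explicit = resolve_config(tmp, config={"model_type": "gpt2"})

print(auto_loaded)  # {'model_type': 'bert'}
print(explicit)  # {'model_type': 'gpt2'}
```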

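Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

bert — TFBertLMHeadModel (BERT model)
camembert — TFCamembertForCausalLM (CamemBERT model)
ctrl — TFCTRLLMHeadModel (CTRL model)
gpt-sw3 — TFGPT2LMHeadModel (GPT-Sw3 model)
gpt2 — TFGPT2LMHeadModel (OpenAI GPT-2 model)
gptj — TFGPTJForCausalLM (GPT-J model)
mistral — TFMistralForCausalLM (Mistral model)
openai-gpt — TFOpenAIGPTLMHeadModel (OpenAI GPT model)
opt — TFOPTForCausalLM (OPT model)
rembert — TFRemBertForCausalLM (RemBERT model)
roberta — TFRobertaForCausalLM (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForCausalLM (RoFormer model)
transfo-xl — TFTransfoXLLMHeadModel (Transformer-XL model)
xglm — TFXGLMForCausalLM (XGLM model)
xlm — TFXLMWithLMHeadModel (XLM model)
xlm-roberta — TFXLMRobertaForCausalLM (XLM-RoBERTa model)
xlnet — TFXLNetLMHeadModel (XLNet model)

Examples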

>>> from transformers import AutoConfig, TFAutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForCausalLM

class transformers.FlaxAutoModelForCausalLM

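( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a causal language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).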

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: FlaxBartForCausalLM (BART model)
- BertConfig configuration class: FlaxBertForCausalLM (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForCausalLM (BigBird model)
- BloomConfig configuration class: FlaxBloomForCausalLM (BLOOM model)
- ElectraConfig configuration class: FlaxElectraForCausalLM (ELECTRA model)
- GPT2Config configuration class: FlaxGPT2LMHeadModel (OpenAI GPT-2 model)
- GPTJConfig configuration class: FlaxGPTJForCausalLM (GPT-J model)
- GPTNeoConfig configuration class: FlaxGPTNeoForCausalLM (GPT Neo model)
- GemmaConfig configuration class: FlaxGemmaForCausalLM (Gemma model)
- LlamaConfig configuration class: FlaxLlamaForCausalLM (LLaMA model)
- MistralConfig configuration class: FlaxMistralForCausalLM (Mistral model)
- OPTConfig configuration class: FlaxOPTForCausalLM (OPT model)
- RobertaConfig configuration class: FlaxRobertaForCausalLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
- XGLMConfig configuration class: FlaxXGLMForCausalLM (XGLM model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForCausalLM (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA is used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

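Instantiates one of the model classes of the library (with a causal language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples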

>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForCausalLM.from_config(config)

from_pretrained

( *model_args **kwargs )

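Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a Flax model with the provided conversion scripts and loading the Flax model afterwards.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs are passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been made).
- If a configuration is not provided, kwargs are first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.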
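
The proxies dictionary mixes two kinds of keys: a bare protocol ('http') and a protocol plus host ('http://hostname'). The sketch below shows one plausible way such a dictionary is consulted for a given URL. `pick_proxy` is a hypothetical helper, and the precedence (endpoint-specific entry before protocol-wide entry) is an assumption modeled on how the requests library selects proxies.

```python
from urllib.parse import urlparse


def pick_proxy(url, proxies):
    """Return the proxy for url, preferring the most specific matching key."""
    parts = urlparse(url)
    candidates = (
        f"{parts.scheme}://{parts.hostname}",  # endpoint-specific entry
        parts.scheme,  # protocol-wide entry
    )
    for key in candidates:
        if key in proxies:
            return proxies[key]
    return None  # no proxy configured for this URL


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}
print(pick_proxy("http://hostname/model.bin", proxies))  # foo.bar:4012
print(pick_proxy("http://other.example/model.bin", proxies))  # foo.bar:3128
print(pick_proxy("https://hostname/model.bin", proxies))  # None
```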

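Instantiate one of the model classes of the library (with a causal language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

bart — FlaxBartForCausalLM (BART model)
bert — FlaxBertForCausalLM (BERT model)
big_bird — FlaxBigBirdForCausalLM (BigBird model)
bloom — FlaxBloomForCausalLM (BLOOM model)
electra — FlaxElectraForCausalLM (ELECTRA model)
gemma — FlaxGemmaForCausalLM (Gemma model)
gpt-sw3 — FlaxGPT2LMHeadModel (GPT-Sw3 model)
gpt2 — FlaxGPT2LMHeadModel (OpenAI GPT-2 model)
gpt_neo — FlaxGPTNeoForCausalLM (GPT Neo model)
gptj — FlaxGPTJForCausalLM (GPT-J model)
llama — FlaxLlamaForCausalLM (LLaMA model)
mistral — FlaxMistralForCausalLM (Mistral model)
opt — FlaxOPTForCausalLM (OPT model)
roberta — FlaxRobertaForCausalLM (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForCausalLM (RoBERTa-PreLayerNorm model)
xglm — FlaxXGLMForCausalLM (XGLM model)
xlm-roberta — FlaxXLMRobertaForCausalLM (XLM-RoBERTa model)

Examples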

>>> from transformers import AutoConfig, FlaxAutoModelForCausalLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForCausalLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForCausalLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForMaskedLM

class transformers.AutoModelForMaskedLM

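( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).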

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForMaskedLM (ALBERT model)
- BartConfig configuration class: BartForConditionalGeneration (BART model)
- BertConfig configuration class: BertForMaskedLM (BERT model)
- BigBirdConfig configuration class: BigBirdForMaskedLM (BigBird model)
- CamembertConfig configuration class: CamembertForMaskedLM (CamemBERT model)
- ConvBertConfig configuration class: ConvBertForMaskedLM (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForMaskedLM (Data2VecText model)
- DebertaConfig configuration class: DebertaForMaskedLM (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForMaskedLM (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: ElectraForMaskedLM (ELECTRA model)
- ErnieConfig configuration class: ErnieForMaskedLM (ERNIE model)
- EsmConfig configuration class: EsmForMaskedLM (ESM model)
- FNetConfig configuration class: FNetForMaskedLM (FNet model)
- FlaubertConfig configuration class: FlaubertWithLMHeadModel (FlauBERT model)
- FunnelConfig configuration class: FunnelForMaskedLM (Funnel Transformer model)
- IBertConfig configuration class: IBertForMaskedLM (I-BERT model)
- LayoutLMConfig configuration class: LayoutLMForMaskedLM (LayoutLM model)
- LongformerConfig configuration class: LongformerForMaskedLM (Longformer model)
- LukeConfig configuration class: LukeForMaskedLM (LUKE model)
- MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
- MPNetConfig configuration class: MPNetForMaskedLM (MPNet model)
- MegaConfig configuration class: MegaForMaskedLM (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForMaskedLM (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForMaskedLM (MobileBERT model)
- ModernBertConfig configuration class: ModernBertForMaskedLM (ModernBERT model)
- MraConfig configuration class: MraForMaskedLM (MRA model)
- MvpConfig configuration class: MvpForConditionalGeneration (MVP model)
- NezhaConfig configuration class: NezhaForMaskedLM (Nezha model)
- NystromformerConfig configuration class: NystromformerForMaskedLM (Nyströmformer model)
- PerceiverConfig configuration class: PerceiverForMaskedLM (Perceiver model)
- QDQBertConfig configuration class: QDQBertForMaskedLM (QDQBert model)
- ReformerConfig configuration class: ReformerForMaskedLM (Reformer model)
- RemBertConfig configuration class: RemBertForMaskedLM (RemBERT model)
- RoCBertConfig configuration class: RoCBertForMaskedLM (RoCBert model)
- RoFormerConfig configuration class: RoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: RobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- SqueezeBertConfig configuration class: SqueezeBertForMaskedLM (SqueezeBERT model)
- TapasConfig configuration class: TapasForMaskedLM (TAPAS model)
- Wav2Vec2Config configuration class: Wav2Vec2ForMaskedLM (Wav2Vec2 model)
- XLMConfig configuration class: XLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForMaskedLM (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
- XmodConfig configuration class: XmodForMaskedLM (X-MOD model)
- YosoConfig configuration class: YosoForMaskedLM (YOSO model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA is used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.
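
The default described for attn_implementation amounts to a version check with a fallback. The sketch below spells that rule out. `default_attn_implementation` is a hypothetical helper written for this illustration, not a function of the library, and the `sdpa_supported` flag stands in for whatever per-model availability check is actually performed.

```python
import re


def default_attn_implementation(torch_version: str, sdpa_supported: bool) -> str:
    """Pick "sdpa" when it is supported on torch>=2.1.1, otherwise "eager"."""
    # Keep only the numeric release part, e.g. "2.3.0+cu121" -> (2, 3, 0).
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", torch_version)
    if match is None:
        return "eager"  # unknown version string: stay on the safe default
    release = tuple(int(part) for part in match.groups())
    if sdpa_supported and release >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.1.1", True))  # sdpa
print(default_attn_implementation("2.1.0", True))  # eager
print(default_attn_implementation("2.3.0+cu121", False))  # eager
```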

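Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.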
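
The difference stated in the note can be seen in a toy model. In this sketch, `ToyModel` and its file layout are hypothetical: building from a configuration yields freshly initialized weights, while loading from a saved directory restores the stored ones.

```python
import json
import random
import tempfile
from pathlib import Path


class ToyModel:
    """Hypothetical model whose weights are a plain list of floats."""

    def __init__(self, config):
        self.config = config
        rng = random.Random()  # fresh random initialization on every construction
        self.weights = [rng.gauss(0.0, 0.02) for _ in range(config["hidden_size"])]

    def save_pretrained(self, directory):
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        (directory / "config.json").write_text(json.dumps(self.config))
        (directory / "weights.json").write_text(json.dumps(self.weights))

    @classmethod
    def from_config(cls, config):
        # Architecture only: the weights stay randomly initialized.
        return cls(config)

    @classmethod
    def from_pretrained(cls, directory):
        directory = Path(directory)
        model = cls(json.loads((directory / "config.json").read_text()))
        model.weights = json.loads((directory / "weights.json").read_text())
        return model


with tempfile.TemporaryDirectory() as tmp:
    trained = ToyModel({"hidden_size": 4})
    trained.save_pretrained(tmp)
    restored = ToyModel.from_pretrained(tmp)
    fresh = ToyModel.from_config(trained.config)

print(restored.weights == trained.weights)  # True
print(fresh.config == trained.config)  # True: same architecture, new weights
```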

Examples

>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMaskedLM.from_config(config)

from_pretrained

( *model_args **kwargs )

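Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the one loaded from the saved weights file. This option is useful if you want to create a model from a pretrained configuration but load your own weights. In that case, check first whether using save_pretrained() and from_pretrained() would not be simpler.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs are passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been made).
- If a configuration is not provided, kwargs are first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.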
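
The dictionary returned with output_loading_info=True reports how the checkpoint's parameter names line up with the model's. A minimal sketch of that comparison follows. `loading_info` and the exact dictionary keys are assumptions for this illustration, and the parameter names are made up.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Compare the parameter names a model expects with those in a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        "missing_keys": sorted(expected - found),  # expected but absent from the file
        "unexpected_keys": sorted(found - expected),  # in the file but unused
        "error_msgs": [],  # this sketch never produces load errors
    }


info = loading_info(
    expected_keys=["encoder.weight", "lm_head.weight", "lm_head.bias"],
    checkpoint_keys=["encoder.weight", "pooler.weight"],
)
print(info["missing_keys"])  # ['lm_head.bias', 'lm_head.weight']
print(info["unexpected_keys"])  # ['pooler.weight']
```

A non-empty list of missing keys typically means that part of the model, for example a newly added head, starts from random initialization.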

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

albert — AlbertForMaskedLM (ALBERT model)
bart — BartForConditionalGeneration (BART model)
bert — BertForMaskedLM (BERT model)
big_bird — BigBirdForMaskedLM (BigBird model)
camembert — CamembertForMaskedLM (CamemBERT model)
convbert — ConvBertForMaskedLM (ConvBERT model)
data2vec-text — Data2VecTextForMaskedLM (Data2VecText model)
deberta — DebertaForMaskedLM (DeBERTa model)
deberta-v2 — DebertaV2ForMaskedLM (DeBERTa-v2 model)
distilbert — DistilBertForMaskedLM (DistilBERT model)
electra — ElectraForMaskedLM (ELECTRA model)
ernie — ErnieForMaskedLM (ERNIE model)
esm — EsmForMaskedLM (ESM model)
flaubert — FlaubertWithLMHeadModel (FlauBERT model)
fnet — FNetForMaskedLM (FNet model)
funnel — FunnelForMaskedLM (Funnel Transformer model)
ibert — IBertForMaskedLM (I-BERT model)
layoutlm — LayoutLMForMaskedLM (LayoutLM model)
longformer — LongformerForMaskedLM (Longformer model)
luke — LukeForMaskedLM (LUKE model)
mbart — MBartForConditionalGeneration (mBART model)
mega — MegaForMaskedLM (MEGA model)
megatron-bert — MegatronBertForMaskedLM (Megatron-BERT model)
mobilebert — MobileBertForMaskedLM (MobileBERT model)
modernbert — ModernBertForMaskedLM (ModernBERT model)
mpnet — MPNetForMaskedLM (MPNet model)
mra — MraForMaskedLM (MRA model)
mvp — MvpForConditionalGeneration (MVP model)
nezha — NezhaForMaskedLM (Nezha model)
nystromformer — NystromformerForMaskedLM (Nyströmformer model)
perceiver — PerceiverForMaskedLM (Perceiver model)
qdqbert — QDQBertForMaskedLM (QDQBert model)
reformer — ReformerForMaskedLM (Reformer model)
rembert — RemBertForMaskedLM (RemBERT model)
roberta — RobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForMaskedLM (RoCBert model)
roformer — RoFormerForMaskedLM (RoFormer model)
squeezebert — SqueezeBertForMaskedLM (SqueezeBERT model)
tapas — TapasForMaskedLM (TAPAS model)
wav2vec2 — Wav2Vec2ForMaskedLM (Wav2Vec2 model)
xlm — XLMWithLMHeadModel (XLM model)
xlm-roberta — XLMRobertaForMaskedLM (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForMaskedLM (XLM-RoBERTa-XL model)
xmod — XmodForMaskedLM (X-MOD model)
yoso — YosoForMaskedLM (YOSO model)

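By default, the model is put in evaluation mode with model.eval() (so, for instance, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().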
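
To make the effect of the two modes concrete, here is a dependency-free toy dropout layer. `ToyDropout` is a hypothetical class that only mimics the train()/eval() switch; it is not the PyTorch module.

```python
import random


class ToyDropout:
    """Hypothetical dropout layer with a train/eval switch."""

    def __init__(self, p=0.5, seed=0):
        self.p = p  # probability of zeroing a value, must be < 1
        self.training = True
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # evaluation mode: dropout is disabled
        scale = 1.0 / (1.0 - self.p)  # rescale survivors to keep the expected value
        return [v * scale if self._rng.random() >= self.p else 0.0 for v in values]


layer = ToyDropout(p=0.5)
inputs = [1.0, 2.0, 3.0, 4.0]
train_out = layer(inputs)  # some entries zeroed, the others doubled
eval_out = layer.eval()(inputs)
print(eval_out)  # [1.0, 2.0, 3.0, 4.0]
```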

Examples

>>> from transformers import AutoConfig, AutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedLM.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForMaskedLM

class transformers.TFAutoModelForMaskedLM

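( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).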

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForMaskedLM (ALBERT model)
- BertConfig configuration class: TFBertForMaskedLM (BERT model)
- CamembertConfig configuration class: TFCamembertForMaskedLM (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForMaskedLM (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForMaskedLM (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForMaskedLM (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: TFElectraForMaskedLM (ELECTRA model)
- EsmConfig configuration class: TFEsmForMaskedLM (ESM model)
- FlaubertConfig configuration class: TFFlaubertWithLMHeadModel (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForMaskedLM (Funnel Transformer model)
- LayoutLMConfig configuration class: TFLayoutLMForMaskedLM (LayoutLM model)
- LongformerConfig configuration class: TFLongformerForMaskedLM (Longformer model)
- MPNetConfig configuration class: TFMPNetForMaskedLM (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForMaskedLM (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForMaskedLM (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: TFRobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- TapasConfig configuration class: TFTapasForMaskedLM (TAPAS model)
- XLMConfig configuration class: TFXLMWithLMHeadModel (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForMaskedLM (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA is used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

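Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples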

>>> from transformers import AutoConfig, TFAutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMaskedLM.from_config(config)

from_pretrained

( *model_args **kwargs )

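Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, it can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs are passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been made).
- If a configuration is not provided, kwargs are first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.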
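
The cache_dir, force_download and local_files_only parameters together decide whether a file is read from disk or fetched again. The sketch below captures one reading of those rules. `fetch` and `fake_download` are hypothetical, the flat cache layout is an assumption, and letting local_files_only take precedence over force_download is a choice made for this illustration.

```python
import tempfile
from pathlib import Path


def fetch(filename, cache_dir, download, force_download=False, local_files_only=False):
    """Return the cached path of filename, downloading it only when allowed and needed."""
    cached = Path(cache_dir) / filename
    if local_files_only:
        if cached.is_file():
            return cached
        raise FileNotFoundError(f"{filename} is not cached and downloads are disabled")
    if cached.is_file() and not force_download:
        return cached  # cache hit: nothing is downloaded
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(download(filename))  # (re-)download and overwrite the cache
    return cached


calls = []


def fake_download(name):
    """Stand-in for a network request; records each call."""
    calls.append(name)
    return b"weights"


with tempfile.TemporaryDirectory() as tmp:
    fetch("model.bin", tmp, fake_download)  # first call downloads
    fetch("model.bin", tmp, fake_download)  # served from the cache
    fetch("model.bin", tmp, fake_download, force_download=True)  # downloads again
    fetch("model.bin", tmp, fake_download, local_files_only=True)  # cache only

print(len(calls))  # 2
```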

Instantiate one of the model classes of the library (with a masked language modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it is missing, by falling back to pattern matching on pretrained_model_name_or_path:

albert — TFAlbertForMaskedLM (ALBERT model)
bert — TFBertForMaskedLM (BERT model)
camembert — TFCamembertForMaskedLM (CamemBERT model)
convbert — TFConvBertForMaskedLM (ConvBERT model)
deberta — TFDebertaForMaskedLM (DeBERTa model)
deberta-v2 — TFDebertaV2ForMaskedLM (DeBERTa-v2 model)
distilbert — TFDistilBertForMaskedLM (DistilBERT model)
electra — TFElectraForMaskedLM (ELECTRA model)
esm — TFEsmForMaskedLM (ESM model)
flaubert — TFFlaubertWithLMHeadModel (FlauBERT model)
funnel — TFFunnelForMaskedLM (Funnel Transformer model)
layoutlm — TFLayoutLMForMaskedLM (LayoutLM model)
longformer — TFLongformerForMaskedLM (Longformer model)
mobilebert — TFMobileBertForMaskedLM (MobileBERT model)
mpnet — TFMPNetForMaskedLM (MPNet model)
rembert — TFRemBertForMaskedLM (RemBERT model)
roberta — TFRobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForMaskedLM (RoFormer model)
tapas — TFTapasForMaskedLM (TAPAS model)
xlm — TFXLMWithLMHeadModel (XLM model)
xlm-roberta — TFXLMRobertaForMaskedLM (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForMaskedLM

class transformers.FlaxAutoModelForMaskedLM

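( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).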

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForMaskedLM (ALBERT model)
- BartConfig configuration class: FlaxBartForConditionalGeneration (BART model)
- BertConfig configuration class: FlaxBertForMaskedLM (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForMaskedLM (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForMaskedLM (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForMaskedLM (ELECTRA model)
- MBartConfig configuration class: FlaxMBartForConditionalGeneration (mBART model)
- RoFormerConfig configuration class: FlaxRoFormerForMaskedLM (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForMaskedLM (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA is used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

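Instantiates one of the model classes of the library (with a masked language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples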

>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForMaskedLM.from_config(config)

from_pretrained

( *model_args **kwargs )

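Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named *config.json* is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.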
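
The second kwargs case (no config supplied) can be illustrated with a minimal, library-free sketch. ToyConfig and split_kwargs are hypothetical names invented for this illustration and are not part of Transformers; the sketch only mirrors the rule described above, in which keys matching a configuration attribute override it and the remaining keys are forwarded to the model.

```python
class ToyConfig:
    """Hypothetical stand-in for a configuration object."""

    def __init__(self, hidden_size=64, output_attentions=False):
        self.hidden_size = hidden_size
        self.output_attentions = output_attentions


def split_kwargs(config, kwargs):
    """Apply kwargs that name config attributes and return the leftovers."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # overrides the configuration attribute
        else:
            unused[key] = value  # would be forwarded to the model's __init__
    return unused


config = ToyConfig()
model_kwargs = split_kwargs(config, {"output_attentions": True, "my_flag": 1})
print(config.output_attentions)  # True
print(model_kwargs)  # {'my_flag': 1}
```

When a config object is passed explicitly, this splitting step is skipped and every keyword argument goes straight to the model, which is why configuration updates should be made on the config beforehand.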

Instantiates one of the model classes of the library (with a masked language modeling head) from a pretrained model.

albert — FlaxAlbertForMaskedLM (ALBERT model)
bart — FlaxBartForConditionalGeneration (BART model)
bert — FlaxBertForMaskedLM (BERT model)
big_bird — FlaxBigBirdForMaskedLM (BigBird model)
distilbert — FlaxDistilBertForMaskedLM (DistilBERT model)
electra — FlaxElectraForMaskedLM (ELECTRA model)
mbart — FlaxMBartForConditionalGeneration (mBART model)
roberta — FlaxRobertaForMaskedLM (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForMaskedLM (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForMaskedLM (RoFormer model)
xlm-roberta — FlaxXLMRobertaForMaskedLM (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForMaskedLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForMaskedLM.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMaskedLM.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForMaskGeneration

class transformers.AutoModelForMaskGeneration

( *args **kwargs )

TFAutoModelForMaskGeneration

class transformers.TFAutoModelForMaskGeneration

( *args **kwargs )

AutoModelForSeq2SeqLM

class transformers.AutoModelForSeq2SeqLM

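( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).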
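
The factory-only pattern described above can be sketched in a few lines. ToyAutoModel is a hypothetical class written for this illustration, and the exception type and message are assumptions rather than the library's exact behavior; the point is only that the constructor refuses to run while the class methods remain usable.

```python
class ToyAutoModel:
    """Hypothetical auto class: a dispatcher that is never instantiated itself."""

    def __init__(self, *args, **kwargs):
        # Assumption: the exception type and wording are illustrative only.
        raise OSError(
            f"{type(self).__name__} is meant to be created with "
            "from_pretrained() or from_config(), not with __init__()."
        )

    @classmethod
    def from_config(cls, config):
        # A real auto class would pick a concrete model class here;
        # this sketch just returns a dict describing the request.
        return {"built_from": config}


try:
    ToyAutoModel()
    raised = False
except OSError:
    raised = True

model = ToyAutoModel.from_config({"model_type": "t5"})
print(raised)  # True
print(model)  # {'built_from': {'model_type': 't5'}}
```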

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: BartForConditionalGeneration (BART model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForConditionalGeneration (BigBird-Pegasus model)
- BlenderbotConfig configuration class: BlenderbotForConditionalGeneration (Blenderbot model)
- BlenderbotSmallConfig configuration class: BlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- EncoderDecoderConfig configuration class: EncoderDecoderModel (Encoder decoder model)
- FSMTConfig configuration class: FSMTForConditionalGeneration (FairSeq Machine-Translation model)
- GPTSanJapaneseConfig configuration class: GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
- GraniteSpeechConfig configuration class: GraniteSpeechForConditionalGeneration (GraniteSpeech model)
- LEDConfig configuration class: LEDForConditionalGeneration (LED model)
- LongT5Config configuration class: LongT5ForConditionalGeneration (LongT5 model)
- M2M100Config configuration class: M2M100ForConditionalGeneration (M2M100 model)
- MBartConfig configuration class: MBartForConditionalGeneration (mBART model)
- MT5Config configuration class: MT5ForConditionalGeneration (MT5 model)
- MarianConfig configuration class: MarianMTModel (Marian model)
- MvpConfig configuration class: MvpForConditionalGeneration (MVP model)
- NllbMoeConfig configuration class: NllbMoeForConditionalGeneration (NLLB-MOE model)
- PLBartConfig configuration class: PLBartForConditionalGeneration (PLBart model)
- PegasusConfig configuration class: PegasusForConditionalGeneration (Pegasus model)
- PegasusXConfig configuration class: PegasusXForConditionalGeneration (PEGASUS-X model)
- ProphetNetConfig configuration class: ProphetNetForConditionalGeneration (ProphetNet model)
- Qwen2AudioConfig configuration class: Qwen2AudioForConditionalGeneration (Qwen2Audio model)
- SeamlessM4TConfig configuration class: SeamlessM4TForTextToText (SeamlessM4T model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2ForTextToText (SeamlessM4Tv2 model)
- SwitchTransformersConfig configuration class: SwitchTransformersForConditionalGeneration (SwitchTransformers model)
- T5Config configuration class: T5ForConditionalGeneration (T5 model)
- T5GemmaConfig configuration class: T5GemmaForConditionalGeneration (T5Gemma model)
- UMT5Config configuration class: UMT5ForConditionalGeneration (UMT5 model)
- XLMProphetNetConfig configuration class: XLMProphetNetForConditionalGeneration (XLM-ProphetNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
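
Selection by configuration class, as listed above, amounts to a lookup keyed on the type of the config object. The sketch below shows that idea with a plain dictionary; every class and the MODEL_MAPPING name are hypothetical stand-ins invented for the illustration, not the library's internals, and the toy models carry no weights at all.

```python
class ToyT5Config:
    d_model = 8


class ToyBartConfig:
    d_model = 16


class ToyT5Model:
    def __init__(self, config):
        self.config = config


class ToyBartModel:
    def __init__(self, config):
        self.config = config


# Registry keyed by configuration class, mirroring the list above.
MODEL_MAPPING = {ToyT5Config: ToyT5Model, ToyBartConfig: ToyBartModel}


def from_config(config):
    """Build the model class registered for type(config); no weights are loaded."""
    try:
        model_class = MODEL_MAPPING[type(config)]
    except KeyError:
        raise ValueError(
            f"Unrecognized configuration class {type(config).__name__}"
        ) from None
    return model_class(config)


model = from_config(ToyBartConfig())
print(type(model).__name__)  # ToyBartModel
```

Because only the configuration drives the construction, a model built this way starts from freshly initialized parameters, which matches the note above about from_config() not loading weights.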

Example

>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = AutoModelForSeq2SeqLM.from_config(config)

from_pretrained

( *model_args **kwargs )

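Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.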
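
To make the output_loading_info description more concrete, the sketch below shows how "missing" and "unexpected" keys can be derived by comparing parameter names. The loading_info helper and the dictionary key names are assumptions made for this illustration; the library computes this report internally and its exact structure may differ.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Compare the parameter names a model expects with those in a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(found - expected),
        # Left empty in this sketch.
        "error_msgs": [],
    }


info = loading_info(
    expected_keys=["encoder.weight", "decoder.weight", "lm_head.weight"],
    checkpoint_keys=["encoder.weight", "decoder.weight", "pooler.weight"],
)
print(info["missing_keys"])  # ['lm_head.weight']
print(info["unexpected_keys"])  # ['pooler.weight']
```

A non-empty list of missing keys usually means that part of the model was initialized from scratch, which is worth checking before relying on its predictions.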

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

bart — BartForConditionalGeneration (BART model)
bigbird_pegasus — BigBirdPegasusForConditionalGeneration (BigBird-Pegasus model)
blenderbot — BlenderbotForConditionalGeneration (Blenderbot model)
blenderbot-small — BlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
encoder-decoder — EncoderDecoderModel (Encoder decoder model)
fsmt — FSMTForConditionalGeneration (FairSeq Machine-Translation model)
gptsan-japanese — GPTSanJapaneseForConditionalGeneration (GPTSAN-japanese model)
granite_speech — GraniteSpeechForConditionalGeneration (GraniteSpeech model)
led — LEDForConditionalGeneration (LED model)
longt5 — LongT5ForConditionalGeneration (LongT5 model)
m2m_100 — M2M100ForConditionalGeneration (M2M100 model)
marian — MarianMTModel (Marian model)
mbart — MBartForConditionalGeneration (mBART model)
mt5 — MT5ForConditionalGeneration (MT5 model)
mvp — MvpForConditionalGeneration (MVP model)
nllb-moe — NllbMoeForConditionalGeneration (NLLB-MOE model)
pegasus — PegasusForConditionalGeneration (Pegasus model)
pegasus_x — PegasusXForConditionalGeneration (PEGASUS-X model)
plbart — PLBartForConditionalGeneration (PLBart model)
prophetnet — ProphetNetForConditionalGeneration (ProphetNet model)
qwen2_audio — Qwen2AudioForConditionalGeneration (Qwen2Audio model)
seamless_m4t — SeamlessM4TForTextToText (SeamlessM4T model)
seamless_m4t_v2 — SeamlessM4Tv2ForTextToText (SeamlessM4Tv2 model)
switch_transformers — SwitchTransformersForConditionalGeneration (SwitchTransformers model)
t5 — T5ForConditionalGeneration (T5 model)
t5gemma — T5GemmaForConditionalGeneration (T5Gemma model)
umt5 — UMT5ForConditionalGeneration (UMT5 model)
xlm-prophetnet — XLMProphetNetForConditionalGeneration (XLM-ProphetNet model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/t5_tf_model_config.json")
>>> model = AutoModelForSeq2SeqLM.from_pretrained(
...     "./tf_model/t5_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSeq2SeqLM

class transformers.TFAutoModelForSeq2SeqLM

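( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: TFBartForConditionalGeneration (BART model)
- BlenderbotConfig configuration class: TFBlenderbotForConditionalGeneration (Blenderbot model)
- BlenderbotSmallConfig configuration class: TFBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- EncoderDecoderConfig configuration class: TFEncoderDecoderModel (Encoder decoder model)
- LEDConfig configuration class: TFLEDForConditionalGeneration (LED model)
- MBartConfig configuration class: TFMBartForConditionalGeneration (mBART model)
- MT5Config configuration class: TFMT5ForConditionalGeneration (MT5 model)
- MarianConfig configuration class: TFMarianMTModel (Marian model)
- PegasusConfig configuration class: TFPegasusForConditionalGeneration (Pegasus model)
- T5Config configuration class: TFT5ForConditionalGeneration (T5 model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example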

>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = TFAutoModelForSeq2SeqLM.from_config(config)

from_pretrained

( *model_args **kwargs )

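Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.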
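
The three forms accepted by pretrained_model_name_or_path can be told apart with ordinary filesystem checks. The following is a rough sketch of that distinction, not the library's resolution logic: classify_source is a hypothetical helper, URLs are deliberately not handled, and anything that is not an existing local path is simply assumed to be a Hub model id.

```python
import os
import tempfile


def classify_source(name_or_path):
    """Return 'directory', 'file', or 'model_id' for a from_pretrained-style argument."""
    path = os.fspath(name_or_path)  # accepts both str and os.PathLike
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    # Assumption: anything that is not a local path is treated as a Hub model id.
    return "model_id"


with tempfile.TemporaryDirectory() as tmp:
    weights = os.path.join(tmp, "pytorch_model.bin")
    open(weights, "wb").close()  # empty placeholder file, not real weights
    kinds = [classify_source(tmp), classify_source(weights)]

kinds.append(classify_source("google-t5/t5-base"))
print(kinds)
```

In the single-file case the documentation above additionally requires from_pt=True and an explicit config, since a bare weights file carries no configuration of its own.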

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

bart — TFBartForConditionalGeneration (BART model)
blenderbot — TFBlenderbotForConditionalGeneration (Blenderbot model)
blenderbot-small — TFBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
encoder-decoder — TFEncoderDecoderModel (Encoder decoder model)
led — TFLEDForConditionalGeneration (LED model)
marian — TFMarianMTModel (Marian model)
mbart — TFMBartForConditionalGeneration (mBART model)
mt5 — TFMT5ForConditionalGeneration (MT5 model)
pegasus — TFPegasusForConditionalGeneration (Pegasus model)
t5 — TFT5ForConditionalGeneration (T5 model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForSeq2SeqLM

class transformers.FlaxAutoModelForSeq2SeqLM

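( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence language modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BartConfig configuration class: FlaxBartForConditionalGeneration (BART model)
- BlenderbotConfig configuration class: FlaxBlenderbotForConditionalGeneration (Blenderbot model)
- BlenderbotSmallConfig configuration class: FlaxBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
- EncoderDecoderConfig configuration class: FlaxEncoderDecoderModel (Encoder decoder model)
- LongT5Config configuration class: FlaxLongT5ForConditionalGeneration (LongT5 model)
- MBartConfig configuration class: FlaxMBartForConditionalGeneration (mBART model)
- MT5Config configuration class: FlaxMT5ForConditionalGeneration (MT5 model)
- MarianConfig configuration class: FlaxMarianMTModel (Marian model)
- PegasusConfig configuration class: FlaxPegasusForConditionalGeneration (Pegasus model)
- T5Config configuration class: FlaxT5ForConditionalGeneration (T5 model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example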

>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-t5/t5-base")
>>> model = FlaxAutoModelForSeq2SeqLM.from_config(config)

from_pretrained

( *model_args **kwargs )

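Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.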

Instantiates one of the model classes of the library (with a sequence-to-sequence language modeling head) from a pretrained model.

bart — FlaxBartForConditionalGeneration (BART model)
blenderbot — FlaxBlenderbotForConditionalGeneration (Blenderbot model)
blenderbot-small — FlaxBlenderbotSmallForConditionalGeneration (BlenderbotSmall model)
encoder-decoder — FlaxEncoderDecoderModel (Encoder decoder model)
longt5 — FlaxLongT5ForConditionalGeneration (LongT5 model)
marian — FlaxMarianMTModel (Marian model)
mbart — FlaxMBartForConditionalGeneration (mBART model)
mt5 — FlaxMT5ForConditionalGeneration (MT5 model)
pegasus — FlaxPegasusForConditionalGeneration (Pegasus model)
t5 — FlaxT5ForConditionalGeneration (T5 model)
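
The table above pairs a short key with a model class. Assuming those keys are the model_type values stored in a checkpoint's config.json (an assumption of this sketch, as are the MODEL_NAMES table and the resolve_class_name helper), the lookup for a local directory could look roughly like this:

```python
import json
import os
import tempfile

# Assumption: a small excerpt of the key-to-class-name table above.
MODEL_NAMES = {
    "bart": "FlaxBartForConditionalGeneration",
    "mt5": "FlaxMT5ForConditionalGeneration",
    "t5": "FlaxT5ForConditionalGeneration",
}


def resolve_class_name(model_dir):
    """Read config.json in model_dir and map its model_type to a class name."""
    with open(os.path.join(model_dir, "config.json"), encoding="utf-8") as f:
        model_type = json.load(f)["model_type"]
    if model_type not in MODEL_NAMES:
        raise ValueError(f"Unsupported model_type: {model_type!r}")
    return MODEL_NAMES[model_type]


with tempfile.TemporaryDirectory() as tmp:
    # Write a minimal, made-up config.json for the demonstration.
    with open(os.path.join(tmp, "config.json"), "w", encoding="utf-8") as f:
        json.dump({"model_type": "t5"}, f)
    resolved = resolve_class_name(tmp)

print(resolved)  # FlaxT5ForConditionalGeneration
```

The sketch returns only the class name as a string; it does not import or instantiate any model.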

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForSeq2SeqLM

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/t5_pt_model_config.json")
>>> model = FlaxAutoModelForSeq2SeqLM.from_pretrained(
...     "./pt_model/t5_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForSequenceClassification

class transformers.AutoModelForSequenceClassification

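( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).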

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The configuration used to instantiate the model class. The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForSequenceClassification (ALBERT model)
- ArceeConfig configuration class: ArceeForSequenceClassification (Arcee model)
- BartConfig configuration class: BartForSequenceClassification (BART model)
- BertConfig configuration class: BertForSequenceClassification (BERT model)
- BigBirdConfig configuration class: BigBirdForSequenceClassification (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForSequenceClassification (BigBird-Pegasus model)
- BioGptConfig configuration class: BioGptForSequenceClassification (BioGpt model)
- BloomConfig configuration class: BloomForSequenceClassification (BLOOM model)
- CTRLConfig configuration class: CTRLForSequenceClassification (CTRL model)
- CamembertConfig configuration class: CamembertForSequenceClassification (CamemBERT model)
- CanineConfig configuration class: CanineForSequenceClassification (CANINE model)
- ConvBertConfig configuration class: ConvBertForSequenceClassification (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForSequenceClassification (Data2VecText model)
- DebertaConfig configuration class: DebertaForSequenceClassification (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForSequenceClassification (DeBERTa-v2 model)
- DiffLlamaConfig configuration class: DiffLlamaForSequenceClassification (DiffLlama model)
- DistilBertConfig configuration class: DistilBertForSequenceClassification (DistilBERT model)
- ElectraConfig configuration class: ElectraForSequenceClassification (ELECTRA model)
- ErnieConfig configuration class: ErnieForSequenceClassification (ERNIE model)
- ErnieMConfig configuration class: ErnieMForSequenceClassification (ErnieM model)
- EsmConfig configuration class: EsmForSequenceClassification (ESM model)
- FNetConfig configuration class: FNetForSequenceClassification (FNet model)
- FalconConfig configuration class: FalconForSequenceClassification (Falcon model)
- FlaubertConfig configuration class: FlaubertForSequenceClassification (FlauBERT model)
- FunnelConfig configuration class: FunnelForSequenceClassification (Funnel Transformer model)
- GPT2Config configuration class: GPT2ForSequenceClassification (OpenAI GPT-2 model)
- GPTBigCodeConfig configuration class: GPTBigCodeForSequenceClassification (GPTBigCode model)
- GPTJConfig configuration class: GPTJForSequenceClassification (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoForSequenceClassification (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXForSequenceClassification (GPT NeoX model)
- Gemma2Config configuration class: Gemma2ForSequenceClassification (Gemma2 model)
- GemmaConfig configuration class: GemmaForSequenceClassification (Gemma model)
- Glm4Config configuration class: Glm4ForSequenceClassification (GLM4 model)
- GlmConfig configuration class: GlmForSequenceClassification (GLM model)
- HeliumConfig configuration class: HeliumForSequenceClassification (Helium model)
- IBertConfig configuration class: IBertForSequenceClassification (I-BERT model)
- JambaConfig configuration class: JambaForSequenceClassification (Jamba model)
- JetMoeConfig configuration class: JetMoeForSequenceClassification (JetMoe model)
- LEDConfig configuration class: LEDForSequenceClassification (LED model)
- LayoutLMConfig configuration class: LayoutLMForSequenceClassification (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2ForSequenceClassification (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForSequenceClassification (LayoutLMv3 model)
- LiltConfig configuration class: LiltForSequenceClassification (LiLT model)
- LlamaConfig configuration class: LlamaForSequenceClassification (LLaMA model)
- LongformerConfig configuration class: LongformerForSequenceClassification (Longformer model)
- LukeConfig configuration class: LukeForSequenceClassification (LUKE model)
- MBartConfig configuration class: MBartForSequenceClassification (mBART model)
- MPNetConfig configuration class: MPNetForSequenceClassification (MPNet model)
- MT5Config configuration class: MT5ForSequenceClassification (MT5 model)
- MarkupLMConfig configuration class: MarkupLMForSequenceClassification (MarkupLM model)
- MegaConfig configuration class: MegaForSequenceClassification (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForSequenceClassification (Megatron-BERT model)
- MiniMaxConfig configuration class: MiniMaxForSequenceClassification (MiniMax model)
- MistralConfig configuration class: MistralForSequenceClassification (Mistral model)
- MixtralConfig configuration class: MixtralForSequenceClassification (Mixtral model)
- MobileBertConfig configuration class: MobileBertForSequenceClassification (MobileBERT model)
- ModernBertConfig configuration class: ModernBertForSequenceClassification (ModernBERT model)
- MptConfig configuration class: MptForSequenceClassification (MPT model)
- MraConfig configuration class: MraForSequenceClassification (MRA model)
- MvpConfig configuration class: MvpForSequenceClassification (MVP model)
- NemotronConfig configuration class: NemotronForSequenceClassification (Nemotron model)
- NezhaConfig configuration class: NezhaForSequenceClassification (Nezha model)
- NystromformerConfig configuration class: NystromformerForSequenceClassification (Nyströmformer model)
- OPTConfig configuration class: OPTForSequenceClassification (OPT model)
- OpenAIGPTConfig configuration class: OpenAIGPTForSequenceClassification (OpenAI GPT model)
- OpenLlamaConfig configuration class: OpenLlamaForSequenceClassification (OpenLlama model)
- PLBartConfig configuration class: PLBartForSequenceClassification (PLBart model)
- PerceiverConfig configuration class: PerceiverForSequenceClassification (Perceiver model)
- PersimmonConfig configuration class: PersimmonForSequenceClassification (Persimmon model)
- Phi3Config configuration class: Phi3ForSequenceClassification (Phi3 model)
- PhiConfig configuration class: PhiForSequenceClassification (Phi model)
- PhimoeConfig configuration class: PhimoeForSequenceClassification (Phimoe model)
- QDQBertConfig configuration class: QDQBertForSequenceClassification (QDQBert model)
- Qwen2Config configuration class: Qwen2ForSequenceClassification (Qwen2 model)
- Qwen2MoeConfig configuration class: Qwen2MoeForSequenceClassification (Qwen2MoE model)
- Qwen3Config configuration class: Qwen3ForSequenceClassification (Qwen3 model)
- Qwen3MoeConfig configuration class: Qwen3MoeForSequenceClassification (Qwen3MoE model)
- ReformerConfig configuration class: ReformerForSequenceClassification (Reformer model)
- RemBertConfig configuration class: RemBertForSequenceClassification (RemBERT model)
- RoCBertConfig configuration class: RoCBertForSequenceClassification (RoCBert model)
- RoFormerConfig configuration class: RoFormerForSequenceClassification (RoFormer model)
- RobertaConfig configuration class: RobertaForSequenceClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
- SmolLM3Config configuration class: SmolLM3ForSequenceClassification (SmolLM3 model)
- SqueezeBertConfig configuration class: SqueezeBertForSequenceClassification (SqueezeBERT model)
- StableLmConfig configuration class: StableLmForSequenceClassification (StableLm model)
- Starcoder2Config configuration class: Starcoder2ForSequenceClassification (Starcoder2 model)
- T5Config configuration class: T5ForSequenceClassification (T5 model)
- T5GemmaConfig configuration class: T5GemmaForSequenceClassification (T5Gemma model)
- TapasConfig configuration class: TapasForSequenceClassification (TAPAS model)
- TransfoXLConfig configuration class: TransfoXLForSequenceClassification (Transformer-XL model)
- UMT5Config configuration class: UMT5ForSequenceClassification (UMT5 model)
- XLMConfig configuration class: XLMForSequenceClassification (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForSequenceClassification (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForSequenceClassification (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForSequenceClassification (XLNet model)
- XmodConfig configuration class: XmodForSequenceClassification (X-MOD model)
- YosoConfig configuration class: YosoForSequenceClassification (YOSO model)
- Zamba2Config configuration class: Zamba2ForSequenceClassification (Zamba2 model)
- ZambaConfig configuration class: ZambaForSequenceClassification (Zamba model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise, the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example

>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForSequenceClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

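Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the `pretrained_model_name_or_path` argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to `True` for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id, since we use a git-based system for storing models and other artifacts on huggingface.co, so `revision` can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., `output_attentions=True`). Behaves differently depending on whether a `config` is provided or automatically loaded:
- If a configuration is provided with `config`, `**kwargs` will be passed directly to the underlying model's `__init__` method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, `kwargs` will first be passed to the configuration class initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute will be used to override that attribute with the supplied `kwargs` value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's `__init__` function.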

Instantiates one of the model classes of the library (with a sequence classification head) from a pretrained model.

albert — AlbertForSequenceClassification (ALBERT model)
arcee — ArceeForSequenceClassification (Arcee model)
bart — BartForSequenceClassification (BART model)
bert — BertForSequenceClassification (BERT model)
big_bird — BigBirdForSequenceClassification (BigBird model)
bigbird_pegasus — BigBirdPegasusForSequenceClassification (BigBird-Pegasus model)
biogpt — BioGptForSequenceClassification (BioGpt model)
bloom — BloomForSequenceClassification (BLOOM model)
camembert — CamembertForSequenceClassification (CamemBERT model)
canine — CanineForSequenceClassification (CANINE model)
code_llama — LlamaForSequenceClassification (CodeLlama model)
convbert — ConvBertForSequenceClassification (ConvBERT model)
ctrl — CTRLForSequenceClassification (CTRL model)
data2vec-text — Data2VecTextForSequenceClassification (Data2VecText model)
deberta — DebertaForSequenceClassification (DeBERTa model)
deberta-v2 — DebertaV2ForSequenceClassification (DeBERTa-v2 model)
diffllama — DiffLlamaForSequenceClassification (DiffLlama model)
distilbert — DistilBertForSequenceClassification (DistilBERT model)
electra — ElectraForSequenceClassification (ELECTRA model)
ernie — ErnieForSequenceClassification (ERNIE model)
ernie_m — ErnieMForSequenceClassification (ErnieM model)
esm — EsmForSequenceClassification (ESM model)
falcon — FalconForSequenceClassification (Falcon model)
flaubert — FlaubertForSequenceClassification (FlauBERT model)
fnet — FNetForSequenceClassification (FNet model)
funnel — FunnelForSequenceClassification (Funnel Transformer model)
gemma — GemmaForSequenceClassification (Gemma model)
gemma2 — Gemma2ForSequenceClassification (Gemma2 model)
glm — GlmForSequenceClassification (GLM model)
glm4 — Glm4ForSequenceClassification (GLM4 model)
gpt-sw3 — GPT2ForSequenceClassification (GPT-Sw3 model)
gpt2 — GPT2ForSequenceClassification (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeForSequenceClassification (GPTBigCode model)
gpt_neo — GPTNeoForSequenceClassification (GPT Neo model)
gpt_neox — GPTNeoXForSequenceClassification (GPT NeoX model)
gptj — GPTJForSequenceClassification (GPT-J model)
helium — HeliumForSequenceClassification (Helium model)
ibert — IBertForSequenceClassification (I-BERT model)
jamba — JambaForSequenceClassification (Jamba model)
jetmoe — JetMoeForSequenceClassification (JetMoe model)
layoutlm — LayoutLMForSequenceClassification (LayoutLM model)
layoutlmv2 — LayoutLMv2ForSequenceClassification (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3ForSequenceClassification (LayoutLMv3 model)
led — LEDForSequenceClassification (LED model)
lilt — LiltForSequenceClassification (LiLT model)
llama — LlamaForSequenceClassification (LLaMA model)
longformer — LongformerForSequenceClassification (Longformer model)
luke — LukeForSequenceClassification (LUKE model)
markuplm — MarkupLMForSequenceClassification (MarkupLM model)
mbart — MBartForSequenceClassification (mBART model)
mega — MegaForSequenceClassification (MEGA model)
megatron-bert — MegatronBertForSequenceClassification (Megatron-BERT model)
minimax — MiniMaxForSequenceClassification (MiniMax model)
mistral — MistralForSequenceClassification (Mistral model)
mixtral — MixtralForSequenceClassification (Mixtral model)
mobilebert — MobileBertForSequenceClassification (MobileBERT model)
modernbert — ModernBertForSequenceClassification (ModernBERT model)
mpnet — MPNetForSequenceClassification (MPNet model)
mpt — MptForSequenceClassification (MPT model)
mra — MraForSequenceClassification (MRA model)
mt5 — MT5ForSequenceClassification (MT5 model)
mvp — MvpForSequenceClassification (MVP model)
nemotron — NemotronForSequenceClassification (Nemotron model)
nezha — NezhaForSequenceClassification (Nezha model)
nystromformer — NystromformerForSequenceClassification (Nyströmformer model)
open-llama — OpenLlamaForSequenceClassification (OpenLlama model)
openai-gpt — OpenAIGPTForSequenceClassification (OpenAI GPT model)
opt — OPTForSequenceClassification (OPT model)
perceiver — PerceiverForSequenceClassification (Perceiver model)
persimmon — PersimmonForSequenceClassification (Persimmon model)
phi — PhiForSequenceClassification (Phi model)
phi3 — Phi3ForSequenceClassification (Phi3 model)
phimoe — PhimoeForSequenceClassification (Phimoe model)
plbart — PLBartForSequenceClassification (PLBart model)
qdqbert — QDQBertForSequenceClassification (QDQBert model)
qwen2 — Qwen2ForSequenceClassification (Qwen2 model)
qwen2_moe — Qwen2MoeForSequenceClassification (Qwen2MoE model)
qwen3 — Qwen3ForSequenceClassification (Qwen3 model)
qwen3_moe — Qwen3MoeForSequenceClassification (Qwen3MoE model)
reformer — ReformerForSequenceClassification (Reformer model)
rembert — RemBertForSequenceClassification (RemBERT model)
roberta — RobertaForSequenceClassification (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForSequenceClassification (RoCBert model)
roformer — RoFormerForSequenceClassification (RoFormer model)
smollm3 — SmolLM3ForSequenceClassification (SmolLM3 model)
squeezebert — SqueezeBertForSequenceClassification (SqueezeBERT model)
stablelm — StableLmForSequenceClassification (StableLm model)
starcoder2 — Starcoder2ForSequenceClassification (Starcoder2 model)
t5 — T5ForSequenceClassification (T5 model)
t5gemma — T5GemmaForSequenceClassification (T5Gemma model)
tapas — TapasForSequenceClassification (TAPAS model)
transfo-xl — TransfoXLForSequenceClassification (Transformer-XL 模型)
umt5 — UMT5ForSequenceClassification (UMT5 模型)
xlm — XLMForSequenceClassification (XLM 模型)
xlm-roberta — XLMRobertaForSequenceClassification (XLM-RoBERTa 模型)
xlm-roberta-xl — XLMRobertaXLForSequenceClassification (XLM-RoBERTa-XL 模型)
xlnet — XLNetForSequenceClassification (XLNet 模型)
xmod — XmodForSequenceClassification (X-MOD 模型)
yoso — YosoForSequenceClassification (YOSO 模型)
zamba — ZambaForSequenceClassification (Zamba 模型)
zamba2 — Zamba2ForSequenceClassification (Zamba2 模型)

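By default, the model is set to evaluation mode with model.eval() (for example, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().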
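To make the evaluation/training distinction concrete, here is a minimal pure-Python sketch of a dropout-like layer. `ToyDropout` is a hypothetical stand-in written for this illustration only; it is not a PyTorch module and not how the library implements dropout.

```python
import random


class ToyDropout:
    """Toy stand-in for a dropout layer (hypothetical, not a PyTorch module)."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = False  # mimics a model returned in evaluation mode
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: dropout is a no-op, so outputs are deterministic.
            return list(values)
        # Training mode: zero each value with probability p, rescale survivors.
        scale = 1.0 / (1.0 - self.p)
        return [0.0 if self._rng.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5)
inputs = [1.0, 2.0, 3.0, 4.0]
eval_out = layer(inputs)           # identical to the inputs
train_out = layer.train()(inputs)  # each entry is either zeroed or doubled
```

The point of the sketch is only that the same layer behaves differently depending on the mode flag, which is why a model meant for fine-tuning has to be switched back to training mode first.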

Examples

>>> from transformers import AutoConfig, AutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSequenceClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSequenceClassification

class transformers.TFAutoModelForSequenceClassification

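( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).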
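The factory-only pattern described above can be sketched in a few lines. `ToyAutoModel` is a hypothetical class written for illustration, and the choice of `OSError` and the wording of the message are assumptions of this sketch rather than a statement about the library's internals.

```python
class ToyAutoModel:
    """Hypothetical stand-in for an auto class; not the library's code."""

    def __init__(self, *args, **kwargs):
        # Direct construction is rejected: callers must go through a factory.
        raise OSError(
            f"{self.__class__.__name__} is meant to be created with "
            "from_pretrained(...) or from_config(...), not with __init__()."
        )

    @classmethod
    def from_config(cls, config):
        # A real factory would return an instance of a concrete model class;
        # a plain dict is enough to show that cls itself is never instantiated.
        return {"built_from": config}


try:
    ToyAutoModel()
    error_message = None
except OSError as err:
    error_message = str(err)

model = ToyAutoModel.from_config({"model_type": "bert"})
```

Here `error_message` holds the explanation raised by the constructor, while `from_config` succeeds because it never calls `__init__()` on the auto class.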

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The configuration used to instantiate the model class. The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForSequenceClassification (ALBERT model)
- BartConfig configuration class: TFBartForSequenceClassification (BART model)
- BertConfig configuration class: TFBertForSequenceClassification (BERT model)
- CTRLConfig configuration class: TFCTRLForSequenceClassification (CTRL model)
- CamembertConfig configuration class: TFCamembertForSequenceClassification (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForSequenceClassification (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForSequenceClassification (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForSequenceClassification (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForSequenceClassification (DistilBERT model)
- ElectraConfig configuration class: TFElectraForSequenceClassification (ELECTRA model)
- EsmConfig configuration class: TFEsmForSequenceClassification (ESM model)
- FlaubertConfig configuration class: TFFlaubertForSequenceClassification (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForSequenceClassification (Funnel Transformer model)
- GPT2Config configuration class: TFGPT2ForSequenceClassification (OpenAI GPT-2 model)
- GPTJConfig configuration class: TFGPTJForSequenceClassification (GPT-J model)
- LayoutLMConfig configuration class: TFLayoutLMForSequenceClassification (LayoutLM model)
- LayoutLMv3Config configuration class: TFLayoutLMv3ForSequenceClassification (LayoutLMv3 model)
- LongformerConfig configuration class: TFLongformerForSequenceClassification (Longformer model)
- MPNetConfig configuration class: TFMPNetForSequenceClassification (MPNet model)
- MistralConfig configuration class: TFMistralForSequenceClassification (Mistral model)
- MobileBertConfig configuration class: TFMobileBertForSequenceClassification (MobileBERT model)
- OpenAIGPTConfig configuration class: TFOpenAIGPTForSequenceClassification (OpenAI GPT model)
- RemBertConfig configuration class: TFRemBertForSequenceClassification (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForSequenceClassification (RoFormer model)
- RobertaConfig configuration class: TFRobertaForSequenceClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
- TapasConfig configuration class: TFTapasForSequenceClassification (TAPAS model)
- TransfoXLConfig configuration class: TFTransfoXLForSequenceClassification (Transformer-XL model)
- XLMConfig configuration class: TFXLMForSequenceClassification (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForSequenceClassification (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.
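As a rough illustration of selecting a model class from the configuration class, and of why no pretrained weights are involved, here is a minimal sketch. Every name in it (`ToyBertConfig`, `ToyGPT2Config`, the toy model classes, `TOY_MODEL_MAPPING`, `toy_from_config`) is hypothetical and exists only for this example; it is not the library's implementation.

```python
class ToyBertConfig:
    def __init__(self, hidden_size=4, num_labels=2):
        self.hidden_size = hidden_size
        self.num_labels = num_labels


class ToyGPT2Config:
    def __init__(self, hidden_size=4, num_labels=2):
        self.hidden_size = hidden_size
        self.num_labels = num_labels


class ToyModel:
    def __init__(self, config):
        self.config = config
        # Placeholder weights created from the config shape alone:
        # nothing is read from a checkpoint at this point.
        self.classifier = [[0.0] * config.hidden_size for _ in range(config.num_labels)]


class ToyBertForSequenceClassification(ToyModel):
    pass


class ToyGPT2ForSequenceClassification(ToyModel):
    pass


# Lookup keyed by configuration class, mirroring the list above in spirit.
TOY_MODEL_MAPPING = {
    ToyBertConfig: ToyBertForSequenceClassification,
    ToyGPT2Config: ToyGPT2ForSequenceClassification,
}


def toy_from_config(config):
    """Pick the model class registered for this configuration class."""
    try:
        model_class = TOY_MODEL_MAPPING[type(config)]
    except KeyError:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}") from None
    return model_class(config)


model = toy_from_config(ToyBertConfig(hidden_size=3, num_labels=5))
```

The returned object is a `ToyBertForSequenceClassification` whose classifier has the shape dictated by the configuration, with placeholder values standing in for untrained weights.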

Examples

>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForSequenceClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

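Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.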
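The two kwargs behaviors described above can be sketched as follows. `ToyConfig`, `split_kwargs` and `toy_from_pretrained` are hypothetical helpers invented for this illustration, and the sketch returns the leftover keyword arguments instead of building a model.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self, output_attentions=False, num_labels=2):
        self.output_attentions = output_attentions
        self.num_labels = num_labels


def split_kwargs(config, kwargs):
    """Override matching config attributes and return the leftover kwargs."""
    remaining = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)  # key names a config attribute
        else:
            remaining[key] = value       # forwarded to the model's __init__
    return remaining


def toy_from_pretrained(config=None, **kwargs):
    if config is not None:
        # Config supplied by the caller: kwargs go straight to the model.
        return config, dict(kwargs)
    config = ToyConfig()  # stands in for the automatically loaded configuration
    return config, split_kwargs(config, kwargs)


# No config supplied: output_attentions updates the config, foo is left over.
auto_config, auto_model_kwargs = toy_from_pretrained(output_attentions=True, foo=1)

# Config supplied: it is left untouched and every kwarg is forwarded.
my_config = ToyConfig()
same_config, forwarded = toy_from_pretrained(config=my_config, output_attentions=True)
```

In the first call the configuration ends up with `output_attentions` set to `True`; in the second call the supplied configuration keeps its original value because the keyword arguments bypass it.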

Instantiates one of the model classes of the library (with a sequence classification head) from a pretrained model.

albert — TFAlbertForSequenceClassification (ALBERT model)
bart — TFBartForSequenceClassification (BART model)
bert — TFBertForSequenceClassification (BERT model)
camembert — TFCamembertForSequenceClassification (CamemBERT model)
convbert — TFConvBertForSequenceClassification (ConvBERT model)
ctrl — TFCTRLForSequenceClassification (CTRL model)
deberta — TFDebertaForSequenceClassification (DeBERTa model)
deberta-v2 — TFDebertaV2ForSequenceClassification (DeBERTa-v2 model)
distilbert — TFDistilBertForSequenceClassification (DistilBERT model)
electra — TFElectraForSequenceClassification (ELECTRA model)
esm — TFEsmForSequenceClassification (ESM model)
flaubert — TFFlaubertForSequenceClassification (FlauBERT model)
funnel — TFFunnelForSequenceClassification (Funnel Transformer model)
gpt-sw3 — TFGPT2ForSequenceClassification (GPT-Sw3 model)
gpt2 — TFGPT2ForSequenceClassification (OpenAI GPT-2 model)
gptj — TFGPTJForSequenceClassification (GPT-J model)
layoutlm — TFLayoutLMForSequenceClassification (LayoutLM model)
layoutlmv3 — TFLayoutLMv3ForSequenceClassification (LayoutLMv3 model)
longformer — TFLongformerForSequenceClassification (Longformer model)
mistral — TFMistralForSequenceClassification (Mistral model)
mobilebert — TFMobileBertForSequenceClassification (MobileBERT model)
mpnet — TFMPNetForSequenceClassification (MPNet model)
openai-gpt — TFOpenAIGPTForSequenceClassification (OpenAI GPT model)
rembert — TFRemBertForSequenceClassification (RemBERT model)
roberta — TFRobertaForSequenceClassification (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForSequenceClassification (RoFormer model)
tapas — TFTapasForSequenceClassification (TAPAS model)
transfo-xl — TFTransfoXLForSequenceClassification (Transformer-XL model)
xlm — TFXLMForSequenceClassification (XLM model)
xlm-roberta — TFXLMRobertaForSequenceClassification (XLM-RoBERTa model)
xlnet — TFXLNetForSequenceClassification (XLNet model)
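The entries above read as a lookup from a model type key to a class name. A minimal sketch of such a lookup, using a handful of entries copied from the list, might look like this; the dictionary name and the `resolve_class_name` helper are hypothetical and not part of the library.

```python
# A few entries copied from the list above; the real table is longer.
TOY_TF_SEQ_CLS_NAMES = {
    "albert": "TFAlbertForSequenceClassification",
    "bert": "TFBertForSequenceClassification",
    "gpt-sw3": "TFGPT2ForSequenceClassification",
    "gpt2": "TFGPT2ForSequenceClassification",
    "xlnet": "TFXLNetForSequenceClassification",
}


def resolve_class_name(model_type):
    """Return the class name registered for a model type key."""
    try:
        return TOY_TF_SEQ_CLS_NAMES[model_type]
    except KeyError:
        known = ", ".join(sorted(TOY_TF_SEQ_CLS_NAMES))
        raise ValueError(
            f"Unrecognized model type {model_type!r}; known types: {known}"
        ) from None


bert_class_name = resolve_class_name("bert")
sw3_class_name = resolve_class_name("gpt-sw3")
```

Note that two keys may share one class: in the list, both gpt-sw3 and gpt2 resolve to TFGPT2ForSequenceClassification.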

Examples

>>> from transformers import AutoConfig, TFAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForSequenceClassification

class transformers.FlaxAutoModelForSequenceClassification

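( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).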

from_config

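( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForSequenceClassification (ALBERT model)
- BartConfig configuration class: FlaxBartForSequenceClassification (BART model)
- BertConfig configuration class: FlaxBertForSequenceClassification (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForSequenceClassification (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForSequenceClassification (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForSequenceClassification (ELECTRA model)
- MBartConfig configuration class: FlaxMBartForSequenceClassification (mBART model)
- RoFormerConfig configuration class: FlaxRoFormerForSequenceClassification (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForSequenceClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForSequenceClassification (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples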

>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForSequenceClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

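Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.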
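To give a feel for what "missing keys" and "unexpected keys" mean in the output_loading_info description above, here is a minimal sketch based on set differences between the parameter names a model expects and the names found in a checkpoint. The `toy_loading_info` helper, the dictionary key names and the parameter names are assumptions made for this illustration, not the library's exact output.

```python
def toy_loading_info(expected_keys, checkpoint_keys):
    """Compare expected parameter names with those present in a checkpoint."""
    expected = set(expected_keys)
    found = set(checkpoint_keys)
    return {
        # Parameters the model needs but the checkpoint does not provide.
        "missing_keys": sorted(expected - found),
        # Parameters in the checkpoint that the model has no place for.
        "unexpected_keys": sorted(found - expected),
        # Left empty here; a real loader could report shape errors and similar.
        "error_msgs": [],
    }


# Hypothetical names: an encoder checkpoint loaded into a classification model.
model_parameters = ["encoder.weight", "encoder.bias", "classifier.weight", "classifier.bias"]
checkpoint_parameters = ["encoder.weight", "encoder.bias", "lm_head.weight"]

info = toy_loading_info(model_parameters, checkpoint_parameters)
```

In this toy case the classification head appears under missing keys (it would be newly initialized) and the language modeling head appears under unexpected keys (it would be dropped).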

Instantiates one of the model classes of the library (with a sequence classification head) from a pretrained model.

albert — FlaxAlbertForSequenceClassification (ALBERT model)
bart — FlaxBartForSequenceClassification (BART model)
bert — FlaxBertForSequenceClassification (BERT model)
big_bird — FlaxBigBirdForSequenceClassification (BigBird model)
distilbert — FlaxDistilBertForSequenceClassification (DistilBERT model)
electra — FlaxElectraForSequenceClassification (ELECTRA model)
mbart — FlaxMBartForSequenceClassification (mBART model)
roberta — FlaxRobertaForSequenceClassification (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForSequenceClassification (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForSequenceClassification (RoFormer model)
xlm-roberta — FlaxXLMRobertaForSequenceClassification (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForSequenceClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSequenceClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForMultipleChoice

class transformers.AutoModelForMultipleChoice

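( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).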
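As background on what a multiple choice head typically consumes: each example carries several candidate sequences, the candidates are flattened into one batch, each is scored, and the scores are regrouped per example so the best choice can be picked. The pure-Python sketch below illustrates only that reshaping; the token ids and the `toy_score` function are made up, and no real model is involved.

```python
def flatten_choices(batch):
    """(batch_size, num_choices, seq_len) -> (batch_size * num_choices, seq_len)."""
    return [choice for example in batch for choice in example]


def regroup_scores(flat_scores, num_choices):
    """(batch_size * num_choices,) -> (batch_size, num_choices)."""
    return [flat_scores[i:i + num_choices] for i in range(0, len(flat_scores), num_choices)]


def argmax(row):
    return max(range(len(row)), key=row.__getitem__)


def toy_score(token_ids):
    # Hypothetical scorer: a real model would produce one logit per sequence.
    return float(token_ids[2])


# Two examples, three candidate sequences each, four made-up token ids per sequence.
batch = [
    [[101, 7, 8, 102], [101, 7, 9, 102], [101, 7, 5, 102]],
    [[101, 3, 9, 102], [101, 3, 6, 102], [101, 3, 2, 102]],
]

flat = flatten_choices(batch)
scores = regroup_scores([toy_score(ids) for ids in flat], num_choices=3)
predictions = [argmax(row) for row in scores]
```

After regrouping, `scores` has one row per example and one column per choice, and `predictions` holds the index of the highest-scoring choice for each example.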

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForMultipleChoice (ALBERT model)
- BertConfig configuration class: BertForMultipleChoice (BERT model)
- BigBirdConfig configuration class: BigBirdForMultipleChoice (BigBird model)
- CamembertConfig configuration class: CamembertForMultipleChoice (CamemBERT model)
- CanineConfig configuration class: CanineForMultipleChoice (CANINE model)
- ConvBertConfig configuration class: ConvBertForMultipleChoice (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForMultipleChoice (Data2VecText model)
- DebertaV2Config configuration class: DebertaV2ForMultipleChoice (DeBERTa-v2 model)
- DistilBertConfig configuration class: DistilBertForMultipleChoice (DistilBERT model)
- ElectraConfig configuration class: ElectraForMultipleChoice (ELECTRA model)
- ErnieConfig configuration class: ErnieForMultipleChoice (ERNIE model)
- ErnieMConfig configuration class: ErnieMForMultipleChoice (ErnieM model)
- FNetConfig configuration class: FNetForMultipleChoice (FNet model)
- FlaubertConfig configuration class: FlaubertForMultipleChoice (FlauBERT model)
- FunnelConfig configuration class: FunnelForMultipleChoice (Funnel Transformer model)
- IBertConfig configuration class: IBertForMultipleChoice (I-BERT model)
- LongformerConfig configuration class: LongformerForMultipleChoice (Longformer model)
- LukeConfig configuration class: LukeForMultipleChoice (LUKE model)
- MPNetConfig configuration class: MPNetForMultipleChoice (MPNet model)
- MegaConfig configuration class: MegaForMultipleChoice (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForMultipleChoice (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForMultipleChoice (MobileBERT model)
- MraConfig configuration class: MraForMultipleChoice (MRA model)
- NezhaConfig configuration class: NezhaForMultipleChoice (Nezha model)
- NystromformerConfig configuration class: NystromformerForMultipleChoice (Nyströmformer model)
- QDQBertConfig configuration class: QDQBertForMultipleChoice (QDQBert model)
- RemBertConfig configuration class: RemBertForMultipleChoice (RemBERT model)
- RoCBertConfig configuration class: RoCBertForMultipleChoice (RoCBert model)
- RoFormerConfig configuration class: RoFormerForMultipleChoice (RoFormer model)
- RobertaConfig configuration class: RobertaForMultipleChoice (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
- SqueezeBertConfig configuration class: SqueezeBertForMultipleChoice (SqueezeBERT model)
- XLMConfig configuration class: XLMForMultipleChoice (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForMultipleChoice (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForMultipleChoice (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForMultipleChoice (XLNet model)
- XmodConfig configuration class: XmodForMultipleChoice (X-MOD model)
- YosoConfig configuration class: YosoForMultipleChoice (YOSO model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples

>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForMultipleChoice.from_config(config)

from_pretrained

( *model_args **kwargs )

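Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.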

Instantiates one of the model classes of the library (with a multiple choice head) from a pretrained model.

albert — AlbertForMultipleChoice (ALBERT model)
bert — BertForMultipleChoice (BERT model)
big_bird — BigBirdForMultipleChoice (BigBird model)
camembert — CamembertForMultipleChoice (CamemBERT model)
canine — CanineForMultipleChoice (CANINE model)
convbert — ConvBertForMultipleChoice (ConvBERT model)
data2vec-text — Data2VecTextForMultipleChoice (Data2VecText model)
deberta-v2 — DebertaV2ForMultipleChoice (DeBERTa-v2 model)
distilbert — DistilBertForMultipleChoice (DistilBERT model)
electra — ElectraForMultipleChoice (ELECTRA model)
ernie — ErnieForMultipleChoice (ERNIE model)
ernie_m — ErnieMForMultipleChoice (ErnieM model)
flaubert — FlaubertForMultipleChoice (FlauBERT model)
fnet — FNetForMultipleChoice (FNet model)
funnel — FunnelForMultipleChoice (Funnel Transformer model)
ibert — IBertForMultipleChoice (I-BERT model)
longformer — LongformerForMultipleChoice (Longformer model)
luke — LukeForMultipleChoice (LUKE model)
mega — MegaForMultipleChoice (MEGA model)
megatron-bert — MegatronBertForMultipleChoice (Megatron-BERT model)
mobilebert — MobileBertForMultipleChoice (MobileBERT model)
mpnet — MPNetForMultipleChoice (MPNet model)
mra — MraForMultipleChoice (MRA model)
nezha — NezhaForMultipleChoice (Nezha model)
nystromformer — NystromformerForMultipleChoice (Nyströmformer model)
qdqbert — QDQBertForMultipleChoice (QDQBert model)
rembert — RemBertForMultipleChoice (RemBERT model)
roberta — RobertaForMultipleChoice (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForMultipleChoice (RoCBert model)
roformer — RoFormerForMultipleChoice (RoFormer model)
squeezebert — SqueezeBertForMultipleChoice (SqueezeBERT model)
xlm — XLMForMultipleChoice (XLM model)
xlm-roberta — XLMRobertaForMultipleChoice (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForMultipleChoice (XLM-RoBERTa-XL model)
xlnet — XLNetForMultipleChoice (XLNet model)
xmod — XmodForMultipleChoice (X-MOD model)
yoso — YosoForMultipleChoice (YOSO model)

By default, the model is set to evaluation mode with model.eval() (for example, dropout modules are deactivated). To train the model, first set it back to training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMultipleChoice.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForMultipleChoice

class transformers.TFAutoModelForMultipleChoice

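( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).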

from_config

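( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForMultipleChoice (ALBERT model)
- BertConfig configuration class: TFBertForMultipleChoice (BERT model)
- CamembertConfig configuration class: TFCamembertForMultipleChoice (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForMultipleChoice (ConvBERT model)
- DebertaV2Config configuration class: TFDebertaV2ForMultipleChoice (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForMultipleChoice (DistilBERT model)
- ElectraConfig configuration class: TFElectraForMultipleChoice (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertForMultipleChoice (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForMultipleChoice (Funnel Transformer model)
- LongformerConfig configuration class: TFLongformerForMultipleChoice (Longformer model)
- MPNetConfig configuration class: TFMPNetForMultipleChoice (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForMultipleChoice (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForMultipleChoice (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForMultipleChoice (RoFormer model)
- RobertaConfig configuration class: TFRobertaForMultipleChoice (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
- XLMConfig configuration class: TFXLMForMultipleChoice (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForMultipleChoice (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples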

>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForMultipleChoice.from_config(config)

from_pretrained

( *model_args **kwargs )

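Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts are stored on huggingface.co with a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.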
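The proxies example above mixes a protocol key ('http') with an endpoint key ('http://hostname'). The sketch below shows one plausible way such a mapping could be resolved, assuming the more specific scheme-plus-host key takes precedence over the bare scheme key. The `pick_proxy` helper is hypothetical, and whether every download path resolves proxies this way is an assumption of this sketch.

```python
from urllib.parse import urlsplit


def pick_proxy(url, proxies):
    """Return the proxy for url, preferring 'scheme://host' over 'scheme'."""
    parts = urlsplit(url)
    candidates = [f"{parts.scheme}://{parts.hostname}", parts.scheme]
    for key in candidates:
        if key in proxies:
            return proxies[key]
    return None  # no proxy configured for this request


# The mapping from the parameter description above.
proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}

endpoint_proxy = pick_proxy("http://hostname/model.bin", proxies)
protocol_proxy = pick_proxy("http://other.example/model.bin", proxies)
no_proxy = pick_proxy("https://hostname/model.bin", proxies)
```

Under that assumption, requests to the named host use the endpoint-specific proxy, other http requests fall back to the protocol-wide proxy, and https requests are not proxied because the mapping has no matching key.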

Instantiates one of the model classes of the library (with a multiple choice head) from a pretrained model.

albert — TFAlbertForMultipleChoice (ALBERT model)
bert — TFBertForMultipleChoice (BERT model)
camembert — TFCamembertForMultipleChoice (CamemBERT model)
convbert — TFConvBertForMultipleChoice (ConvBERT model)
deberta-v2 — TFDebertaV2ForMultipleChoice (DeBERTa-v2 model)
distilbert — TFDistilBertForMultipleChoice (DistilBERT model)
electra — TFElectraForMultipleChoice (ELECTRA model)
flaubert — TFFlaubertForMultipleChoice (FlauBERT model)
funnel — TFFunnelForMultipleChoice (Funnel Transformer model)
longformer — TFLongformerForMultipleChoice (Longformer model)
mobilebert — TFMobileBertForMultipleChoice (MobileBERT model)
mpnet — TFMPNetForMultipleChoice (MPNet model)
rembert — TFRemBertForMultipleChoice (RemBERT model)
roberta — TFRobertaForMultipleChoice (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForMultipleChoice (RoFormer model)
xlm — TFXLMForMultipleChoice (XLM model)
xlm-roberta — TFXLMRobertaForMultipleChoice (XLM-RoBERTa model)
xlnet — TFXLNetForMultipleChoice (XLNet model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForMultipleChoice

class transformers.FlaxAutoModelForMultipleChoice

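( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a multiple choice head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).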

from_config

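( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForMultipleChoice (ALBERT model)
- BertConfig configuration class: FlaxBertForMultipleChoice (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForMultipleChoice (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForMultipleChoice (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForMultipleChoice (ELECTRA model)
- RoFormerConfig configuration class: FlaxRoFormerForMultipleChoice (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForMultipleChoice (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForMultipleChoice (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a multiple choice head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples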

>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForMultipleChoice.from_config(config)
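
The dispatch performed by from_config() can be pictured with a small, self-contained sketch. It is not the library's implementation: `ToyBertConfig`, `ToyRobertaConfig`, the two `Toy...ForMultipleChoice` classes and the `_model_mapping` registry are hypothetical stand-ins introduced only for illustration, and raising `OSError` on direct instantiation is an assumption. The sketch shows a lookup keyed on the configuration class and a from_config() that builds a model without loading any weights.

```python
class ToyBertConfig:
    """Hypothetical stand-in for a real configuration class."""
    model_type = "bert"


class ToyRobertaConfig:
    """Hypothetical stand-in for a second configuration class."""
    model_type = "roberta"


class ToyBertForMultipleChoice:
    def __init__(self, config):
        self.config = config
        # Building from a config only sets up the architecture;
        # no pretrained weights are read here.
        self.weights_loaded = False


class ToyRobertaForMultipleChoice(ToyBertForMultipleChoice):
    pass


class ToyAutoModelForMultipleChoice:
    # Hypothetical registry: configuration class -> concrete model class.
    _model_mapping = {
        ToyBertConfig: ToyBertForMultipleChoice,
        ToyRobertaConfig: ToyRobertaForMultipleChoice,
    }

    def __init__(self, *args, **kwargs):
        # Assumption: the error type is chosen for illustration only.
        raise OSError(
            "ToyAutoModelForMultipleChoice must be created with from_config(), "
            "not instantiated directly."
        )

    @classmethod
    def from_config(cls, config):
        # Select the concrete class from the type of the configuration object.
        model_class = cls._model_mapping.get(type(config))
        if model_class is None:
            raise ValueError(f"Unsupported configuration class: {type(config).__name__}")
        return model_class(config)


model = ToyAutoModelForMultipleChoice.from_config(ToyRobertaConfig())
print(type(model).__name__)  # ToyRobertaForMultipleChoice
print(model.weights_loaded)  # False
```

Keying the registry on the configuration class keeps the generic class free of any model-specific logic; supporting a new architecture only means adding one entry to the mapping.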

from_pretrained

( *model_args **kwargs )

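Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model with the provided conversion scripts and loading the converted model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.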
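
The second case above (no config supplied) amounts to splitting the keyword arguments into two groups. The sketch below illustrates that rule under simplifying assumptions: the configuration is modelled as a plain dict, and `split_kwargs` and `my_custom_flag` are hypothetical names that do not exist in the library.

```python
def split_kwargs(config_attrs, kwargs):
    """Return (updated_config, model_kwargs).

    Keys that name an existing configuration attribute override it;
    every other key is kept for the model's __init__.
    """
    updated = dict(config_attrs)  # do not mutate the caller's config
    model_kwargs = {}
    for key, value in kwargs.items():
        if key in updated:
            updated[key] = value
        else:
            model_kwargs[key] = value
    return updated, model_kwargs


config = {"output_attentions": False, "hidden_size": 768}
updated, model_kwargs = split_kwargs(
    config, {"output_attentions": True, "my_custom_flag": 1}
)
print(updated["output_attentions"])  # True
print(model_kwargs)                  # {'my_custom_flag': 1}
```

When a config object is passed explicitly, no such split happens: all keyword arguments go straight to the model's __init__.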

Instantiates one of the model classes of the library (with a multiple choice head) from a pretrained model.

albert — FlaxAlbertForMultipleChoice (ALBERT model)
bert — FlaxBertForMultipleChoice (BERT model)
big_bird — FlaxBigBirdForMultipleChoice (BigBird model)
distilbert — FlaxDistilBertForMultipleChoice (DistilBERT model)
electra — FlaxElectraForMultipleChoice (ELECTRA model)
roberta — FlaxRobertaForMultipleChoice (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForMultipleChoice (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForMultipleChoice (RoFormer model)
xlm-roberta — FlaxXLMRobertaForMultipleChoice (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForMultipleChoice

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForMultipleChoice.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForNextSentencePrediction

class transformers.AutoModelForNextSentencePrediction

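( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: BertForNextSentencePrediction (BERT model)
- ErnieConfig configuration class: ErnieForNextSentencePrediction (ERNIE model)
- FNetConfig configuration class: FNetForNextSentencePrediction (FNet model)
- MegatronBertConfig configuration class: MegatronBertForNextSentencePrediction (Megatron-BERT model)
- MobileBertConfig configuration class: MobileBertForNextSentencePrediction (MobileBERT model)
- NezhaConfig configuration class: NezhaForNextSentencePrediction (Nezha model)
- QDQBertConfig configuration class: QDQBertForNextSentencePrediction (QDQBert model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples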

>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForNextSentencePrediction.from_config(config)

from_pretrained

( *model_args **kwargs )

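Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the one loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.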
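
To make the output_loading_info parameter above more concrete, the sketch below shows one way the "missing keys" and "unexpected keys" of a checkpoint could be computed. It is only an illustration: `loading_info`, the dictionary key names and the parameter names in the example are assumptions, not the library's actual code or output.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Compare the parameter names a model expects with those in a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        # Parameters the model needs but the checkpoint does not provide.
        "missing_keys": sorted(expected - found),
        # Parameters present in the checkpoint that the model has no use for.
        "unexpected_keys": sorted(found - expected),
        # Placeholder: this sketch performs no shape or dtype checks.
        "error_msgs": [],
    }


# Hypothetical parameter names, chosen only for the example.
model_params = ["encoder.weight", "encoder.bias", "nsp_head.weight"]
checkpoint_params = ["encoder.weight", "encoder.bias", "lm_head.weight"]

info = loading_info(model_params, checkpoint_params)
print(info["missing_keys"])     # ['nsp_head.weight']
print(info["unexpected_keys"])  # ['lm_head.weight']
```

A non-empty list of missing keys typically means that part of the model (often the task head) starts from freshly initialized weights and still needs training.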

Instantiates one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

bert — BertForNextSentencePrediction (BERT model)
ernie — ErnieForNextSentencePrediction (ERNIE model)
fnet — FNetForNextSentencePrediction (FNet model)
megatron-bert — MegatronBertForNextSentencePrediction (Megatron-BERT model)
mobilebert — MobileBertForNextSentencePrediction (MobileBERT model)
nezha — NezhaForNextSentencePrediction (Nezha model)
qdqbert — QDQBertForNextSentencePrediction (QDQBert model)

By default, the model is set to evaluation mode using model.eval() (for example, dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().
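
The effect of switching between the two modes can be illustrated without any deep learning framework. `ToyDropout` below is a hypothetical, simplified stand-in for a dropout module (it is not part of PyTorch or Transformers); it only mimics the idea that dropout is active in training mode and becomes a pass-through in evaluation mode.

```python
import random


class ToyDropout:
    """Simplified dropout layer with a train/eval switch."""

    def __init__(self, p=0.5, seed=0):
        self.p = p                     # probability of zeroing a value
        self.training = True           # modules start in training mode here
        self._rng = random.Random(seed)

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: dropout is disabled, inputs pass through unchanged.
            return list(values)
        keep = 1.0 - self.p
        # Training mode: zero each value with probability p, rescale the rest.
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


layer = ToyDropout(p=0.5)
print(layer.eval()([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0]
```

Calling `layer.train()` re-enables the random zeroing, which is why a model loaded with from_pretrained() has to be put back into training mode before fine-tuning.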

Examples

>>> from transformers import AutoConfig, AutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForNextSentencePrediction.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForNextSentencePrediction

class transformers.TFAutoModelForNextSentencePrediction

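( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: TFBertForNextSentencePrediction (BERT model)
- MobileBertConfig configuration class: TFMobileBertForNextSentencePrediction (MobileBERT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples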

>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForNextSentencePrediction.from_config(config)

from_pretrained

( *model_args **kwargs )

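Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.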

Instantiates one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

bert — TFBertForNextSentencePrediction (BERT model)
mobilebert — TFMobileBertForNextSentencePrediction (MobileBERT model)
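
The list above reads as a lookup table from a model type string to a class name. A minimal sketch of such a lookup follows, assuming the model type has already been read from the configuration; `TF_NSP_MAPPING` and `resolve_model_class` are hypothetical names, and the table simply copies the two entries listed above.

```python
# Hypothetical table copied from the two entries documented above.
TF_NSP_MAPPING = {
    "bert": "TFBertForNextSentencePrediction",
    "mobilebert": "TFMobileBertForNextSentencePrediction",
}


def resolve_model_class(model_type):
    """Return the class name registered for a model type string."""
    try:
        return TF_NSP_MAPPING[model_type]
    except KeyError:
        supported = ", ".join(sorted(TF_NSP_MAPPING))
        raise ValueError(
            f"Unrecognized model type {model_type!r}; supported types: {supported}"
        ) from None


print(resolve_model_class("mobilebert"))  # TFMobileBertForNextSentencePrediction
```

Failing loudly for an unknown model type, with the supported types in the message, makes it clear when a checkpoint has no head of the requested kind.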

Examples

>>> from transformers import AutoConfig, TFAutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForNextSentencePrediction

class transformers.FlaxAutoModelForNextSentencePrediction

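( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a next sentence prediction head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BertConfig configuration class: FlaxBertForNextSentencePrediction (BERT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a next sentence prediction head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples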

>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForNextSentencePrediction.from_config(config)

from_pretrained

( *model_args **kwargs )

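Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model with the provided conversion scripts and loading the converted model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.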

Instantiates one of the model classes of the library (with a next sentence prediction head) from a pretrained model.

bert — FlaxBertForNextSentencePrediction (BERT model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForNextSentencePrediction

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForNextSentencePrediction.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForTokenClassification

class transformers.AutoModelForTokenClassification

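( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class: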
- AlbertConfig configuration class: AlbertForTokenClassification (ALBERT model)
- ArceeConfig configuration class: ArceeForTokenClassification (Arcee model)
- BertConfig configuration class: BertForTokenClassification (BERT model)
- BigBirdConfig configuration class: BigBirdForTokenClassification (BigBird model)
- BioGptConfig configuration class: BioGptForTokenClassification (BioGpt model)
- BloomConfig configuration class: BloomForTokenClassification (BLOOM model)
- BrosConfig configuration class: BrosForTokenClassification (BROS model)
- CamembertConfig configuration class: CamembertForTokenClassification (CamemBERT model)
- CanineConfig configuration class: CanineForTokenClassification (CANINE model)
- ConvBertConfig configuration class: ConvBertForTokenClassification (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForTokenClassification (Data2VecText model)
- DebertaConfig configuration class: DebertaForTokenClassification (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForTokenClassification (DeBERTa-v2 model)
- DiffLlamaConfig configuration class: DiffLlamaForTokenClassification (DiffLlama model)
- DistilBertConfig configuration class: DistilBertForTokenClassification (DistilBERT model)
- ElectraConfig configuration class: ElectraForTokenClassification (ELECTRA model)
- ErnieConfig configuration class: ErnieForTokenClassification (ERNIE model)
- ErnieMConfig configuration class: ErnieMForTokenClassification (ErnieM model)
- EsmConfig configuration class: EsmForTokenClassification (ESM model)
- FNetConfig configuration class: FNetForTokenClassification (FNet model)
- FalconConfig configuration class: FalconForTokenClassification (Falcon model)
- FlaubertConfig configuration class: FlaubertForTokenClassification (FlauBERT model)
- FunnelConfig configuration class: FunnelForTokenClassification (Funnel Transformer model)
- GPT2Config configuration class: GPT2ForTokenClassification (OpenAI GPT-2 model)
- GPTBigCodeConfig configuration class: GPTBigCodeForTokenClassification (GPTBigCode model)
- GPTNeoConfig configuration class: GPTNeoForTokenClassification (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXForTokenClassification (GPT NeoX model)
- Gemma2Config configuration class: Gemma2ForTokenClassification (Gemma2 model)
- GemmaConfig configuration class: GemmaForTokenClassification (Gemma model)
- Glm4Config configuration class: Glm4ForTokenClassification (GLM4 model)
- GlmConfig configuration class: GlmForTokenClassification (GLM model)
- HeliumConfig configuration class: HeliumForTokenClassification (Helium model)
- IBertConfig configuration class: IBertForTokenClassification (I-BERT model)
- LayoutLMConfig configuration class: LayoutLMForTokenClassification (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2ForTokenClassification (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForTokenClassification (LayoutLMv3 model)
- LiltConfig configuration class: LiltForTokenClassification (LiLT model)
- LlamaConfig configuration class: LlamaForTokenClassification (LLaMA model)
- LongformerConfig configuration class: LongformerForTokenClassification (Longformer model)
- LukeConfig configuration class: LukeForTokenClassification (LUKE model)
- MPNetConfig configuration class: MPNetForTokenClassification (MPNet model)
- MT5Config configuration class: MT5ForTokenClassification (MT5 model)
- MarkupLMConfig configuration class: MarkupLMForTokenClassification (MarkupLM model)
- MegaConfig configuration class: MegaForTokenClassification (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForTokenClassification (Megatron-BERT model)
- MiniMaxConfig configuration class: MiniMaxForTokenClassification (MiniMax model)
- MistralConfig configuration class: MistralForTokenClassification (Mistral model)
- MixtralConfig configuration class: MixtralForTokenClassification (Mixtral model)
- MobileBertConfig configuration class: MobileBertForTokenClassification (MobileBERT model)
- ModernBertConfig configuration class: ModernBertForTokenClassification (ModernBERT model)
- MptConfig configuration class: MptForTokenClassification (MPT model)
- MraConfig configuration class: MraForTokenClassification (MRA model)
- NemotronConfig configuration class: NemotronForTokenClassification (Nemotron model)
- NezhaConfig configuration class: NezhaForTokenClassification (Nezha model)
- NystromformerConfig configuration class: NystromformerForTokenClassification (Nyströmformer model)
- PersimmonConfig configuration class: PersimmonForTokenClassification (Persimmon model)
- Phi3Config configuration class: Phi3ForTokenClassification (Phi3 model)
- PhiConfig configuration class: PhiForTokenClassification (Phi model)
- QDQBertConfig configuration class: QDQBertForTokenClassification (QDQBert model)
- Qwen2Config configuration class: Qwen2ForTokenClassification (Qwen2 model)
- Qwen2MoeConfig configuration class: Qwen2MoeForTokenClassification (Qwen2MoE model)
- Qwen3Config configuration class: Qwen3ForTokenClassification (Qwen3 model)
- Qwen3MoeConfig configuration class: Qwen3MoeForTokenClassification (Qwen3MoE model)
- RemBertConfig configuration class: RemBertForTokenClassification (RemBERT model)
- RoCBertConfig configuration class: RoCBertForTokenClassification (RoCBert model)
- RoFormerConfig configuration class: RoFormerForTokenClassification (RoFormer model)
- RobertaConfig configuration class: RobertaForTokenClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
- SmolLM3Config configuration class: SmolLM3ForTokenClassification (SmolLM3 model)
- SqueezeBertConfig configuration class: SqueezeBertForTokenClassification (SqueezeBERT model)
- StableLmConfig configuration class: StableLmForTokenClassification (StableLm model)
- Starcoder2Config configuration class: Starcoder2ForTokenClassification (Starcoder2 model)
- T5Config configuration class: T5ForTokenClassification (T5 model)
- T5GemmaConfig configuration class: T5GemmaForTokenClassification (T5Gemma model)
- UMT5Config configuration class: UMT5ForTokenClassification (UMT5 model)
- XLMConfig configuration class: XLMForTokenClassification (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForTokenClassification (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForTokenClassification (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForTokenClassification (XLNet model)
- XmodConfig configuration class: XmodForTokenClassification (X-MOD model)
- YosoConfig configuration class: YosoForTokenClassification (YOSO model)
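attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples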

>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForTokenClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

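Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the one loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.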

Instantiates one of the model classes of the library (with a token classification head) from a pretrained model.

albert — AlbertForTokenClassification (ALBERT model)
arcee — ArceeForTokenClassification (Arcee model)
bert — BertForTokenClassification (BERT model)
big_bird — BigBirdForTokenClassification (BigBird model)
biogpt — BioGptForTokenClassification (BioGpt model)
bloom — BloomForTokenClassification (BLOOM model)
bros — BrosForTokenClassification (BROS model)
camembert — CamembertForTokenClassification (CamemBERT model)
canine — CanineForTokenClassification (CANINE model)
convbert — ConvBertForTokenClassification (ConvBERT model)
data2vec-text — Data2VecTextForTokenClassification (Data2VecText model)
deberta — DebertaForTokenClassification (DeBERTa model)
deberta-v2 — DebertaV2ForTokenClassification (DeBERTa-v2 model)
diffllama — DiffLlamaForTokenClassification (DiffLlama model)
distilbert — DistilBertForTokenClassification (DistilBERT model)
electra — ElectraForTokenClassification (ELECTRA model)
ernie — ErnieForTokenClassification (ERNIE model)
ernie_m — ErnieMForTokenClassification (ErnieM model)
esm — EsmForTokenClassification (ESM model)
falcon — FalconForTokenClassification (Falcon model)
flaubert — FlaubertForTokenClassification (FlauBERT model)
fnet — FNetForTokenClassification (FNet model)
funnel — FunnelForTokenClassification (Funnel Transformer model)
gemma — GemmaForTokenClassification (Gemma model)
gemma2 — Gemma2ForTokenClassification (Gemma2 model)
glm — GlmForTokenClassification (GLM model)
glm4 — Glm4ForTokenClassification (GLM4 model)
gpt-sw3 — GPT2ForTokenClassification (GPT-Sw3 model)
gpt2 — GPT2ForTokenClassification (OpenAI GPT-2 model)
gpt_bigcode — GPTBigCodeForTokenClassification (GPTBigCode model)
gpt_neo — GPTNeoForTokenClassification (GPT Neo model)
gpt_neox — GPTNeoXForTokenClassification (GPT NeoX model)
helium — HeliumForTokenClassification (Helium model)
ibert — IBertForTokenClassification (I-BERT model)
layoutlm — LayoutLMForTokenClassification (LayoutLM model)
layoutlmv2 — LayoutLMv2ForTokenClassification (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3ForTokenClassification (LayoutLMv3 model)
lilt — LiltForTokenClassification (LiLT model)
llama — LlamaForTokenClassification (LLaMA model)
longformer — LongformerForTokenClassification (Longformer model)
luke — LukeForTokenClassification (LUKE model)
markuplm — MarkupLMForTokenClassification (MarkupLM model)
mega — MegaForTokenClassification (MEGA model)
megatron-bert — MegatronBertForTokenClassification (Megatron-BERT model)
minimax — MiniMaxForTokenClassification (MiniMax model)
mistral — MistralForTokenClassification (Mistral model)
mixtral — MixtralForTokenClassification (Mixtral model)
mobilebert — MobileBertForTokenClassification (MobileBERT model)
modernbert — ModernBertForTokenClassification (ModernBERT model)
mpnet — MPNetForTokenClassification (MPNet model)
mpt — MptForTokenClassification (MPT model)
mra — MraForTokenClassification (MRA model)
mt5 — MT5ForTokenClassification (MT5 model)
nemotron — NemotronForTokenClassification (Nemotron model)
nezha — NezhaForTokenClassification (Nezha model)
nystromformer — NystromformerForTokenClassification (Nyströmformer model)
persimmon — PersimmonForTokenClassification (Persimmon model)
phi — PhiForTokenClassification (Phi model)
phi3 — Phi3ForTokenClassification (Phi3 model)
qdqbert — QDQBertForTokenClassification (QDQBert model)
qwen2 — Qwen2ForTokenClassification (Qwen2 model)
qwen2_moe — Qwen2MoeForTokenClassification (Qwen2MoE model)
qwen3 — Qwen3ForTokenClassification (Qwen3 model)
qwen3_moe — Qwen3MoeForTokenClassification (Qwen3MoE model)
rembert — RemBertForTokenClassification (RemBERT model)
roberta — RobertaForTokenClassification (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForTokenClassification (RoCBert model)
roformer — RoFormerForTokenClassification (RoFormer model)
smollm3 — SmolLM3ForTokenClassification (SmolLM3 model)
squeezebert — SqueezeBertForTokenClassification (SqueezeBERT model)
stablelm — StableLmForTokenClassification (StableLm model)
starcoder2 — Starcoder2ForTokenClassification (Starcoder2 model)
t5 — T5ForTokenClassification (T5 model)
t5gemma — T5GemmaForTokenClassification (T5Gemma model)
umt5 — UMT5ForTokenClassification (UMT5 model)
xlm — XLMForTokenClassification (XLM model)
xlm-roberta — XLMRobertaForTokenClassification (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForTokenClassification (XLM-RoBERTa-XL model)
xlnet — XLNetForTokenClassification (XLNet model)
xmod — XmodForTokenClassification (X-MOD model)
yoso — YosoForTokenClassification (YOSO model)

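By default, the model is set to evaluation mode using model.eval() (for example, dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().

Examples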

>>> from transformers import AutoConfig, AutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForTokenClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
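
As background on what a token classification head produces, the sketch below turns per-token scores into labels. It assumes a simplified setting: the scores are plain nested lists rather than tensors, and `predict_token_labels`, the label set and the numeric values are all made up for the example.

```python
def predict_token_labels(logits, id2label):
    """Pick the highest-scoring label for every token.

    logits: one list of scores per token, one score per label.
    id2label: mapping from label index to label name.
    """
    labels = []
    for token_scores in logits:
        # Index of the largest score for this token (argmax).
        best = max(range(len(token_scores)), key=token_scores.__getitem__)
        labels.append(id2label[best])
    return labels


# Hypothetical label set and scores for a three-token input.
id2label = {0: "O", 1: "B-PER", 2: "B-LOC"}
logits = [
    [2.1, 0.3, -1.0],
    [0.2, 3.5, 0.1],
    [0.0, 0.4, 2.2],
]
print(predict_token_labels(logits, id2label))  # ['O', 'B-PER', 'B-LOC']
```

The point of the sketch is the output shape: unlike a sequence classification head, which yields one prediction per input, a token classification head yields one prediction per token.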

TFAutoModelForTokenClassification

class transformers.TFAutoModelForTokenClassification

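( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (doing so throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForTokenClassification (ALBERT model)
- BertConfig configuration class: TFBertForTokenClassification (BERT model)
- CamembertConfig configuration class: TFCamembertForTokenClassification (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForTokenClassification (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForTokenClassification (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForTokenClassification (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForTokenClassification (DistilBERT model)
- ElectraConfig configuration class: TFElectraForTokenClassification (ELECTRA model)
- EsmConfig configuration class: TFEsmForTokenClassification (ESM model)
- FlaubertConfig configuration class: TFFlaubertForTokenClassification (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForTokenClassification (Funnel Transformer model)
- LayoutLMConfig configuration class: TFLayoutLMForTokenClassification (LayoutLM model)
- LayoutLMv3Config configuration class: TFLayoutLMv3ForTokenClassification (LayoutLMv3 model)
- LongformerConfig configuration class: TFLongformerForTokenClassification (Longformer model)
- MPNetConfig configuration class: TFMPNetForTokenClassification (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForTokenClassification (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForTokenClassification (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForTokenClassification (RoFormer model)
- RobertaConfig configuration class: TFRobertaForTokenClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
- XLMConfig configuration class: TFXLMForTokenClassification (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForTokenClassification (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForTokenClassification (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a token classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples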

>>> from transformers import AutoConfig, TFAutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForTokenClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

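Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. Models and other artifacts on huggingface.co are stored in a git-based system, so revision can be a branch name, a tag name, a commit id, or any other identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. Models and other artifacts on huggingface.co are stored in a git-based system, so the value can be a branch name, a tag name, a commit id, or any other identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.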
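
The two kwargs cases above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: `ToyConfig`, `split_kwargs` and `resolve` are hypothetical names, and creating a default `ToyConfig()` stands in for automatic configuration loading.

```python
class ToyConfig:
    """Hypothetical configuration with two known attributes."""

    def __init__(self, output_attentions=False, num_labels=2):
        self.output_attentions = output_attentions
        self.num_labels = num_labels


def split_kwargs(config, kwargs):
    """Override matching config attributes; return the keys that did not match."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            unused[key] = value
    return unused


def resolve(config=None, **kwargs):
    """Return (config, model_kwargs) following the two documented cases."""
    if config is not None:
        # Case 1: a config was supplied, so it is left untouched and
        # every keyword argument is forwarded to the model constructor.
        return config, dict(kwargs)
    # Case 2: no config was supplied. Creating a default one here stands in
    # for automatic loading; matching keys then override its attributes.
    config = ToyConfig()
    return config, split_kwargs(config, kwargs)


auto_config, model_kwargs = resolve(output_attentions=True, custom_flag=1)
print(auto_config.output_attentions, model_kwargs)

own_config = ToyConfig()
same_config, forwarded = resolve(config=own_config, output_attentions=True)
print(same_config.output_attentions, forwarded)
```

In the first call the matching key updates the configuration and only `custom_flag` is left for the model. In the second call the supplied configuration is not modified, which is why any configuration updates should be applied before passing it in.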

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

albert — TFAlbertForTokenClassification (ALBERT model)
bert — TFBertForTokenClassification (BERT model)
camembert — TFCamembertForTokenClassification (CamemBERT model)
convbert — TFConvBertForTokenClassification (ConvBERT model)
deberta — TFDebertaForTokenClassification (DeBERTa model)
deberta-v2 — TFDebertaV2ForTokenClassification (DeBERTa-v2 model)
distilbert — TFDistilBertForTokenClassification (DistilBERT model)
electra — TFElectraForTokenClassification (ELECTRA model)
esm — TFEsmForTokenClassification (ESM model)
flaubert — TFFlaubertForTokenClassification (FlauBERT model)
funnel — TFFunnelForTokenClassification (Funnel Transformer model)
layoutlm — TFLayoutLMForTokenClassification (LayoutLM model)
layoutlmv3 — TFLayoutLMv3ForTokenClassification (LayoutLMv3 model)
longformer — TFLongformerForTokenClassification (Longformer model)
mobilebert — TFMobileBertForTokenClassification (MobileBERT model)
mpnet — TFMPNetForTokenClassification (MPNet model)
rembert — TFRemBertForTokenClassification (RemBERT model)
roberta — TFRobertaForTokenClassification (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForTokenClassification (RoFormer model)
xlm — TFXLMForTokenClassification (XLM model)
xlm-roberta — TFXLMRobertaForTokenClassification (XLM-RoBERTa model)
xlnet — TFXLNetForTokenClassification (XLNet model)

Example

>>> from transformers import AutoConfig, TFAutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForTokenClassification

class transformers.FlaxAutoModelForTokenClassification

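( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a token classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).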
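
A class that refuses direct construction but offers class-method factories can be sketched as follows. This is a minimal illustration in plain Python with hypothetical `Toy...` names. The exception type and message are assumptions chosen for the sketch, not a statement of what the library raises.

```python
class ToyBertForTokenClassification:
    """Hypothetical concrete model class returned by the factory."""

    def __init__(self, config):
        self.config = config


class ToyAutoModelForTokenClassification:
    def __init__(self, *args, **kwargs):
        # Direct construction is rejected; callers must use a factory method.
        raise OSError(
            f"{type(self).__name__} is designed to be instantiated through "
            "`from_pretrained(...)` or `from_config(...)`."
        )

    @classmethod
    def from_config(cls, config):
        # The factory returns an instance of a concrete class, never of `cls`,
        # so the failing __init__ above is not involved.
        return ToyBertForTokenClassification(config)


try:
    ToyAutoModelForTokenClassification()
except OSError as err:
    message = str(err)
    print(message)

model = ToyAutoModelForTokenClassification.from_config({"num_labels": 9})
print(type(model).__name__)
```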

from_config

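( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForTokenClassification (ALBERT model)
- BertConfig configuration class: FlaxBertForTokenClassification (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForTokenClassification (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForTokenClassification (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForTokenClassification (ELECTRA model)
- RoFormerConfig configuration class: FlaxRoFormerForTokenClassification (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForTokenClassification (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForTokenClassification (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 when available. Otherwise, the default is the manual "eager" implementation.

Instantiate one of the model classes of the library (with a token classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example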

>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForTokenClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

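Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. Models and other artifacts on huggingface.co are stored in a git-based system, so revision can be a branch name, a tag name, a commit id, or any other identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. Models and other artifacts on huggingface.co are stored in a git-based system, so the value can be a branch name, a tag name, a commit id, or any other identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a token classification head) from a pretrained model.

albert — FlaxAlbertForTokenClassification (ALBERT model)
bert — FlaxBertForTokenClassification (BERT model)
big_bird — FlaxBigBirdForTokenClassification (BigBird model)
distilbert — FlaxDistilBertForTokenClassification (DistilBERT model)
electra — FlaxElectraForTokenClassification (ELECTRA model)
roberta — FlaxRobertaForTokenClassification (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForTokenClassification (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForTokenClassification (RoFormer model)
xlm-roberta — FlaxXLMRobertaForTokenClassification (XLM-RoBERTa model)

Example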

>>> from transformers import AutoConfig, FlaxAutoModelForTokenClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForTokenClassification.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a Flax model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForTokenClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForQuestionAnswering

class transformers.AutoModelForQuestionAnswering

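( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).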

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: AlbertForQuestionAnswering (ALBERT model)
- ArceeConfig configuration class: ArceeForQuestionAnswering (Arcee model)
- BartConfig configuration class: BartForQuestionAnswering (BART model)
- BertConfig configuration class: BertForQuestionAnswering (BERT model)
- BigBirdConfig configuration class: BigBirdForQuestionAnswering (BigBird model)
- BigBirdPegasusConfig configuration class: BigBirdPegasusForQuestionAnswering (BigBird-Pegasus model)
- BloomConfig configuration class: BloomForQuestionAnswering (BLOOM model)
- CamembertConfig configuration class: CamembertForQuestionAnswering (CamemBERT model)
- CanineConfig configuration class: CanineForQuestionAnswering (CANINE model)
- ConvBertConfig configuration class: ConvBertForQuestionAnswering (ConvBERT model)
- Data2VecTextConfig configuration class: Data2VecTextForQuestionAnswering (Data2VecText model)
- DebertaConfig configuration class: DebertaForQuestionAnswering (DeBERTa model)
- DebertaV2Config configuration class: DebertaV2ForQuestionAnswering (DeBERTa-v2 model)
- DiffLlamaConfig configuration class: DiffLlamaForQuestionAnswering (DiffLlama model)
- DistilBertConfig configuration class: DistilBertForQuestionAnswering (DistilBERT model)
- ElectraConfig configuration class: ElectraForQuestionAnswering (ELECTRA model)
- ErnieConfig configuration class: ErnieForQuestionAnswering (ERNIE model)
- ErnieMConfig configuration class: ErnieMForQuestionAnswering (ErnieM model)
- FNetConfig configuration class: FNetForQuestionAnswering (FNet model)
- FalconConfig configuration class: FalconForQuestionAnswering (Falcon model)
- FlaubertConfig configuration class: FlaubertForQuestionAnsweringSimple (FlauBERT model)
- FunnelConfig configuration class: FunnelForQuestionAnswering (Funnel Transformer model)
- GPT2Config configuration class: GPT2ForQuestionAnswering (OpenAI GPT-2 model)
- GPTJConfig configuration class: GPTJForQuestionAnswering (GPT-J model)
- GPTNeoConfig configuration class: GPTNeoForQuestionAnswering (GPT Neo model)
- GPTNeoXConfig configuration class: GPTNeoXForQuestionAnswering (GPT NeoX model)
- IBertConfig configuration class: IBertForQuestionAnswering (I-BERT model)
- LEDConfig configuration class: LEDForQuestionAnswering (LED model)
- LayoutLMv2Config configuration class: LayoutLMv2ForQuestionAnswering (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
- LiltConfig configuration class: LiltForQuestionAnswering (LiLT model)
- LlamaConfig configuration class: LlamaForQuestionAnswering (LLaMA model)
- LongformerConfig configuration class: LongformerForQuestionAnswering (Longformer model)
- LukeConfig configuration class: LukeForQuestionAnswering (LUKE model)
- LxmertConfig configuration class: LxmertForQuestionAnswering (LXMERT model)
- MBartConfig configuration class: MBartForQuestionAnswering (mBART model)
- MPNetConfig configuration class: MPNetForQuestionAnswering (MPNet model)
- MT5Config configuration class: MT5ForQuestionAnswering (MT5 model)
- MarkupLMConfig configuration class: MarkupLMForQuestionAnswering (MarkupLM model)
- MegaConfig configuration class: MegaForQuestionAnswering (MEGA model)
- MegatronBertConfig configuration class: MegatronBertForQuestionAnswering (Megatron-BERT model)
- MiniMaxConfig configuration class: MiniMaxForQuestionAnswering (MiniMax model)
- MistralConfig configuration class: MistralForQuestionAnswering (Mistral model)
- MixtralConfig configuration class: MixtralForQuestionAnswering (Mixtral model)
- MobileBertConfig configuration class: MobileBertForQuestionAnswering (MobileBERT model)
- ModernBertConfig configuration class: ModernBertForQuestionAnswering (ModernBERT model)
- MptConfig configuration class: MptForQuestionAnswering (MPT model)
- MraConfig configuration class: MraForQuestionAnswering (MRA model)
- MvpConfig configuration class: MvpForQuestionAnswering (MVP model)
- NemotronConfig configuration class: NemotronForQuestionAnswering (Nemotron model)
- NezhaConfig configuration class: NezhaForQuestionAnswering (Nezha model)
- NystromformerConfig configuration class: NystromformerForQuestionAnswering (Nyströmformer model)
- OPTConfig configuration class: OPTForQuestionAnswering (OPT model)
- QDQBertConfig configuration class: QDQBertForQuestionAnswering (QDQBert model)
- Qwen2Config configuration class: Qwen2ForQuestionAnswering (Qwen2 model)
- Qwen2MoeConfig configuration class: Qwen2MoeForQuestionAnswering (Qwen2MoE model)
- Qwen3Config configuration class: Qwen3ForQuestionAnswering (Qwen3 model)
- Qwen3MoeConfig configuration class: Qwen3MoeForQuestionAnswering (Qwen3MoE model)
- ReformerConfig configuration class: ReformerForQuestionAnswering (Reformer model)
- RemBertConfig configuration class: RemBertForQuestionAnswering (RemBERT model)
- RoCBertConfig configuration class: RoCBertForQuestionAnswering (RoCBert model)
- RoFormerConfig configuration class: RoFormerForQuestionAnswering (RoFormer model)
- RobertaConfig configuration class: RobertaForQuestionAnswering (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: RobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
- SmolLM3Config configuration class: SmolLM3ForQuestionAnswering (SmolLM3 model)
- SplinterConfig configuration class: SplinterForQuestionAnswering (Splinter model)
- SqueezeBertConfig configuration class: SqueezeBertForQuestionAnswering (SqueezeBERT model)
- T5Config configuration class: T5ForQuestionAnswering (T5 model)
- UMT5Config configuration class: UMT5ForQuestionAnswering (UMT5 model)
- XLMConfig configuration class: XLMForQuestionAnsweringSimple (XLM model)
- XLMRobertaConfig configuration class: XLMRobertaForQuestionAnswering (XLM-RoBERTa model)
- XLMRobertaXLConfig configuration class: XLMRobertaXLForQuestionAnswering (XLM-RoBERTa-XL model)
- XLNetConfig configuration class: XLNetForQuestionAnsweringSimple (XLNet model)
- XmodConfig configuration class: XmodForQuestionAnswering (X-MOD model)
- YosoConfig configuration class: YosoForQuestionAnswering (YOSO model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 when available. Otherwise, the default is the manual "eager" implementation.

Instantiate one of the model classes of the library (with a question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example

>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

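Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from a saved weights file. This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, however, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. Models and other artifacts on huggingface.co are stored in a git-based system, so revision can be a branch name, a tag name, a commit id, or any other identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. Models and other artifacts on huggingface.co are stored in a git-based system, so the value can be a branch name, a tag name, a commit id, or any other identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.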

albert — AlbertForQuestionAnswering (ALBERT model)
arcee — ArceeForQuestionAnswering (Arcee model)
bart — BartForQuestionAnswering (BART model)
bert — BertForQuestionAnswering (BERT model)
big_bird — BigBirdForQuestionAnswering (BigBird model)
bigbird_pegasus — BigBirdPegasusForQuestionAnswering (BigBird-Pegasus model)
bloom — BloomForQuestionAnswering (BLOOM model)
camembert — CamembertForQuestionAnswering (CamemBERT model)
canine — CanineForQuestionAnswering (CANINE model)
convbert — ConvBertForQuestionAnswering (ConvBERT model)
data2vec-text — Data2VecTextForQuestionAnswering (Data2VecText model)
deberta — DebertaForQuestionAnswering (DeBERTa model)
deberta-v2 — DebertaV2ForQuestionAnswering (DeBERTa-v2 model)
diffllama — DiffLlamaForQuestionAnswering (DiffLlama model)
distilbert — DistilBertForQuestionAnswering (DistilBERT model)
electra — ElectraForQuestionAnswering (ELECTRA model)
ernie — ErnieForQuestionAnswering (ERNIE model)
ernie_m — ErnieMForQuestionAnswering (ErnieM model)
falcon — FalconForQuestionAnswering (Falcon model)
flaubert — FlaubertForQuestionAnsweringSimple (FlauBERT model)
fnet — FNetForQuestionAnswering (FNet model)
funnel — FunnelForQuestionAnswering (Funnel Transformer model)
gpt2 — GPT2ForQuestionAnswering (OpenAI GPT-2 model)
gpt_neo — GPTNeoForQuestionAnswering (GPT Neo model)
gpt_neox — GPTNeoXForQuestionAnswering (GPT NeoX model)
gptj — GPTJForQuestionAnswering (GPT-J model)
ibert — IBertForQuestionAnswering (I-BERT model)
layoutlmv2 — LayoutLMv2ForQuestionAnswering (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
led — LEDForQuestionAnswering (LED model)
lilt — LiltForQuestionAnswering (LiLT model)
llama — LlamaForQuestionAnswering (LLaMA model)
longformer — LongformerForQuestionAnswering (Longformer model)
luke — LukeForQuestionAnswering (LUKE model)
lxmert — LxmertForQuestionAnswering (LXMERT model)
markuplm — MarkupLMForQuestionAnswering (MarkupLM model)
mbart — MBartForQuestionAnswering (mBART model)
mega — MegaForQuestionAnswering (MEGA model)
megatron-bert — MegatronBertForQuestionAnswering (Megatron-BERT model)
minimax — MiniMaxForQuestionAnswering (MiniMax model)
mistral — MistralForQuestionAnswering (Mistral model)
mixtral — MixtralForQuestionAnswering (Mixtral model)
mobilebert — MobileBertForQuestionAnswering (MobileBERT model)
modernbert — ModernBertForQuestionAnswering (ModernBERT model)
mpnet — MPNetForQuestionAnswering (MPNet model)
mpt — MptForQuestionAnswering (MPT model)
mra — MraForQuestionAnswering (MRA model)
mt5 — MT5ForQuestionAnswering (MT5 model)
mvp — MvpForQuestionAnswering (MVP model)
nemotron — NemotronForQuestionAnswering (Nemotron model)
nezha — NezhaForQuestionAnswering (Nezha model)
nystromformer — NystromformerForQuestionAnswering (Nyströmformer model)
opt — OPTForQuestionAnswering (OPT model)
qdqbert — QDQBertForQuestionAnswering (QDQBert model)
qwen2 — Qwen2ForQuestionAnswering (Qwen2 model)
qwen2_moe — Qwen2MoeForQuestionAnswering (Qwen2MoE model)
qwen3 — Qwen3ForQuestionAnswering (Qwen3 model)
qwen3_moe — Qwen3MoeForQuestionAnswering (Qwen3MoE model)
reformer — ReformerForQuestionAnswering (Reformer model)
rembert — RemBertForQuestionAnswering (RemBERT model)
roberta — RobertaForQuestionAnswering (RoBERTa model)
roberta-prelayernorm — RobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
roc_bert — RoCBertForQuestionAnswering (RoCBert model)
roformer — RoFormerForQuestionAnswering (RoFormer model)
smollm3 — SmolLM3ForQuestionAnswering (SmolLM3 model)
splinter — SplinterForQuestionAnswering (Splinter model)
squeezebert — SqueezeBertForQuestionAnswering (SqueezeBERT model)
t5 — T5ForQuestionAnswering (T5 model)
umt5 — UMT5ForQuestionAnswering (UMT5 model)
xlm — XLMForQuestionAnsweringSimple (XLM model)
xlm-roberta — XLMRobertaForQuestionAnswering (XLM-RoBERTa model)
xlm-roberta-xl — XLMRobertaXLForQuestionAnswering (XLM-RoBERTa-XL model)
xlnet — XLNetForQuestionAnsweringSimple (XLNet model)
xmod — XmodForQuestionAnswering (X-MOD model)
yoso — YosoForQuestionAnswering (YOSO model)
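
The list above reads as a table from a short key to a class name. As a rough sketch, assuming those keys are the `model_type` values stored in a checkpoint's configuration (as the tokenizer section of this page describes), the lookup could be pictured as below. `QA_MODEL_NAMES` holds only three entries copied from the list, and `qa_class_name` is a hypothetical helper, not a library function.

```python
import json

# A small excerpt of the table above: model_type key -> model class name.
QA_MODEL_NAMES = {
    "bert": "BertForQuestionAnswering",
    "xlm": "XLMForQuestionAnsweringSimple",
    "t5": "T5ForQuestionAnswering",
}


def qa_class_name(config_json: str) -> str:
    """Return the class name registered for the `model_type` in a config JSON string."""
    model_type = json.loads(config_json).get("model_type")
    if model_type not in QA_MODEL_NAMES:
        raise ValueError(f"Unrecognized model_type: {model_type!r}")
    return QA_MODEL_NAMES[model_type]


print(qa_class_name('{"model_type": "xlm", "n_layers": 12}'))
```

Note that the class name does not always follow a single pattern: the xlm, flaubert and xlnet entries map to classes ending in `Simple`, which is one reason a table lookup is more dependable than building the name from the key.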

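By default, the model is put in evaluation mode with model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().

Example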

>>> from transformers import AutoConfig, AutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForQuestionAnswering.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForQuestionAnswering

class transformers.TFAutoModelForQuestionAnswering

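( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).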

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: TFAlbertForQuestionAnswering (ALBERT model)
- BertConfig configuration class: TFBertForQuestionAnswering (BERT model)
- CamembertConfig configuration class: TFCamembertForQuestionAnswering (CamemBERT model)
- ConvBertConfig configuration class: TFConvBertForQuestionAnswering (ConvBERT model)
- DebertaConfig configuration class: TFDebertaForQuestionAnswering (DeBERTa model)
- DebertaV2Config configuration class: TFDebertaV2ForQuestionAnswering (DeBERTa-v2 model)
- DistilBertConfig configuration class: TFDistilBertForQuestionAnswering (DistilBERT model)
- ElectraConfig configuration class: TFElectraForQuestionAnswering (ELECTRA model)
- FlaubertConfig configuration class: TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
- FunnelConfig configuration class: TFFunnelForQuestionAnswering (Funnel Transformer model)
- GPTJConfig configuration class: TFGPTJForQuestionAnswering (GPT-J model)
- LayoutLMv3Config configuration class: TFLayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
- LongformerConfig configuration class: TFLongformerForQuestionAnswering (Longformer model)
- MPNetConfig configuration class: TFMPNetForQuestionAnswering (MPNet model)
- MobileBertConfig configuration class: TFMobileBertForQuestionAnswering (MobileBERT model)
- RemBertConfig configuration class: TFRemBertForQuestionAnswering (RemBERT model)
- RoFormerConfig configuration class: TFRoFormerForQuestionAnswering (RoFormer model)
- RobertaConfig configuration class: TFRobertaForQuestionAnswering (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: TFRobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
- XLMConfig configuration class: TFXLMForQuestionAnsweringSimple (XLM model)
- XLMRobertaConfig configuration class: TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
- XLNetConfig configuration class: TFXLNetForQuestionAnsweringSimple (XLNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 when available. Otherwise, the default is the manual "eager" implementation.

Instantiate one of the model classes of the library (with a question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example

>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

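Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of the following:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. Models and other artifacts on huggingface.co are stored in a git-based system, so revision can be a branch name, a tag name, a commit id, or any other identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. Models and other artifacts on huggingface.co are stored in a git-based system, so the value can be a branch name, a tag name, a commit id, or any other identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

albert — TFAlbertForQuestionAnswering (ALBERT model)
bert — TFBertForQuestionAnswering (BERT model)
camembert — TFCamembertForQuestionAnswering (CamemBERT model)
convbert — TFConvBertForQuestionAnswering (ConvBERT model)
deberta — TFDebertaForQuestionAnswering (DeBERTa model)
deberta-v2 — TFDebertaV2ForQuestionAnswering (DeBERTa-v2 model)
distilbert — TFDistilBertForQuestionAnswering (DistilBERT model)
electra — TFElectraForQuestionAnswering (ELECTRA model)
flaubert — TFFlaubertForQuestionAnsweringSimple (FlauBERT model)
funnel — TFFunnelForQuestionAnswering (Funnel Transformer model)
gptj — TFGPTJForQuestionAnswering (GPT-J model)
layoutlmv3 — TFLayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
longformer — TFLongformerForQuestionAnswering (Longformer model)
mobilebert — TFMobileBertForQuestionAnswering (MobileBERT model)
mpnet — TFMPNetForQuestionAnswering (MPNet model)
rembert — TFRemBertForQuestionAnswering (RemBERT model)
roberta — TFRobertaForQuestionAnswering (RoBERTa model)
roberta-prelayernorm — TFRobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
roformer — TFRoFormerForQuestionAnswering (RoFormer model)
xlm — TFXLMForQuestionAnsweringSimple (XLM model)
xlm-roberta — TFXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
xlnet — TFXLNetForQuestionAnsweringSimple (XLNet model)

Example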

>>> from transformers import AutoConfig, TFAutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForQuestionAnswering

class transformers.FlaxAutoModelForQuestionAnswering

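( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).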

from_config

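( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlbertConfig configuration class: FlaxAlbertForQuestionAnswering (ALBERT model)
- BartConfig configuration class: FlaxBartForQuestionAnswering (BART model)
- BertConfig configuration class: FlaxBertForQuestionAnswering (BERT model)
- BigBirdConfig configuration class: FlaxBigBirdForQuestionAnswering (BigBird model)
- DistilBertConfig configuration class: FlaxDistilBertForQuestionAnswering (DistilBERT model)
- ElectraConfig configuration class: FlaxElectraForQuestionAnswering (ELECTRA model)
- MBartConfig configuration class: FlaxMBartForQuestionAnswering (mBART model)
- RoFormerConfig configuration class: FlaxRoFormerForQuestionAnswering (RoFormer model)
- RobertaConfig configuration class: FlaxRobertaForQuestionAnswering (RoBERTa model)
- RobertaPreLayerNormConfig configuration class: FlaxRobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
- XLMRobertaConfig configuration class: FlaxXLMRobertaForQuestionAnswering (XLM-RoBERTa model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 when available. Otherwise, the default is the manual "eager" implementation.

Instantiate one of the model classes of the library (with a question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example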

>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

参数

pretrained_model_name_or_path (str or os.PathLike) — 可以是以下之一：
- 一个字符串，即托管在 huggingface.co 的模型仓库中的预训练模型的 model id。
- 一个包含使用 save_pretrained() 保存的模型权重的目录路径，例如 ./my_model_directory/。
- 一个指向 PyTorch state_dict 保存文件 的路径或 URL（例如，./pt_model/pytorch_model.bin）。在这种情况下，from_pt 应设置为 True，并且应提供一个配置对象作为 config 参数。这种加载路径比使用提供的转换脚本将 PyTorch 模型转换为 TensorFlow 模型然后加载 TensorFlow 模型要慢。
model_args (附加位置参数, 可选) — 将传递给底层模型的 __init__() 方法。
config (PretrainedConfig, 可选) — 用于模型的配置，而不是自动加载的配置。配置可以在以下情况下自动加载：
- 模型是库提供的模型（使用预训练模型的 model id 字符串加载）。
- 模型是使用 save_pretrained() 保存的，并通过提供保存目录重新加载。
- 模型通过提供本地目录作为 pretrained_model_name_or_path 加载，并且在该目录中找到了名为 config.json 的配置文件。
cache_dir (str or os.PathLike, optional) — 目录路径，如果不想使用标准缓存，则下载的预训练模型配置将缓存到此目录中。
from_pt (bool, optional, defaults to False) — 从 PyTorch 检查点保存文件加载模型权重（请参阅 pretrained_model_name_or_path 参数的文档字符串）。
force_download (bool, optional, defaults to False) — 是否强制（重新）下载模型权重和配置文件，覆盖已存在的缓存版本。
resume_download — 已弃用并被忽略。现在所有下载在可能的情况下都会默认断点续传。将在 Transformers v5 版本中移除。
proxies (dict[str, str], optional) — 代理服务器字典，用于按协议或端点指定代理，例如 {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}。每次请求都会使用这些代理。
output_loading_info(bool, optional, defaults to False) — 是否同时返回一个包含缺失键、意外键和错误信息的字典。
local_files_only(bool, optional, defaults to False) — 是否只查看本地文件（例如，不尝试下载模型）。
revision (str, optional, defaults to "main") — 要使用的特定模型版本。它可以是分支名、标签名或提交 ID，因为我们在 huggingface.co 上使用基于 git 的系统来存储模型和其他工件，所以 revision 可以是 git 允许的任何标识符。
trust_remote_code (bool, optional, defaults to False) — 是否允许 Hub 上自定义模型在其自己的建模文件中定义。此选项只应为你信任且已阅读其代码的仓库设置为 True，因为它将在你的本地计算机上执行 Hub 上的代码。
code_revision (str, optional, defaults to "main") — 如果代码位于与模型其余部分不同的仓库中，则指定要用于 Hub 上代码的特定修订版。它可以是分支名、标签名或提交 ID，因为我们在 huggingface.co 上使用基于 git 的系统来存储模型和其他工件，所以 revision 可以是 git 允许的任何标识符。
kwargs (附加关键字参数, optional) — 可用于更新配置对象（加载后）并初始化模型（例如，output_attentions=True）。其行为取决于是否提供了 config 或自动加载配置：
- 如果通过 config 提供了配置，**kwargs 将直接传递给底层模型的 __init__ 方法（我们假设对配置的所有相关更新都已完成）。
- 如果没有提供配置，kwargs 将首先传递给配置类的初始化函数（from_pretrained()）。kwargs 中每个对应于配置属性的键将用于使用提供的 kwargs 值覆盖该属性。不对应于任何配置属性的其余键将被传递给底层模型的 __init__ 函数。

Instantiate one of the model classes of the library (with a question answering head) from a pretrained model.

albert — FlaxAlbertForQuestionAnswering (ALBERT model)
bart — FlaxBartForQuestionAnswering (BART model)
bert — FlaxBertForQuestionAnswering (BERT model)
big_bird — FlaxBigBirdForQuestionAnswering (BigBird model)
distilbert — FlaxDistilBertForQuestionAnswering (DistilBERT model)
electra — FlaxElectraForQuestionAnswering (ELECTRA model)
mbart — FlaxMBartForQuestionAnswering (mBART model)
roberta — FlaxRobertaForQuestionAnswering (RoBERTa model)
roberta-prelayernorm — FlaxRobertaPreLayerNormForQuestionAnswering (RoBERTa-PreLayerNorm model)
roformer — FlaxRoFormerForQuestionAnswering (RoFormer model)
xlm-roberta — FlaxXLMRobertaForQuestionAnswering (XLM-RoBERTa model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForQuestionAnswering.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForTextEncoding

class transformers.AutoModelForTextEncoding

( *args **kwargs )

TFAutoModelForTextEncoding

class transformers.TFAutoModelForTextEncoding

( *args **kwargs )

Computer vision

The following auto classes are available for the following computer vision tasks.

AutoModelForDepthEstimation

class transformers.AutoModelForDepthEstimation

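( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a depth estimation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DPTConfig configuration class: DPTForDepthEstimation (DPT model)
- DepthAnythingConfig configuration class: DepthAnythingForDepthEstimation (Depth Anything model)
- DepthProConfig configuration class: DepthProForDepthEstimation (DepthPro model)
- GLPNConfig configuration class: GLPNForDepthEstimation (GLPN model)
- PromptDepthAnythingConfig configuration class: PromptDepthAnythingForDepthEstimation (PromptDepthAnything model)
- ZoeDepthConfig configuration class: ZoeDepthForDepthEstimation (ZoeDepth model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a depth estimation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.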
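
The selection rule described for from_config (the configuration class decides the model class) amounts to a dictionary lookup keyed on the type of the config. The sketch below illustrates that idea with empty placeholder classes that merely borrow names from the list above; the mapping name and the error message are assumptions, not the library's code.

```python
# Placeholder classes for illustration; the real ones live in transformers.
class DPTConfig:
    pass


class GLPNConfig:
    pass


class BertConfig:
    pass


class DPTForDepthEstimation:
    def __init__(self, config):
        self.config = config  # a real model would create untrained weights here


class GLPNForDepthEstimation:
    def __init__(self, config):
        self.config = config


# Hypothetical mapping: configuration class -> model class with a depth estimation head.
DEPTH_ESTIMATION_MAPPING = {
    DPTConfig: DPTForDepthEstimation,
    GLPNConfig: GLPNForDepthEstimation,
}


def from_config(config):
    """Instantiate the model class registered for this configuration class."""
    model_class = DEPTH_ESTIMATION_MAPPING.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class {type(config).__name__}")
    return model_class(config)


model = from_config(DPTConfig())
print(type(model).__name__)  # DPTForDepthEstimation
```

A configuration class that has no entry in the mapping (BertConfig in this sketch) is rejected, which is why the config passed to from_config has to belong to one of the listed classes.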

Examples

>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> # Download configuration from huggingface.co and cache.
>>> # The checkpoint must map to a depth estimation model; a DPT checkpoint is used here.
>>> config = AutoConfig.from_pretrained("Intel/dpt-large")
>>> model = AutoModelForDepthEstimation.from_config(config)

from_pretrained

( *model_args **kwargs )

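Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a depth estimation head) from a pretrained model.

depth_anything — DepthAnythingForDepthEstimation (Depth Anything model)
depth_pro — DepthProForDepthEstimation (DepthPro model)
dpt — DPTForDepthEstimation (DPT model)
glpn — GLPNForDepthEstimation (GLPN model)
prompt_depth_anything — PromptDepthAnythingForDepthEstimation (PromptDepthAnything model)
zoedepth — ZoeDepthForDepthEstimation (ZoeDepth model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().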
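
To make the evaluation-mode note concrete, here is a toy dropout layer written with the standard library only. It is a simplified illustration of why the mode matters, assuming the usual inverted-dropout scaling; ToyDropout is a hypothetical class and not a PyTorch module.

```python
import random


class ToyDropout:
    """Minimal dropout: zeroes inputs with probability p, but only while training."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True

    def train(self):
        self.training = True
        return self

    def eval(self):
        self.training = False
        return self

    def __call__(self, values):
        if not self.training:
            return list(values)  # identity in evaluation mode
        scale = 1.0 / (1.0 - self.p)  # keep the expected value unchanged
        return [0.0 if random.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5).eval()
print(layer([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0]
layer.train()  # switch back before fine-tuning
```

In evaluation mode the layer is deterministic, which is what you want for inference. Forgetting to call train() before fine-tuning would silently train without dropout.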

Examples

>>> from transformers import AutoConfig, AutoModelForDepthEstimation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large")

>>> # Update configuration during loading
>>> model = AutoModelForDepthEstimation.from_pretrained("Intel/dpt-large", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForDepthEstimation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForImageClassification

class transformers.AutoModelForImageClassification

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: BeitForImageClassification (BEiT model)
- BitConfig configuration class: BitForImageClassification (BiT model)
- CLIPConfig configuration class: CLIPForImageClassification (CLIP model)
- ConvNextConfig configuration class: ConvNextForImageClassification (ConvNeXT model)
- ConvNextV2Config configuration class: ConvNextV2ForImageClassification (ConvNeXTV2 model)
- CvtConfig configuration class: CvtForImageClassification (CvT model)
- Data2VecVisionConfig configuration class: Data2VecVisionForImageClassification (Data2VecVision model)
- DeiTConfig configuration class: DeiTForImageClassification or DeiTForImageClassificationWithTeacher (DeiT model)
- DinatConfig configuration class: DinatForImageClassification (DiNAT model)
- Dinov2Config configuration class: Dinov2ForImageClassification (DINOv2 model)
- Dinov2WithRegistersConfig configuration class: Dinov2WithRegistersForImageClassification (DINOv2 with Registers model)
- DonutSwinConfig configuration class: DonutSwinForImageClassification (DonutSwin model)
- EfficientFormerConfig configuration class: EfficientFormerForImageClassification or EfficientFormerForImageClassificationWithTeacher (EfficientFormer model)
- EfficientNetConfig configuration class: EfficientNetForImageClassification (EfficientNet model)
- FocalNetConfig configuration class: FocalNetForImageClassification (FocalNet model)
- HGNetV2Config configuration class: HGNetV2ForImageClassification (HGNet-V2 model)
- HieraConfig configuration class: HieraForImageClassification (Hiera model)
- IJepaConfig configuration class: IJepaForImageClassification (I-JEPA model)
- ImageGPTConfig configuration class: ImageGPTForImageClassification (ImageGPT model)
- LevitConfig configuration class: LevitForImageClassification or LevitForImageClassificationWithTeacher (LeViT model)
- MobileNetV1Config configuration class: MobileNetV1ForImageClassification (MobileNetV1 model)
- MobileNetV2Config configuration class: MobileNetV2ForImageClassification (MobileNetV2 model)
- MobileViTConfig configuration class: MobileViTForImageClassification (MobileViT model)
- MobileViTV2Config configuration class: MobileViTV2ForImageClassification (MobileViTV2 model)
- NatConfig configuration class: NatForImageClassification (NAT model)
- PerceiverConfig configuration class: PerceiverForImageClassificationLearned or PerceiverForImageClassificationFourier or PerceiverForImageClassificationConvProcessing (Perceiver model)
- PoolFormerConfig configuration class: PoolFormerForImageClassification (PoolFormer model)
- PvtConfig configuration class: PvtForImageClassification (PVT model)
- PvtV2Config configuration class: PvtV2ForImageClassification (PVTv2 model)
- RegNetConfig configuration class: RegNetForImageClassification (RegNet model)
- ResNetConfig configuration class: ResNetForImageClassification (ResNet model)
- SegformerConfig configuration class: SegformerForImageClassification (SegFormer model)
- ShieldGemma2Config configuration class: ShieldGemma2ForImageClassification (Shieldgemma2 model)
- Siglip2Config configuration class: Siglip2ForImageClassification (SigLIP2 model)
- SiglipConfig configuration class: SiglipForImageClassification (SigLIP model)
- SwiftFormerConfig configuration class: SwiftFormerForImageClassification (SwiftFormer model)
- SwinConfig configuration class: SwinForImageClassification (Swin Transformer model)
- Swinv2Config configuration class: Swinv2ForImageClassification (Swin Transformer V2 model)
- TextNetConfig configuration class: TextNetForImageClassification (TextNet model)
- TimmWrapperConfig configuration class: TimmWrapperForImageClassification (TimmWrapperModel model)
- VanConfig configuration class: VanForImageClassification (VAN model)
- ViTConfig configuration class: ViTForImageClassification (ViT model)
- ViTHybridConfig configuration class: ViTHybridForImageClassification (ViT Hybrid model)
- ViTMSNConfig configuration class: ViTMSNForImageClassification (ViTMSN model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples

>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> # The checkpoint must map to an image classification model; a ViT checkpoint is used here.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = AutoModelForImageClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

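Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.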
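
The proxies example above mixes a protocol key ('http') with an endpoint key ('http://hostname'). The sketch below shows one plausible way such a dictionary is resolved, modeled on the precedence used by the requests library, where the more specific endpoint key wins. Whether every download path in Transformers follows exactly this precedence is an assumption, and select_proxy is a hypothetical helper.

```python
from urllib.parse import urlparse


def select_proxy(url, proxies):
    """Return the proxy for `url`, preferring an endpoint key over a protocol key."""
    parts = urlparse(url)
    for key in (f"{parts.scheme}://{parts.hostname}", parts.scheme):
        if key in proxies:
            return proxies[key]
    return None  # no proxy configured for this URL


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}
print(select_proxy("http://hostname/model.bin", proxies))    # foo.bar:4012
print(select_proxy("http://other.host/model.bin", proxies))  # foo.bar:3128
print(select_proxy("https://hostname/model.bin", proxies))   # None
```

Note that the dictionary in the documentation has no 'https' entry, so under this reading HTTPS requests would bypass both proxies.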

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.

beit — BeitForImageClassification (BEiT model)
bit — BitForImageClassification (BiT model)
clip — CLIPForImageClassification (CLIP model)
convnext — ConvNextForImageClassification (ConvNeXT model)
convnextv2 — ConvNextV2ForImageClassification (ConvNeXTV2 model)
cvt — CvtForImageClassification (CvT model)
data2vec-vision — Data2VecVisionForImageClassification (Data2VecVision model)
deit — DeiTForImageClassification or DeiTForImageClassificationWithTeacher (DeiT model)
dinat — DinatForImageClassification (DiNAT model)
dinov2 — Dinov2ForImageClassification (DINOv2 model)
dinov2_with_registers — Dinov2WithRegistersForImageClassification (DINOv2 with Registers model)
donut-swin — DonutSwinForImageClassification (DonutSwin model)
efficientformer — EfficientFormerForImageClassification or EfficientFormerForImageClassificationWithTeacher (EfficientFormer model)
efficientnet — EfficientNetForImageClassification (EfficientNet model)
focalnet — FocalNetForImageClassification (FocalNet model)
hgnet_v2 — HGNetV2ForImageClassification (HGNet-V2 model)
hiera — HieraForImageClassification (Hiera model)
ijepa — IJepaForImageClassification (I-JEPA model)
imagegpt — ImageGPTForImageClassification (ImageGPT model)
levit — LevitForImageClassification or LevitForImageClassificationWithTeacher (LeViT model)
mobilenet_v1 — MobileNetV1ForImageClassification (MobileNetV1 model)
mobilenet_v2 — MobileNetV2ForImageClassification (MobileNetV2 model)
mobilevit — MobileViTForImageClassification (MobileViT model)
mobilevitv2 — MobileViTV2ForImageClassification (MobileViTV2 model)
nat — NatForImageClassification (NAT model)
perceiver — PerceiverForImageClassificationLearned or PerceiverForImageClassificationFourier or PerceiverForImageClassificationConvProcessing (Perceiver model)
poolformer — PoolFormerForImageClassification (PoolFormer model)
pvt — PvtForImageClassification (PVT model)
pvt_v2 — PvtV2ForImageClassification (PVTv2 model)
regnet — RegNetForImageClassification (RegNet model)
resnet — ResNetForImageClassification (ResNet model)
segformer — SegformerForImageClassification (SegFormer model)
shieldgemma2 — ShieldGemma2ForImageClassification (Shieldgemma2 model)
siglip — SiglipForImageClassification (SigLIP model)
siglip2 — Siglip2ForImageClassification (SigLIP2 model)
swiftformer — SwiftFormerForImageClassification (SwiftFormer model)
swin — SwinForImageClassification (Swin Transformer model)
swinv2 — Swinv2ForImageClassification (Swin Transformer V2 model)
textnet — TextNetForImageClassification (TextNet model)
timm_wrapper — TimmWrapperForImageClassification (TimmWrapperModel model)
van — VanForImageClassification (VAN model)
vit — ViTForImageClassification (ViT model)
vit_hybrid — ViTHybridForImageClassification (ViT Hybrid model)
vit_msn — ViTMSNForImageClassification (ViTMSN model)
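
Several entries above list more than one class for a single model type (deit, efficientformer, levit, perceiver). The list itself does not say how one of them is chosen. The sketch below assumes the choice is driven by the architectures field stored in the checkpoint's configuration, with a fallback to the first listed class; treat both the rule and the helper name pick_model_class as assumptions rather than documented behavior.

```python
def pick_model_class(candidates, architectures):
    """Pick a class name from `candidates` using the config's `architectures` hint."""
    if isinstance(candidates, str):
        return candidates  # only one class is registered for this model type
    for name in architectures or []:
        if name in candidates:
            return name  # the checkpoint names one of the candidates explicitly
    return candidates[0]  # otherwise fall back to the first listed class


deit = ("DeiTForImageClassification", "DeiTForImageClassificationWithTeacher")
print(pick_model_class(deit, ["DeiTForImageClassificationWithTeacher"]))
print(pick_model_class(deit, None))
print(pick_model_class("ViTForImageClassification", ["ViTForImageClassification"]))
```

Under these assumptions the three calls select DeiTForImageClassificationWithTeacher, DeiTForImageClassification and ViTForImageClassification respectively.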

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = AutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForImageClassification

class transformers.TFAutoModelForImageClassification

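( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- ConvNextConfig configuration class: TFConvNextForImageClassification (ConvNeXT model)
- ConvNextV2Config configuration class: TFConvNextV2ForImageClassification (ConvNeXTV2 model)
- CvtConfig configuration class: TFCvtForImageClassification (CvT model)
- Data2VecVisionConfig configuration class: TFData2VecVisionForImageClassification (Data2VecVision model)
- DeiTConfig configuration class: TFDeiTForImageClassification or TFDeiTForImageClassificationWithTeacher (DeiT model)
- EfficientFormerConfig configuration class: TFEfficientFormerForImageClassification or TFEfficientFormerForImageClassificationWithTeacher (EfficientFormer model)
- MobileViTConfig configuration class: TFMobileViTForImageClassification (MobileViT model)
- RegNetConfig configuration class: TFRegNetForImageClassification (RegNet model)
- ResNetConfig configuration class: TFResNetForImageClassification (ResNet model)
- SegformerConfig configuration class: TFSegformerForImageClassification (SegFormer model)
- SwiftFormerConfig configuration class: TFSwiftFormerForImageClassification (SwiftFormer model)
- SwinConfig configuration class: TFSwinForImageClassification (Swin Transformer model)
- ViTConfig configuration class: TFViTForImageClassification (ViT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.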

Examples

>>> from transformers import AutoConfig, TFAutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> # The checkpoint must map to an image classification model; a ViT checkpoint is used here.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = TFAutoModelForImageClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

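Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.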
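
The output_loading_info parameter above mentions missing keys and unexpected keys. Conceptually these are the two set differences between the parameter names the model expects and the names found in the checkpoint. The sketch below shows that comparison on plain lists; the function name and the parameter names are made up for illustration, and the real dictionary may contain further entries.

```python
def loading_info(model_keys, checkpoint_keys):
    """Compare expected parameter names with the names found in a checkpoint."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        # expected by the model but absent from the checkpoint (left freshly initialized)
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # present in the checkpoint but not used by the model (ignored)
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
    }


info = loading_info(
    model_keys=["vit.embeddings.weight", "classifier.weight", "classifier.bias"],
    checkpoint_keys=["vit.embeddings.weight", "pooler.dense.weight"],
)
print(info["missing_keys"])     # ['classifier.bias', 'classifier.weight']
print(info["unexpected_keys"])  # ['pooler.dense.weight']
```

A non-empty missing_keys list for the classification head is typical when a backbone checkpoint is loaded into a model with a new head, and it signals that the head still needs training.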

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.

convnext — TFConvNextForImageClassification (ConvNeXT model)
convnextv2 — TFConvNextV2ForImageClassification (ConvNeXTV2 model)
cvt — TFCvtForImageClassification (CvT model)
data2vec-vision — TFData2VecVisionForImageClassification (Data2VecVision model)
deit — TFDeiTForImageClassification or TFDeiTForImageClassificationWithTeacher (DeiT model)
efficientformer — TFEfficientFormerForImageClassification or TFEfficientFormerForImageClassificationWithTeacher (EfficientFormer model)
mobilevit — TFMobileViTForImageClassification (MobileViT model)
regnet — TFRegNetForImageClassification (RegNet model)
resnet — TFResNetForImageClassification (ResNet model)
segformer — TFSegformerForImageClassification (SegFormer model)
swiftformer — TFSwiftFormerForImageClassification (SwiftFormer model)
swin — TFSwinForImageClassification (Swin Transformer model)
vit — TFViTForImageClassification (ViT model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = TFAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForImageClassification

class transformers.FlaxAutoModelForImageClassification

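( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an image classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: FlaxBeitForImageClassification (BEiT model)
- Dinov2Config configuration class: FlaxDinov2ForImageClassification (DINOv2 model)
- RegNetConfig configuration class: FlaxRegNetForImageClassification (RegNet model)
- ResNetConfig configuration class: FlaxResNetForImageClassification (ResNet model)
- ViTConfig configuration class: FlaxViTForImageClassification (ViT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with an image classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.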

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> # The checkpoint must map to an image classification model; a ViT checkpoint is used here.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224")
>>> model = FlaxAutoModelForImageClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

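Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.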
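
The force_download and local_files_only parameters above interact with the cache. The decision can be sketched as a small function; resolve_file is a hypothetical helper, and giving local_files_only priority when both flags are set is an assumption of this sketch, since the parameter descriptions do not cover that combination.

```python
def resolve_file(cached, force_download=False, local_files_only=False):
    """Decide where a file comes from: returns 'cache' or 'download', or raises."""
    if local_files_only:
        # Never touch the network; the file has to be in the cache already.
        if cached:
            return "cache"
        raise FileNotFoundError("file is not cached and downloads are disabled")
    if force_download or not cached:
        return "download"  # fetch again, replacing any cached copy
    return "cache"


print(resolve_file(cached=True))                         # cache
print(resolve_file(cached=True, force_download=True))    # download
print(resolve_file(cached=False))                        # download
print(resolve_file(cached=True, local_files_only=True))  # cache
```

With local_files_only=True and nothing cached, the sketch raises instead of downloading, which mirrors the "do not try downloading the model" wording.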

Instantiate one of the model classes of the library (with an image classification head) from a pretrained model.

beit — FlaxBeitForImageClassification (BEiT model)
dinov2 — FlaxDinov2ForImageClassification (DINOv2 model)
regnet — FlaxRegNetForImageClassification (RegNet model)
resnet — FlaxResNetForImageClassification (ResNet model)
vit — FlaxViTForImageClassification (ViT model)

Examples

>>> from transformers import AutoConfig, FlaxAutoModelForImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForImageClassification.from_pretrained("google/vit-base-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForVideoClassification

class transformers.AutoModelForVideoClassification

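( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a video classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- TimesformerConfig configuration class: TimesformerForVideoClassification (TimeSformer model)
- VJEPA2Config configuration class: VJEPA2ForVideoClassification (VJEPA2Model model)
- VideoMAEConfig configuration class: VideoMAEForVideoClassification (VideoMAE model)
- VivitConfig configuration class: VivitForVideoClassification (ViViT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a video classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.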

Examples

>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> # Download configuration from huggingface.co and cache.
>>> # The checkpoint must map to a video classification model; a VideoMAE checkpoint is used here.
>>> config = AutoConfig.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")
>>> model = AutoModelForVideoClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

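Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow for custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id; since we use a git-based system for storing models and other artifacts on huggingface.co, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a video classification head) from a pretrained model.

timesformer — TimesformerForVideoClassification (TimeSformer model)
videomae — VideoMAEForVideoClassification (VideoMAE model)
vivit — VivitForVideoClassification (ViViT model)
vjepa2 — VJEPA2ForVideoClassification (VJEPA2Model model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().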

Examples

>>> from transformers import AutoConfig, AutoModelForVideoClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics")

>>> # Update configuration during loading
>>> model = AutoModelForVideoClassification.from_pretrained("MCG-NJU/videomae-base-finetuned-kinetics", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForVideoClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForKeypointDetection

class transformers.AutoModelForKeypointDetection

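( *args **kwargs )

AutoModelForMaskedImageModeling

class transformers.AutoModelForMaskedImageModeling

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DeiTConfig configuration class: DeiTForMaskedImageModeling (DeiT model)
- FocalNetConfig configuration class: FocalNetForMaskedImageModeling (FocalNet model)
- SwinConfig configuration class: SwinForMaskedImageModeling (Swin Transformer model)
- Swinv2Config configuration class: Swinv2ForMaskedImageModeling (Swin Transformer V2 model)
- ViTConfig configuration class: ViTForMaskedImageModeling (ViT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.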

Examples

>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> # Download configuration from huggingface.co and cache.
>>> # The checkpoint must map to a masked image modeling model; a ViT checkpoint is used here.
>>> config = AutoConfig.from_pretrained("google/vit-base-patch16-224-in21k")
>>> model = AutoModelForMaskedImageModeling.from_config(config)

from_pretrained

( *model_args **kwargs )

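Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.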
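The two kwargs cases above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: ToyConfig, split_kwargs and load are hypothetical names, and the real method also handles downloading and weight loading.

```python
class ToyConfig:
    """Hypothetical configuration with two attributes."""

    def __init__(self):
        self.output_attentions = False
        self.hidden_size = 32


def split_kwargs(config, kwargs):
    """Override matching config attributes; return the keys that did not match."""
    unused = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            setattr(config, key, value)
        else:
            unused[key] = value
    return unused


def load(config=None, **kwargs):
    """Return (config, kwargs that would reach the model's __init__)."""
    if config is not None:
        # Case 1: a config was supplied, so kwargs go straight to the model.
        return config, kwargs
    # Case 2: the config is loaded automatically (simulated here), then updated.
    config = ToyConfig()
    model_kwargs = split_kwargs(config, kwargs)
    return config, model_kwargs


auto_config, auto_model_kwargs = load(output_attentions=True, my_flag=1)
given_config, given_model_kwargs = load(config=ToyConfig(), output_attentions=True)
print(auto_config.output_attentions, auto_model_kwargs)
print(given_config.output_attentions, given_model_kwargs)
```

In the second call the supplied configuration is left untouched, which matches the statement that all relevant configuration updates are assumed to have been made already.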

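Instantiates one of the model classes of the library (with a masked image modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object:

deit — DeiTForMaskedImageModeling (DeiT model)
focalnet — FocalNetForMaskedImageModeling (FocalNet model)
swin — SwinForMaskedImageModeling (Swin Transformer model)
swinv2 — Swinv2ForMaskedImageModeling (Swin Transformer V2 model)
vit — ViTForMaskedImageModeling (ViT model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().

Example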

>>> from transformers import AutoConfig, AutoModelForMaskedImageModeling

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForMaskedImageModeling.from_pretrained("google/vit-base-patch16-224-in21k")

>>> # Update configuration during loading
>>> model = AutoModelForMaskedImageModeling.from_pretrained("google/vit-base-patch16-224-in21k", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForMaskedImageModeling.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForMaskedImageModeling

class transformers.TFAutoModelForMaskedImageModeling

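( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a masked image modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).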

from_config

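( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DeiTConfig configuration class: TFDeiTForMaskedImageModeling (DeiT model)
- SwinConfig configuration class: TFSwinForMaskedImageModeling (Swin Transformer model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a masked image modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.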

Example

>>> from transformers import AutoConfig, TFAutoModelForMaskedImageModeling

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/deit-base-distilled-patch16-224")
>>> model = TFAutoModelForMaskedImageModeling.from_config(config)

from_pretrained

( *model_args **kwargs )

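Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Whether to load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.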
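The three accepted forms of pretrained_model_name_or_path can be told apart roughly as in the sketch below. It is a simplified illustration, not the library's resolution logic: describe_source is a hypothetical helper, and raising ValueError for a checkpoint file passed without from_pt=True and a config is a choice made for this sketch.

```python
import os
import tempfile


def describe_source(name_or_path, from_pt=False, config=None):
    """Classify the argument as a directory, a PyTorch checkpoint file, or a model id."""
    path = os.fspath(name_or_path)
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        # A single checkpoint file needs both from_pt=True and an explicit config.
        if not from_pt or config is None:
            raise ValueError("Pass from_pt=True and a config for a checkpoint file.")
        return "pytorch_checkpoint"
    # Anything that is not a local path is treated as a Hub model id.
    return "model_id"


with tempfile.TemporaryDirectory() as tmp:
    checkpoint = os.path.join(tmp, "pytorch_model.bin")
    open(checkpoint, "wb").close()  # empty placeholder file
    dir_kind = describe_source(tmp)
    file_kind = describe_source(checkpoint, from_pt=True, config=object())

id_kind = describe_source("some-org/some-model")  # placeholder id, nothing is downloaded
print(dir_kind, file_kind, id_kind)
```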

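Instantiates one of the model classes of the library (with a masked image modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object:

deit — TFDeiTForMaskedImageModeling (DeiT model)
swin — TFSwinForMaskedImageModeling (Swin Transformer model)

Example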

>>> from transformers import AutoConfig, TFAutoModelForMaskedImageModeling

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained("facebook/deit-base-distilled-patch16-224")

>>> # Update configuration during loading
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained("facebook/deit-base-distilled-patch16-224", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForMaskedImageModeling.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForObjectDetection

class transformers.AutoModelForObjectDetection

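( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an object detection head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).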

from_config

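( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- ConditionalDetrConfig configuration class: ConditionalDetrForObjectDetection (Conditional DETR model)
- DFineConfig configuration class: DFineForObjectDetection (D-FINE model)
- DabDetrConfig configuration class: DabDetrForObjectDetection (DAB-DETR model)
- DeformableDetrConfig configuration class: DeformableDetrForObjectDetection (Deformable DETR model)
- DetaConfig configuration class: DetaForObjectDetection (DETA model)
- DetrConfig configuration class: DetrForObjectDetection (DETR model)
- RTDetrConfig configuration class: RTDetrForObjectDetection (RT-DETR model)
- RTDetrV2Config configuration class: RTDetrV2ForObjectDetection (RT-DETRv2 model)
- TableTransformerConfig configuration class: TableTransformerForObjectDetection (Table Transformer model)
- YolosConfig configuration class: YolosForObjectDetection (YOLOS model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an object detection head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.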

Example

>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/detr-resnet-50")
>>> model = AutoModelForObjectDetection.from_config(config)

from_pretrained

( *model_args **kwargs )

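Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.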

Instantiates one of the model classes of the library (with an object detection head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object:

conditional_detr — ConditionalDetrForObjectDetection (Conditional DETR model)
d_fine — DFineForObjectDetection (D-FINE model)
dab-detr — DabDetrForObjectDetection (DAB-DETR model)
deformable_detr — DeformableDetrForObjectDetection (Deformable DETR model)
deta — DetaForObjectDetection (DETA model)
detr — DetrForObjectDetection (DETR model)
rt_detr — RTDetrForObjectDetection (RT-DETR model)
rt_detr_v2 — RTDetrV2ForObjectDetection (RT-DETRv2 model)
table-transformer — TableTransformerForObjectDetection (Table Transformer model)
yolos — YolosForObjectDetection (YOLOS model)
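The list above pairs a model type string with a model class. A minimal sketch of such a lookup follows, assuming the type is read from a "model_type" key in a config.json-style document. The helper resolve_class_name is hypothetical, and the table holds class names as plain strings so the sketch runs without the library.

```python
import json

# Model type -> class name, copied from the list above.
OBJECT_DETECTION_CLASSES = {
    "conditional_detr": "ConditionalDetrForObjectDetection",
    "d_fine": "DFineForObjectDetection",
    "dab-detr": "DabDetrForObjectDetection",
    "deformable_detr": "DeformableDetrForObjectDetection",
    "deta": "DetaForObjectDetection",
    "detr": "DetrForObjectDetection",
    "rt_detr": "RTDetrForObjectDetection",
    "rt_detr_v2": "RTDetrV2ForObjectDetection",
    "table-transformer": "TableTransformerForObjectDetection",
    "yolos": "YolosForObjectDetection",
}


def resolve_class_name(config_json):
    """Return the class name registered for the model type found in config_json."""
    model_type = json.loads(config_json)["model_type"]
    if model_type not in OBJECT_DETECTION_CLASSES:
        raise ValueError(f"No object detection model registered for {model_type!r}.")
    return OBJECT_DETECTION_CLASSES[model_type]


detr_class = resolve_class_name('{"model_type": "detr"}')
print(detr_class)
```

Note that some keys use hyphens (dab-detr, table-transformer) and others underscores, so the string has to match the listed spelling exactly.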

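The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().

Example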

>>> from transformers import AutoConfig, AutoModelForObjectDetection

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForObjectDetection.from_pretrained("facebook/detr-resnet-50")

>>> # Update configuration during loading
>>> model = AutoModelForObjectDetection.from_pretrained("facebook/detr-resnet-50", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForObjectDetection.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForImageSegmentation

class transformers.AutoModelForImageSegmentation

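( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an image segmentation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).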

from_config

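( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DetrConfig configuration class: DetrForSegmentation (DETR model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an image segmentation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.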

Example

>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/detr-resnet-50-panoptic")
>>> model = AutoModelForImageSegmentation.from_config(config)

from_pretrained

( *model_args **kwargs )

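Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.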
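To make the output_loading_info description more concrete, the sketch below computes missing and unexpected keys by comparing the parameter names a model expects with those found in a checkpoint. The function compare_keys, the dictionary key names and the parameter names are assumptions made for this illustration; the dictionary actually returned by the library may contain other entries.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Report parameters absent from the checkpoint and entries the model does not use."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        # Expected by the model but not present in the checkpoint.
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # Present in the checkpoint but unknown to the model.
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
        # Nothing can fail in this sketch, so the list stays empty.
        "error_msgs": [],
    }


info = compare_keys(
    model_keys=["encoder.weight", "encoder.bias", "mask_head.weight"],
    checkpoint_keys=["encoder.weight", "encoder.bias", "pooler.weight"],
)
print(info)
```

Missing keys typically point to a task head that the checkpoint was not trained with, so they are worth checking before relying on the model's predictions.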

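Instantiates one of the model classes of the library (with an image segmentation head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object:

detr — DetrForSegmentation (DETR model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().

Example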

>>> from transformers import AutoConfig, AutoModelForImageSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")

>>> # Update configuration during loading
>>> model = AutoModelForImageSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForImageToImage

class transformers.AutoModelForImageToImage

( *args **kwargs )

AutoModelForSemanticSegmentation

class transformers.AutoModelForSemanticSegmentation

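( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).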

from_config

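( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BeitConfig configuration class: BeitForSemanticSegmentation (BEiT model)
- DPTConfig configuration class: DPTForSemanticSegmentation (DPT model)
- Data2VecVisionConfig configuration class: Data2VecVisionForSemanticSegmentation (Data2VecVision model)
- MobileNetV2Config configuration class: MobileNetV2ForSemanticSegmentation (MobileNetV2 model)
- MobileViTConfig configuration class: MobileViTForSemanticSegmentation (MobileViT model)
- MobileViTV2Config configuration class: MobileViTV2ForSemanticSegmentation (MobileViTV2 model)
- SegformerConfig configuration class: SegformerForSemanticSegmentation (SegFormer model)
- UperNetConfig configuration class: UperNetForSemanticSegmentation (UPerNet model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.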

Example

>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> model = AutoModelForSemanticSegmentation.from_config(config)

from_pretrained

( *model_args **kwargs )

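Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.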
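The interplay of force_download and local_files_only with the cache can be summarized as a small decision function. This is a sketch of the documented intent only: plan_fetch is a hypothetical helper, and rejecting the combination of both flags is an assumption of this sketch, not confirmed library behavior.

```python
def plan_fetch(is_cached, force_download=False, local_files_only=False):
    """Decide whether a file would come from the cache or from a download."""
    if force_download and local_files_only:
        # Assumption of this sketch: the two flags contradict each other.
        raise ValueError("force_download and local_files_only cannot both be True.")
    if local_files_only:
        if not is_cached:
            raise FileNotFoundError("File is not cached and downloads are disabled.")
        return "use_cache"
    if force_download or not is_cached:
        return "download"
    return "use_cache"


print(plan_fetch(is_cached=True))                         # cached copy is reused
print(plan_fetch(is_cached=True, force_download=True))    # cached copy is overridden
print(plan_fetch(is_cached=False))                        # nothing cached yet
print(plan_fetch(is_cached=True, local_files_only=True))  # offline use of the cache
```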

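Instantiates one of the model classes of the library (with a semantic segmentation head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object:

beit — BeitForSemanticSegmentation (BEiT model)
data2vec-vision — Data2VecVisionForSemanticSegmentation (Data2VecVision model)
dpt — DPTForSemanticSegmentation (DPT model)
mobilenet_v2 — MobileNetV2ForSemanticSegmentation (MobileNetV2 model)
mobilevit — MobileViTForSemanticSegmentation (MobileViT model)
mobilevitv2 — MobileViTV2ForSemanticSegmentation (MobileViTV2 model)
segformer — SegformerForSemanticSegmentation (SegFormer model)
upernet — UperNetForSemanticSegmentation (UPerNet model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().

Example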

>>> from transformers import AutoConfig, AutoModelForSemanticSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

>>> # Update configuration during loading
>>> model = AutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSemanticSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSemanticSegmentation

class transformers.TFAutoModelForSemanticSegmentation

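( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a semantic segmentation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).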

from_config

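( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Data2VecVisionConfig configuration class: TFData2VecVisionForSemanticSegmentation (Data2VecVision model)
- MobileViTConfig configuration class: TFMobileViTForSemanticSegmentation (MobileViT model)
- SegformerConfig configuration class: TFSegformerForSemanticSegmentation (SegFormer model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a semantic segmentation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.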

Example

>>> from transformers import AutoConfig, TFAutoModelForSemanticSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
>>> model = TFAutoModelForSemanticSegmentation.from_config(config)

from_pretrained

( *model_args **kwargs )

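Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Whether to load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.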
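The proxies dictionary mixes protocol keys ('http') with endpoint keys ('http://hostname'). The sketch below shows one plausible way such a dictionary could be consulted, with the more specific endpoint key taking precedence. select_proxy is a hypothetical helper written for this illustration; the precedence order is an assumption, not a statement about the HTTP client the library uses.

```python
from urllib.parse import urlsplit


def select_proxy(url, proxies):
    """Pick the proxy for url, preferring 'scheme://host' keys over bare 'scheme' keys."""
    parts = urlsplit(url)
    for key in (f"{parts.scheme}://{parts.hostname}", parts.scheme):
        if key in proxies:
            return proxies[key]
    return None  # no proxy configured for this URL


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}

endpoint_proxy = select_proxy("http://hostname/model.bin", proxies)
protocol_proxy = select_proxy("http://example.org/model.bin", proxies)
no_proxy = select_proxy("https://example.org/model.bin", proxies)
print(endpoint_proxy, protocol_proxy, no_proxy)
```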

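Instantiates one of the model classes of the library (with a semantic segmentation head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object:

data2vec-vision — TFData2VecVisionForSemanticSegmentation (Data2VecVision model)
mobilevit — TFMobileViTForSemanticSegmentation (MobileViT model)
segformer — TFSegformerForSemanticSegmentation (SegFormer model)

Example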

>>> from transformers import AutoConfig, TFAutoModelForSemanticSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

>>> # Update configuration during loading
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSemanticSegmentation.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForInstanceSegmentation

class transformers.AutoModelForInstanceSegmentation

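( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an instance segmentation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).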

from_config

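( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- MaskFormerConfig configuration class: MaskFormerForInstanceSegmentation (MaskFormer model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an instance segmentation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.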

Example

>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/maskformer-swin-base-ade")
>>> model = AutoModelForInstanceSegmentation.from_config(config)

from_pretrained

( *model_args **kwargs )

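Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a TensorFlow index checkpoint file (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether to only look at local files (e.g., not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.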

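Instantiates one of the model classes of the library (with an instance segmentation head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object:

maskformer — MaskFormerForInstanceSegmentation (MaskFormer model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().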
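Why evaluation mode matters can be seen with a toy dropout layer written in plain Python. ToyDropout is a hypothetical stand-in that mimics the train()/eval() switch of a framework module; it is not the framework's own class.

```python
import random


class ToyDropout:
    """Hypothetical dropout layer with a training flag (plain Python lists only)."""

    def __init__(self, p=0.5):
        self.p = p
        self.training = True

    def train(self, mode=True):
        self.training = mode
        return self

    def eval(self):
        return self.train(False)

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: inputs pass through unchanged.
            return list(values)
        # Training mode: drop each value with probability p and rescale the rest.
        scale = 1.0 / (1.0 - self.p)
        return [0.0 if random.random() < self.p else v * scale for v in values]


inputs = [1.0, 2.0, 3.0]
layer = ToyDropout(p=0.5).eval()  # the state a freshly loaded model starts in
eval_out = layer(inputs)

layer.train()                     # switch back before fine-tuning
train_out = layer(inputs)
print(eval_out, train_out)
```

Leaving a model in evaluation mode during fine-tuning silently disables this kind of regularization, which is why the call to model.train() is recommended before training.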

Example

>>> from transformers import AutoConfig, AutoModelForInstanceSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-ade")

>>> # Update configuration during loading
>>> model = AutoModelForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-base-ade", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForInstanceSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForUniversalSegmentation

class transformers.AutoModelForUniversalSegmentation

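( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a universal image segmentation head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (it throws an error).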

from_config

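( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DetrConfig configuration class: DetrForSegmentation (DETR model)
- Mask2FormerConfig configuration class: Mask2FormerForUniversalSegmentation (Mask2Former model)
- MaskFormerConfig configuration class: MaskFormerForInstanceSegmentation (MaskFormer model)
- OneFormerConfig configuration class: OneFormerForUniversalSegmentation (OneFormer model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available; otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a universal image segmentation head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.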
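The same configuration class can resolve to different model classes depending on which Auto class is used. The sketch below collects the DETR and MaskFormer entries documented for several Auto classes on this page into plain dictionaries. The tables hold names as strings, and pick is a hypothetical helper, so this only illustrates the lookup and does not import the library.

```python
# Auto class name -> {configuration class name -> model class name}.
TASK_MAPPINGS = {
    "AutoModelForObjectDetection": {
        "DetrConfig": "DetrForObjectDetection",
    },
    "AutoModelForImageSegmentation": {
        "DetrConfig": "DetrForSegmentation",
    },
    "AutoModelForInstanceSegmentation": {
        "MaskFormerConfig": "MaskFormerForInstanceSegmentation",
    },
    "AutoModelForUniversalSegmentation": {
        "DetrConfig": "DetrForSegmentation",
        "Mask2FormerConfig": "Mask2FormerForUniversalSegmentation",
        "MaskFormerConfig": "MaskFormerForInstanceSegmentation",
        "OneFormerConfig": "OneFormerForUniversalSegmentation",
    },
}


def pick(auto_class, config_class):
    """Return the model class name that auto_class maps config_class to."""
    mapping = TASK_MAPPINGS[auto_class]
    if config_class not in mapping:
        raise ValueError(f"{config_class} is not supported by {auto_class}.")
    return mapping[config_class]


detection_head = pick("AutoModelForObjectDetection", "DetrConfig")
segmentation_head = pick("AutoModelForUniversalSegmentation", "DetrConfig")
print(detection_head, segmentation_head)
```

Choosing the Auto class therefore selects the task head, while the configuration selects the architecture.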

Example

>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/mask2former-swin-small-coco-instance")
>>> model = AutoModelForUniversalSegmentation.from_config(config)

from_pretrained

( *model_args **kwargs )

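Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.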
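
The routing of kwargs when no configuration is supplied can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's code: `route_kwargs` and the attribute name `custom_flag` are hypothetical, and the configuration is modeled as a plain dictionary.

```python
# Simplified sketch of the kwargs routing described above.
# `route_kwargs` and the attribute names are hypothetical, for illustration only.

def route_kwargs(config_attributes, kwargs):
    """Split kwargs into config overrides and leftover model __init__ kwargs."""
    updated_config = dict(config_attributes)
    model_kwargs = {}
    for key, value in kwargs.items():
        if key in updated_config:
            updated_config[key] = value  # key matches a config attribute: override it
        else:
            model_kwargs[key] = value  # unknown key: forwarded to the model's __init__
    return updated_config, model_kwargs


config, model_kwargs = route_kwargs(
    {"output_attentions": False, "hidden_size": 768},
    {"output_attentions": True, "custom_flag": 1},
)
print(config)        # {'output_attentions': True, 'hidden_size': 768}
print(model_kwargs)  # {'custom_flag': 1}
```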

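Instantiate one of the model classes of the library (with a universal image segmentation head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:

detr — DetrForSegmentation (DETR model)
mask2former — Mask2FormerForUniversalSegmentation (Mask2Former model)
maskformer — MaskFormerForInstanceSegmentation (MaskFormer model)
oneformer — OneFormerForUniversalSegmentation (OneFormer model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples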

>>> from transformers import AutoConfig, AutoModelForUniversalSegmentation

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-tiny-coco-instance")

>>> # Update configuration during loading
>>> model = AutoModelForUniversalSegmentation.from_pretrained("facebook/mask2former-swin-tiny-coco-instance", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForUniversalSegmentation.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForZeroShotImageClassification

class transformers.AutoModelForZeroShotImageClassification

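( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).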
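
The "factory only" behavior can be illustrated with a small stand-in class whose constructor refuses to run and whose instances come from class methods instead. This is a sketch, not the library's code: `ToyAutoModel` is hypothetical, and both the exception type (EnvironmentError) and the message wording are assumptions of the example.

```python
# Hypothetical stand-in showing the "factory only" pattern: __init__ refuses to run
# and objects are obtained through class methods instead.

class ToyAutoModel:
    def __init__(self, *args, **kwargs):
        name = self.__class__.__name__
        raise EnvironmentError(
            f"{name} is designed to be instantiated using "
            f"`{name}.from_pretrained(...)` or `{name}.from_config(config)`."
        )

    @classmethod
    def from_config(cls, config):
        # A real auto class would return an instance of a concrete model class here;
        # the sketch returns a plain dict so it stays self-contained.
        return {"built_from": config}


try:
    ToyAutoModel()
except EnvironmentError as err:
    message = str(err)

built = ToyAutoModel.from_config({"model_type": "clip"})
print(message)
print(built)  # {'built_from': {'model_type': 'clip'}}
```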

from_config

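( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AlignConfig configuration class: AlignModel (ALIGN model)
- AltCLIPConfig configuration class: AltCLIPModel (AltCLIP model)
- Blip2Config configuration class: Blip2ForImageTextRetrieval (BLIP-2 model)
- BlipConfig configuration class: BlipModel (BLIP model)
- CLIPConfig configuration class: CLIPModel (CLIP model)
- CLIPSegConfig configuration class: CLIPSegModel (CLIPSeg model)
- ChineseCLIPConfig configuration class: ChineseCLIPModel (Chinese-CLIP model)
- Siglip2Config configuration class: Siglip2Model (SigLIP2 model)
- SiglipConfig configuration class: SiglipModel (SigLIP model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example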

>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/clip-vit-base-patch32")
>>> model = AutoModelForZeroShotImageClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

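Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:

align — AlignModel (ALIGN model)
altclip — AltCLIPModel (AltCLIP model)
blip — BlipModel (BLIP model)
blip-2 — Blip2ForImageTextRetrieval (BLIP-2 model)
chinese_clip — ChineseCLIPModel (Chinese-CLIP model)
clip — CLIPModel (CLIP model)
clipseg — CLIPSegModel (CLIPSeg model)
siglip — SiglipModel (SigLIP model)
siglip2 — Siglip2Model (SigLIP2 model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples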

>>> from transformers import AutoConfig, AutoModelForZeroShotImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")

>>> # Update configuration during loading
>>> model = AutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForZeroShotImageClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )
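
The output_loading_info parameter returns a dictionary that reports, among other things, missing and unexpected keys. The set comparison behind those two notions can be sketched without the library. `compare_keys` and the weight names are hypothetical, and the real dictionary may carry additional entries.

```python
# Hypothetical sketch of the comparison behind "missing" and "unexpected" keys.

def compare_keys(model_keys, checkpoint_keys):
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        # expected by the model but absent from the checkpoint
        "missing_keys": sorted(model_keys - checkpoint_keys),
        # present in the checkpoint but not used by the model
        "unexpected_keys": sorted(checkpoint_keys - model_keys),
    }


info = compare_keys(
    model_keys=["encoder.weight", "encoder.bias", "head.weight"],
    checkpoint_keys=["encoder.weight", "encoder.bias", "pooler.weight"],
)
print(info)  # {'missing_keys': ['head.weight'], 'unexpected_keys': ['pooler.weight']}
```

Missing keys typically correspond to weights that end up newly initialized (for example, a task head that the checkpoint never contained), so they are worth checking before relying on a loaded model.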

TFAutoModelForZeroShotImageClassification

class transformers.TFAutoModelForZeroShotImageClassification

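( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot image classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BlipConfig configuration class: TFBlipModel (BLIP model)
- CLIPConfig configuration class: TFCLIPModel (CLIP model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a zero-shot image classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example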

>>> from transformers import AutoConfig, TFAutoModelForZeroShotImageClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/clip-vit-base-patch32")
>>> model = TFAutoModelForZeroShotImageClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

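Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a zero-shot image classification head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:

blip — TFBlipModel (BLIP model)
clip — TFCLIPModel (CLIP model)

Examples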

>>> from transformers import AutoConfig, TFAutoModelForZeroShotImageClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32")

>>> # Update configuration during loading
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained("openai/clip-vit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForZeroShotImageClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForZeroShotObjectDetection

class transformers.AutoModelForZeroShotObjectDetection

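( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a zero-shot object detection head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- GroundingDinoConfig configuration class: GroundingDinoForObjectDetection (Grounding DINO model)
- OmDetTurboConfig configuration class: OmDetTurboForObjectDetection (OmDet-Turbo model)
- OwlViTConfig configuration class: OwlViTForObjectDetection (OWL-ViT model)
- Owlv2Config configuration class: Owlv2ForObjectDetection (OWLv2 model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a zero-shot object detection head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example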

>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/owlvit-base-patch32")
>>> model = AutoModelForZeroShotObjectDetection.from_config(config)
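
The default rule stated for attn_implementation (SDPA when available and torch>=2.1.1, otherwise "eager") can be written as a small decision function. This is a hedged sketch only: `default_attn_implementation` is a hypothetical helper, not a Transformers function, and it assumes plain numeric version strings such as "2.3.0" or "2.3.0+cu121".

```python
# Hypothetical helper mirroring the documented default: prefer "sdpa" when it is
# available and torch >= 2.1.1, otherwise fall back to "eager".

ALLOWED = ("eager", "sdpa", "flash_attention_2")


def default_attn_implementation(torch_version, sdpa_available, requested=None):
    if requested is not None:
        # An explicit request wins, as long as it names a known implementation.
        if requested not in ALLOWED:
            raise ValueError(f"attn_implementation must be one of {ALLOWED}")
        return requested
    # Compare numeric (major, minor, patch) tuples; a local suffix like "+cu121" is dropped.
    numeric = tuple(int(part) for part in torch_version.split("+")[0].split(".")[:3])
    if sdpa_available and numeric >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.3.0+cu121", sdpa_available=True))  # sdpa
print(default_attn_implementation("2.0.1", sdpa_available=True))        # eager
```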

from_pretrained

( *model_args **kwargs )

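Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with a zero-shot object detection head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:

grounding-dino — GroundingDinoForObjectDetection (Grounding DINO model)
omdet-turbo — OmDetTurboForObjectDetection (OmDet-Turbo model)
owlv2 — Owlv2ForObjectDetection (OWLv2 model)
owlvit — OwlViTForObjectDetection (OWL-ViT model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples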

>>> from transformers import AutoConfig, AutoModelForZeroShotObjectDetection

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32")

>>> # Update configuration during loading
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained("google/owlvit-base-patch32", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForZeroShotObjectDetection.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

Audio

The following auto classes are available for the following audio tasks.

AutoModelForAudioClassification

class transformers.AutoModelForAudioClassification

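( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- ASTConfig configuration class: ASTForAudioClassification (Audio Spectrogram Transformer model)
- Data2VecAudioConfig configuration class: Data2VecAudioForSequenceClassification (Data2VecAudio model)
- HubertConfig configuration class: HubertForSequenceClassification (Hubert model)
- SEWConfig configuration class: SEWForSequenceClassification (SEW model)
- SEWDConfig configuration class: SEWDForSequenceClassification (SEW-D model)
- UniSpeechConfig configuration class: UniSpeechForSequenceClassification (UniSpeech model)
- UniSpeechSatConfig configuration class: UniSpeechSatForSequenceClassification (UniSpeechSat model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForSequenceClassification (Wav2Vec2-BERT model)
- Wav2Vec2Config configuration class: Wav2Vec2ForSequenceClassification (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForSequenceClassification (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMForSequenceClassification (WavLM model)
- WhisperConfig configuration class: WhisperForAudioClassification (Whisper model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with an audio classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example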

>>> from transformers import AutoConfig, AutoModelForAudioClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("superb/wav2vec2-base-superb-ks")
>>> model = AutoModelForAudioClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

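Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:

audio-spectrogram-transformer — ASTForAudioClassification (Audio Spectrogram Transformer model)
data2vec-audio — Data2VecAudioForSequenceClassification (Data2VecAudio model)
hubert — HubertForSequenceClassification (Hubert model)
sew — SEWForSequenceClassification (SEW model)
sew-d — SEWDForSequenceClassification (SEW-D model)
unispeech — UniSpeechForSequenceClassification (UniSpeech model)
unispeech-sat — UniSpeechSatForSequenceClassification (UniSpeechSat model)
wav2vec2 — Wav2Vec2ForSequenceClassification (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2BertForSequenceClassification (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2ConformerForSequenceClassification (Wav2Vec2-Conformer model)
wavlm — WavLMForSequenceClassification (WavLM model)
whisper — WhisperForAudioClassification (Whisper model)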
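
The list above is effectively a table from model_type to a model class. As a rough, library-free sketch, the lookup could read the model_type field of a config.json and resolve it against a subset of that table. `AUDIO_CLASSIFICATION_MAPPING` and `resolve_class_name` are hypothetical names, and class names are kept as strings so the example runs without Transformers installed.

```python
import json

# Subset of the mapping listed above, keyed by model_type.
AUDIO_CLASSIFICATION_MAPPING = {
    "hubert": "HubertForSequenceClassification",
    "wav2vec2": "Wav2Vec2ForSequenceClassification",
    "whisper": "WhisperForAudioClassification",
}


def resolve_class_name(config_json):
    """Return the class name registered for the model_type found in a config.json string."""
    model_type = json.loads(config_json).get("model_type")
    if model_type not in AUDIO_CLASSIFICATION_MAPPING:
        raise ValueError(f"model_type {model_type!r} has no audio classification class")
    return AUDIO_CLASSIFICATION_MAPPING[model_type]


print(resolve_class_name('{"model_type": "wav2vec2"}'))  # Wav2Vec2ForSequenceClassification
```

A checkpoint whose model_type is absent from the table (for example, a text-only model) cannot be loaded through this auto class.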

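The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples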

>>> from transformers import AutoConfig, AutoModelForAudioClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioClassification.from_pretrained("superb/wav2vec2-base-superb-ks")

>>> # Update configuration during loading
>>> model = AutoModelForAudioClassification.from_pretrained("superb/wav2vec2-base-superb-ks", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForAudioClassification

class transformers.TFAutoModelForAudioClassification

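( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Wav2Vec2Config configuration class: TFWav2Vec2ForSequenceClassification (Wav2Vec2 model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with an audio classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example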

>>> from transformers import AutoConfig, TFAutoModelForAudioClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/wav2vec2-base-960h")
>>> model = TFAutoModelForAudioClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

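Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with an audio classification head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:

wav2vec2 — TFWav2Vec2ForSequenceClassification (Wav2Vec2 model)

Examples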

>>> from transformers import AutoConfig, TFAutoModelForAudioClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForAudioClassification.from_pretrained("facebook/wav2vec2-base-960h")

>>> # Update configuration during loading
>>> model = TFAutoModelForAudioClassification.from_pretrained("facebook/wav2vec2-base-960h", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForAudioClassification.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForAudioFrameClassification

class transformers.AutoModelForAudioFrameClassification

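( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio frame (token) classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForAudioFrameClassification (Data2VecAudio model)
- UniSpeechSatConfig configuration class: UniSpeechSatForAudioFrameClassification (UniSpeechSat model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForAudioFrameClassification (Wav2Vec2-BERT model)
- Wav2Vec2Config configuration class: Wav2Vec2ForAudioFrameClassification (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForAudioFrameClassification (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMForAudioFrameClassification (WavLM model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with an audio frame (token) classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example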

>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("anton-l/wav2vec2-base-superb-sd")
>>> model = AutoModelForAudioFrameClassification.from_config(config)

from_pretrained

( *model_args **kwargs )

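Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and whose code you have read, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since huggingface.co uses a git-based system for storing models and other artifacts, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and initiate the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be directly passed to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will be first passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override said attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.

Instantiate one of the model classes of the library (with an audio frame (token) classification head) from a pretrained model. The model class to instantiate is selected based on the model_type property of the config object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or when it's missing, by falling back to using pattern matching on pretrained_model_name_or_path:

data2vec-audio — Data2VecAudioForAudioFrameClassification (Data2VecAudio model)
unispeech-sat — UniSpeechSatForAudioFrameClassification (UniSpeechSat model)
wav2vec2 — Wav2Vec2ForAudioFrameClassification (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2BertForAudioFrameClassification (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2ConformerForAudioFrameClassification (Wav2Vec2-Conformer model)
wavlm — WavLMForAudioFrameClassification (WavLM model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples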

>>> from transformers import AutoConfig, AutoModelForAudioFrameClassification

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioFrameClassification.from_pretrained("anton-l/wav2vec2-base-superb-sd")

>>> # Update configuration during loading
>>> model = AutoModelForAudioFrameClassification.from_pretrained("anton-l/wav2vec2-base-superb-sd", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioFrameClassification.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForCTC

class transformers.AutoModelForCTC

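( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a connectionist temporal classification head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).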

from_config

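( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForCTC (Data2VecAudio model)
- HubertConfig configuration class: HubertForCTC (Hubert model)
- MCTCTConfig configuration class: MCTCTForCTC (M-CTC-T model)
- SEWConfig configuration class: SEWForCTC (SEW model)
- SEWDConfig configuration class: SEWDForCTC (SEW-D model)
- UniSpeechConfig configuration class: UniSpeechForCTC (UniSpeech model)
- UniSpeechSatConfig configuration class: UniSpeechSatForCTC (UniSpeechSat model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForCTC (Wav2Vec2-BERT model)
- Wav2Vec2Config configuration class: Wav2Vec2ForCTC (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForCTC (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMForCTC (WavLM model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a connectionist temporal classification head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Example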

>>> from transformers import AutoConfig, AutoModelForCTC

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("facebook/wav2vec2-base-960h")
>>> model = AutoModelForCTC.from_config(config)

from_pretrained

( *model_args **kwargs )

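Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.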
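
The second kwargs case above (no config supplied) amounts to splitting the keyword arguments into configuration overrides and leftover model arguments. The sketch below shows that split in plain Python under stated assumptions: `ToyConfig`, its two attributes, and the `split_kwargs` helper are hypothetical names used only for illustration, and the real library may handle edge cases differently.

```python
class ToyConfig:
    """Hypothetical configuration object with two attributes."""

    def __init__(self, output_attentions=False, hidden_size=768):
        self.output_attentions = output_attentions
        self.hidden_size = hidden_size


def split_kwargs(config, kwargs):
    """Apply kwargs that match config attributes; return the rest for the model."""
    model_kwargs = {}
    for key, value in kwargs.items():
        if hasattr(config, key):
            # The key names a configuration attribute: override it.
            setattr(config, key, value)
        else:
            # Anything else is forwarded to the model's __init__.
            model_kwargs[key] = value
    return config, model_kwargs


config, model_kwargs = split_kwargs(
    ToyConfig(), {"output_attentions": True, "custom_flag": 1}
)
print(config.output_attentions)  # True
print(model_kwargs)  # {'custom_flag': 1}
```

When a config object is passed explicitly, this split does not happen and all keyword arguments go straight to the model constructor, so any configuration change has to be made on the config object beforehand.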

Instantiate one of the model classes of the library (with a connectionist temporal classification head) from a pretrained model.

data2vec-audio — Data2VecAudioForCTC (Data2VecAudio model)
hubert — HubertForCTC (Hubert model)
mctct — MCTCTForCTC (M-CTC-T model)
sew — SEWForCTC (SEW model)
sew-d — SEWDForCTC (SEW-D model)
unispeech — UniSpeechForCTC (UniSpeech model)
unispeech-sat — UniSpeechSatForCTC (UniSpeechSat model)
wav2vec2 — Wav2Vec2ForCTC (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2BertForCTC (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2ConformerForCTC (Wav2Vec2-Conformer model)
wavlm — WavLMForCTC (WavLM model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples

>>> from transformers import AutoConfig, AutoModelForCTC

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h")

>>> # Update configuration during loading
>>> model = AutoModelForCTC.from_pretrained("facebook/wav2vec2-base-960h", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForCTC.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForSpeechSeq2Seq

class transformers.AutoModelForSpeechSeq2Seq

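( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).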
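
The factory pattern described above, a class that refuses direct construction and is only used through class methods, can be sketched as follows. `ToyAutoModel` is a hypothetical stand-in, and the choice of `OSError` as the exception type is an assumption of this sketch rather than something stated on this page.

```python
class ToyAutoModel:
    """Hypothetical factory class that only exposes class methods."""

    def __init__(self, *args, **kwargs):
        # Direct construction is rejected; callers must use a class method.
        raise OSError(
            f"{type(self).__name__} is meant to be created with "
            f"`{type(self).__name__}.from_config(config)`, not with `__init__()`."
        )

    @classmethod
    def from_config(cls, config):
        # A real factory would look up a concrete model class here.
        # The sketch simply returns a plain dict describing the result.
        return {"built_by": cls.__name__, "config": config}


try:
    ToyAutoModel()
except OSError as err:
    print("direct instantiation failed:", err)

model = ToyAutoModel.from_config({"model_type": "toy"})
print(model["built_by"])  # ToyAutoModel
```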

from_config

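( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DiaConfig configuration class: DiaForConditionalGeneration (Dia model)
- GraniteSpeechConfig configuration class: GraniteSpeechForConditionalGeneration (GraniteSpeech model)
- KyutaiSpeechToTextConfig configuration class: KyutaiSpeechToTextForConditionalGeneration (KyutaiSpeechToText model)
- MoonshineConfig configuration class: MoonshineForConditionalGeneration (Moonshine model)
- Pop2PianoConfig configuration class: Pop2PianoForConditionalGeneration (Pop2Piano model)
- SeamlessM4TConfig configuration class: SeamlessM4TForSpeechToText (SeamlessM4T model)
- SeamlessM4Tv2Config configuration class: SeamlessM4Tv2ForSpeechToText (SeamlessM4Tv2 model)
- Speech2TextConfig configuration class: Speech2TextForConditionalGeneration (Speech2Text model)
- SpeechEncoderDecoderConfig configuration class: SpeechEncoderDecoderModel (Speech Encoder decoder model)
- SpeechT5Config configuration class: SpeechT5ForSpeechToText (SpeechT5 model)
- WhisperConfig configuration class: WhisperForConditionalGeneration (Whisper model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples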

>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = AutoModelForSpeechSeq2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

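Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.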

Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

dia — DiaForConditionalGeneration (Dia model)
granite_speech — GraniteSpeechForConditionalGeneration (GraniteSpeech model)
kyutai_speech_to_text — KyutaiSpeechToTextForConditionalGeneration (KyutaiSpeechToText model)
moonshine — MoonshineForConditionalGeneration (Moonshine model)
pop2piano — Pop2PianoForConditionalGeneration (Pop2Piano model)
seamless_m4t — SeamlessM4TForSpeechToText (SeamlessM4T model)
seamless_m4t_v2 — SeamlessM4Tv2ForSpeechToText (SeamlessM4Tv2 model)
speech-encoder-decoder — SpeechEncoderDecoderModel (Speech Encoder decoder model)
speech_to_text — Speech2TextForConditionalGeneration (Speech2Text model)
speecht5 — SpeechT5ForSpeechToText (SpeechT5 model)
whisper — WhisperForConditionalGeneration (Whisper model)
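
The list above maps a model_type string to a model class. As documented for the auto classes on this page, the class is chosen from the model_type of the configuration, with pattern matching on pretrained_model_name_or_path as a fallback when model_type is missing. A minimal sketch of that lookup, assuming a hypothetical table `SPEECH_SEQ2SEQ_CLASSES` (a subset of the list, storing class names as strings) and a hypothetical helper `resolve_class_name`:

```python
# Hypothetical table: model_type string -> model class name (subset of the list).
SPEECH_SEQ2SEQ_CLASSES = {
    "whisper": "WhisperForConditionalGeneration",
    "speech_to_text": "Speech2TextForConditionalGeneration",
    "speecht5": "SpeechT5ForSpeechToText",
}


def resolve_class_name(model_type, name_or_path):
    """Use model_type when it is known, otherwise search the name or path."""
    if model_type in SPEECH_SEQ2SEQ_CLASSES:
        return SPEECH_SEQ2SEQ_CLASSES[model_type]
    # Fallback: pattern matching on the name or path. Longer keys are tried
    # first so that a more specific key wins over a shorter one it contains.
    for key in sorted(SPEECH_SEQ2SEQ_CLASSES, key=len, reverse=True):
        if key in str(name_or_path):
            return SPEECH_SEQ2SEQ_CLASSES[key]
    raise ValueError(f"Cannot infer a model class for {name_or_path!r}")


print(resolve_class_name("whisper", "openai/whisper-tiny"))
print(resolve_class_name(None, "./checkpoints/my-whisper-run"))
```

Both calls resolve to the Whisper class name: the first through model_type, the second through the substring match on the path.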

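The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples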

>>> from transformers import AutoConfig, AutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForSpeechSeq2Seq.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForSpeechSeq2Seq

class transformers.TFAutoModelForSpeechSeq2Seq

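( *args **kwargs )

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Speech2TextConfig configuration class: TFSpeech2TextForConditionalGeneration (Speech2Text model)
- WhisperConfig configuration class: TFWhisperForConditionalGeneration (Whisper model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples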

>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = TFAutoModelForSpeechSeq2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

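Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Whether to load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.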
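
The three accepted forms of pretrained_model_name_or_path listed above (model id, saved directory, single PyTorch weights file) can be told apart with a few filesystem checks. The sketch below is only an illustration of that distinction: `describe_source` is a hypothetical helper, it ignores the URL case, and it is not how the library resolves its argument internally.

```python
import os
import tempfile


def describe_source(name_or_path, from_pt=False):
    """Classify the argument into one of the three documented forms."""
    path = os.fspath(name_or_path)
    if os.path.isdir(path):
        return "directory saved with save_pretrained()"
    if os.path.isfile(path):
        # A single weights file only makes sense together with from_pt=True
        # (and, per the parameter description, an explicit config object).
        if not from_pt:
            raise ValueError("A single PyTorch weights file requires from_pt=True")
        return "PyTorch state_dict file"
    # Anything that is not a local path is treated as a Hub model id here.
    return "model id on huggingface.co"


with tempfile.TemporaryDirectory() as tmp:
    weights = os.path.join(tmp, "pytorch_model.bin")
    open(weights, "wb").close()  # empty placeholder file for the demo
    print(describe_source(tmp))
    print(describe_source(weights, from_pt=True))

print(describe_source("some-org/some-model"))
```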

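Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

speech_to_text — TFSpeech2TextForConditionalGeneration (Speech2Text model)
whisper — TFWhisperForConditionalGeneration (Whisper model)

Examples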

>>> from transformers import AutoConfig, TFAutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForSpeechSeq2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForSpeechSeq2Seq

class transformers.FlaxAutoModelForSpeechSeq2Seq

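( *args **kwargs )

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- SpeechEncoderDecoderConfig configuration class: FlaxSpeechEncoderDecoderModel (Speech Encoder decoder model)
- WhisperConfig configuration class: FlaxWhisperForConditionalGeneration (Whisper model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples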

>>> from transformers import AutoConfig, FlaxAutoModelForSpeechSeq2Seq

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("openai/whisper-tiny")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

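Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Whether to load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.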
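
The dictionary returned with output_loading_info=True is described above as containing missing keys, unexpected keys and error messages. The sketch below shows how the first two could be derived by comparing parameter names. It is a plain-Python illustration: the `loading_info` helper, the dictionary key names, and the example parameter names are assumptions chosen to mirror that description.

```python
def loading_info(expected_keys, checkpoint_keys):
    """Compare parameter names the model expects with those in a checkpoint."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(expected - found),
        # Present in the checkpoint but unused by the model.
        "unexpected_keys": sorted(found - expected),
        # Left empty in this sketch.
        "error_msgs": [],
    }


info = loading_info(
    expected_keys=["encoder.weight", "encoder.bias", "lm_head.weight"],
    checkpoint_keys=["encoder.weight", "encoder.bias", "pooler.weight"],
)
print(info["missing_keys"])  # ['lm_head.weight']
print(info["unexpected_keys"])  # ['pooler.weight']
```

A non-empty list of missing keys usually means some weights were newly initialized instead of loaded, which is worth checking before using a model for inference.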

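Instantiate one of the model classes of the library (with a sequence-to-sequence speech-to-text modeling head) from a pretrained model.

speech-encoder-decoder — FlaxSpeechEncoderDecoderModel (Speech Encoder decoder model)
whisper — FlaxWhisperForConditionalGeneration (Whisper model)

Examples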

>>> from transformers import AutoConfig, FlaxAutoModelForSpeechSeq2Seq

>>> # Download model and configuration from huggingface.co and cache.
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForSpeechSeq2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForAudioXVector

class transformers.AutoModelForAudioXVector

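( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio retrieval via x-vector head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Data2VecAudioConfig configuration class: Data2VecAudioForXVector (Data2VecAudio model)
- UniSpeechSatConfig configuration class: UniSpeechSatForXVector (UniSpeechSat model)
- Wav2Vec2BertConfig configuration class: Wav2Vec2BertForXVector (Wav2Vec2-BERT model)
- Wav2Vec2Config configuration class: Wav2Vec2ForXVector (Wav2Vec2 model)
- Wav2Vec2ConformerConfig configuration class: Wav2Vec2ConformerForXVector (Wav2Vec2-Conformer model)
- WavLMConfig configuration class: WavLMForXVector (WavLM model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with an audio retrieval via x-vector head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples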

>>> from transformers import AutoConfig, AutoModelForAudioXVector

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("microsoft/wavlm-base-plus-sv")
>>> model = AutoModelForAudioXVector.from_config(config)

from_pretrained

( *model_args **kwargs )

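Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.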

Instantiate one of the model classes of the library (with an audio retrieval via x-vector head) from a pretrained model.

data2vec-audio — Data2VecAudioForXVector (Data2VecAudio model)
unispeech-sat — UniSpeechSatForXVector (UniSpeechSat model)
wav2vec2 — Wav2Vec2ForXVector (Wav2Vec2 model)
wav2vec2-bert — Wav2Vec2BertForXVector (Wav2Vec2-BERT model)
wav2vec2-conformer — Wav2Vec2ConformerForXVector (Wav2Vec2-Conformer model)
wavlm — WavLMForXVector (WavLM model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
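
The difference between evaluation mode and training mode mentioned above can be illustrated with a toy dropout layer. This is a pure-Python stand-in written for this note, not torch.nn.Dropout: the `ToyDropout` class, its seed argument, and its list-based inputs are all assumptions of the sketch. It only mirrors the idea that eval() turns dropout into an identity operation while train() re-enables it.

```python
import random


class ToyDropout:
    """Pure-Python stand-in for a dropout layer (not torch.nn.Dropout)."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def train(self, mode=True):
        self.training = mode
        return self

    def eval(self):
        return self.train(False)

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: values pass through unchanged.
            return list(values)
        # Training mode: drop each value with probability p, rescale the rest.
        scale = 1.0 / (1.0 - self.p)
        return [0.0 if self._rng.random() < self.p else v * scale for v in values]


layer = ToyDropout(p=0.5).eval()
print(layer([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0]

layer.train()
print(layer.training)  # True
```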

Examples

>>> from transformers import AutoConfig, AutoModelForAudioXVector

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioXVector.from_pretrained("microsoft/wavlm-base-plus-sv")

>>> # Update configuration during loading
>>> model = AutoModelForAudioXVector.from_pretrained("microsoft/wavlm-base-plus-sv", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioXVector.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForTextToSpectrogram

class transformers.AutoModelForTextToSpectrogram

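( *args **kwargs )

AutoModelForTextToWaveform

class transformers.AutoModelForTextToWaveform

( *args **kwargs )

AutoModelForAudioTokenization

class transformers.AutoModelForAudioTokenization

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an audio tokenization through codebooks head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- DacConfig configuration class: DacModel (DAC model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.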
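
The default rule stated for attn_implementation above (SDPA for torch>=2.1.1 when available, otherwise "eager") can be written as a small decision function. This is a sketch of the stated rule only: `default_attn_implementation` is a hypothetical helper, the availability flag is passed in by the caller rather than detected, and the version parsing assumes plain X.Y.Z release strings with an optional local suffix such as "+cu121".

```python
def default_attn_implementation(torch_version, sdpa_available):
    """Return "sdpa" for torch>=2.1.1 when SDPA is available, else "eager"."""
    # Keep only the numeric release part, e.g. "2.1.1+cu121" -> (2, 1, 1).
    release = torch_version.split("+")[0]
    parts = tuple(int(p) for p in release.split(".")[:3])
    if sdpa_available and parts >= (2, 1, 1):
        return "sdpa"
    return "eager"


print(default_attn_implementation("2.1.1+cu121", sdpa_available=True))  # sdpa
print(default_attn_implementation("2.0.1", sdpa_available=True))  # eager
print(default_attn_implementation("2.3.0", sdpa_available=False))  # eager
```

Passing attn_implementation explicitly bypasses this default, which is how "flash_attention_2" is selected.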

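Instantiates one of the model classes of the library (with an audio tokenization through codebooks head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples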

>>> from transformers import AutoConfig, AutoModelForAudioTokenization

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("descript/dac_16khz")
>>> model = AutoModelForAudioTokenization.from_config(config)

from_pretrained

( *model_args **kwargs )

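Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.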

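Instantiate one of the model classes of the library (with an audio tokenization through codebooks head) from a pretrained model.

dac — DacModel (DAC model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples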

>>> from transformers import AutoConfig, AutoModelForAudioTokenization

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForAudioTokenization.from_pretrained("descript/dac_16khz")

>>> # Update configuration during loading
>>> model = AutoModelForAudioTokenization.from_pretrained("descript/dac_16khz", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForAudioTokenization.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

Multimodal

The following auto classes are available for the following multimodal tasks.

AutoModelForTableQuestionAnswering

class transformers.AutoModelForTableQuestionAnswering

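( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- TapasConfig configuration class: TapasForQuestionAnswering (TAPAS model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a table question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples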

>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = AutoModelForTableQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

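Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the *model id* of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a *directory* containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a *TensorFlow index checkpoint file* (e.g., ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model using the provided conversion scripts and loading the PyTorch model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Whether to load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.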
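
The force_download and local_files_only flags described above both decide whether a file comes from the cache or from the network. One plausible reading of how they combine is sketched below. `plan_fetch` is a hypothetical helper, and the precedence it gives to local_files_only when both flags are set is an assumption of this sketch, not documented behavior.

```python
def plan_fetch(cached, force_download=False, local_files_only=False):
    """Return the action a loader would take for one file under these flags."""
    if local_files_only:
        # Never touch the network: the file must already be cached.
        if not cached:
            raise FileNotFoundError("File is not cached and local_files_only=True")
        return "use cache"
    if force_download or not cached:
        # Either a refresh was requested or nothing is cached yet.
        return "download"
    return "use cache"


print(plan_fetch(cached=True))  # use cache
print(plan_fetch(cached=True, force_download=True))  # download
print(plan_fetch(cached=False))  # download
```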

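Instantiate one of the model classes of the library (with a table question answering head) from a pretrained model.

tapas — TapasForQuestionAnswering (TAPAS model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().

Examples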

>>> from transformers import AutoConfig, AutoModelForTableQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

>>> # Update configuration during loading
>>> model = AutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/tapas_tf_model_config.json")
>>> model = AutoModelForTableQuestionAnswering.from_pretrained(
...     "./tf_model/tapas_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForTableQuestionAnswering

class transformers.TFAutoModelForTableQuestionAnswering

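( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a table question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- TapasConfig configuration class: TFTapasForQuestionAnswering (TAPAS model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. The default is otherwise the manual "eager" implementation.

Instantiates one of the model classes of the library (with a table question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples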

>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/tapas-base-finetuned-wtq")
>>> model = TFAutoModelForTableQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

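Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be either:
- A string, the model id of a pretrained model hosted inside a model repo on huggingface.co.
- A path to a directory containing model weights saved using save_pretrained(), e.g., ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g., ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model using the provided conversion scripts and loading the TensorFlow model afterwards.
model_args (additional positional arguments, optional) — Will be passed along to the underlying model __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. Configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the model id string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named config.json is found in the directory.
cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Whether to load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (e.g., not try downloading the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g., output_attentions=True). Behaves differently depending on whether a config is provided or automatically loaded:
- If a configuration is provided with config, **kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.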

Instantiates one of the model classes of the library (with a table question answering head) from a pretrained model.

tapas — TFTapasForQuestionAnswering (TAPAS model)

Examples

>>> from transformers import AutoConfig, TFAutoModelForTableQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

>>> # Update configuration during loading
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/tapas_pt_model_config.json")
>>> model = TFAutoModelForTableQuestionAnswering.from_pretrained(
...     "./pt_model/tapas_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForDocumentQuestionAnswering

class transformers.AutoModelForDocumentQuestionAnswering

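( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- LayoutLMConfig configuration class: LayoutLMForQuestionAnswering (LayoutLM model)
- LayoutLMv2Config configuration class: LayoutLMv2ForQuestionAnswering (LayoutLMv2 model)
- LayoutLMv3Config configuration class: LayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a document question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.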
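
Selecting a model class from the configuration class amounts to a lookup keyed on the type of the configuration object. The sketch below shows that idea with hypothetical stub classes (every name ending in `Stub`, plus `CONFIG_TO_MODEL`, is invented here). It is not the library's implementation, and as the note above says, nothing in it loads weights.

```python
class LayoutLMConfigStub:
    """Hypothetical stand-in for a configuration class."""


class LayoutLMv3ConfigStub:
    """Hypothetical stand-in for a second configuration class."""


class LayoutLMQAStub:
    def __init__(self, config):
        # Only the configuration is stored; no pretrained weights are read.
        self.config = config


class LayoutLMv3QAStub(LayoutLMQAStub):
    pass


# Registry keyed by configuration class, mirroring the list above in miniature.
CONFIG_TO_MODEL = {
    LayoutLMConfigStub: LayoutLMQAStub,
    LayoutLMv3ConfigStub: LayoutLMv3QAStub,
}


def from_config(config):
    model_class = CONFIG_TO_MODEL.get(type(config))
    if model_class is None:
        raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")
    return model_class(config)


model = from_config(LayoutLMv3ConfigStub())
print(type(model).__name__)  # LayoutLMv3QAStub
```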

Examples

>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = AutoModelForDocumentQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

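Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted in a model repository on huggingface.co.
- A path to a directory containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a tensorflow index checkpoint file (e.g. ./tf_model/model.ckpt.index). In this case, from_tf should be set to True and a configuration object should be provided through the config argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and then loading the PyTorch model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the one loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, check first whether using save_pretrained() and from_pretrained() would be simpler.
cache_dir (str or os.PathLike, optional) — Path to the directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding any cached versions.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to look only at local files (e.g. without trying to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. Set this option to True only for repositories you trust and whose code you have read, because it executes code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). The behavior depends on whether a config is provided or loaded automatically:
- If a configuration is provided through config, **kwargs is passed directly to the underlying model's __init__ method (all relevant updates to the configuration are assumed to have been made already).
- If no configuration is provided, kwargs is first passed to the configuration class's initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's __init__ function.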
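
The "missing keys" and "unexpected keys" reported under output_loading_info can be read as two set differences between the parameter names a model expects and the names found in a checkpoint. The sketch below illustrates that reading only; `diff_keys` and the key names are made up for the example and are not part of the library.

```python
def diff_keys(expected_keys, checkpoint_keys):
    """Compare the names a model expects with the names a checkpoint provides."""
    expected, loaded = set(expected_keys), set(checkpoint_keys)
    return {
        # Expected by the model but absent from the checkpoint.
        "missing_keys": sorted(expected - loaded),
        # Present in the checkpoint but not used by the model.
        "unexpected_keys": sorted(loaded - expected),
    }


info = diff_keys(
    expected_keys=["encoder.weight", "encoder.bias", "qa_outputs.weight"],
    checkpoint_keys=["encoder.weight", "encoder.bias", "pooler.weight"],
)
print(info)
# {'missing_keys': ['qa_outputs.weight'], 'unexpected_keys': ['pooler.weight']}
```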

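Instantiates one of the model classes of the library (with a document question answering head) from a pretrained model.

layoutlm — LayoutLMForQuestionAnswering (LayoutLM model)
layoutlmv2 — LayoutLMv2ForQuestionAnswering (LayoutLMv2 model)
layoutlmv3 — LayoutLMv3ForQuestionAnswering (LayoutLMv3 model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, first set it back in training mode with model.train().

Examples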

>>> from transformers import AutoConfig, AutoModelForDocumentQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")

>>> # Update configuration during loading
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/layoutlm_tf_model_config.json")
>>> model = AutoModelForDocumentQuestionAnswering.from_pretrained(
...     "./tf_model/layoutlm_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForDocumentQuestionAnswering

class transformers.TFAutoModelForDocumentQuestionAnswering

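( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a document question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- LayoutLMConfig configuration class: TFLayoutLMForQuestionAnswering (LayoutLM model)
- LayoutLMv3Config configuration class: TFLayoutLMv3ForQuestionAnswering (LayoutLMv3 model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a document question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples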

>>> from transformers import AutoConfig, TFAutoModelForDocumentQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

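Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the model id of a pretrained model hosted in a model repository on huggingface.co.
- A path to a directory containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a PyTorch state_dict save file (e.g. ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided through the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and then loading the TensorFlow model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the model id string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration file named config.json is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to the directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding any cached versions.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to look only at local files (e.g. without trying to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. Set this option to True only for repositories you trust and whose code you have read, because it executes code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). The behavior depends on whether a `config` is provided or loaded automatically:
- If a configuration is provided through `config`, `**kwargs` is passed directly to the underlying model's `__init__` method (all relevant updates to the configuration are assumed to have been made already).
- If no configuration is provided, `kwargs` is first passed to the configuration class's initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's `__init__` function.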
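
The three accepted forms of pretrained_model_name_or_path, and the condition under which a configuration can be found automatically in a local directory, can be told apart with ordinary filesystem checks. This is a minimal sketch under that assumption; `describe_source` is a hypothetical helper, not a Transformers function, and the library's own resolution logic is more involved.

```python
import os
import tempfile


def describe_source(name_or_path):
    """Classify the argument along the lines of the three bullet cases above."""
    path = os.fspath(name_or_path)
    if os.path.isdir(path):
        # A directory qualifies for automatic config loading only if it
        # contains a file named config.json.
        has_config = os.path.isfile(os.path.join(path, "config.json"))
        return ("directory", has_config)
    if os.path.isfile(path):
        return ("checkpoint file", False)
    # Anything that is not a local path is treated as a Hub model id here.
    return ("model id", None)


with tempfile.TemporaryDirectory() as tmp:
    print(describe_source(tmp))  # ('directory', False)
    with open(os.path.join(tmp, "config.json"), "w", encoding="utf-8") as f:
        f.write("{}")
    print(describe_source(tmp))  # ('directory', True)
```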

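Instantiates one of the model classes of the library (with a document question answering head) from a pretrained model.

layoutlm — TFLayoutLMForQuestionAnswering (LayoutLM model)
layoutlmv3 — TFLayoutLMv3ForQuestionAnswering (LayoutLMv3 model)

Examples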

>>> from transformers import AutoConfig, TFAutoModelForDocumentQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3")

>>> # Update configuration during loading
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained("impira/layoutlm-document-qa", revision="52e01b3", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/layoutlm_pt_model_config.json")
>>> model = TFAutoModelForDocumentQuestionAnswering.from_pretrained(
...     "./pt_model/layoutlm_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForVisualQuestionAnswering

class transformers.AutoModelForVisualQuestionAnswering

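( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a visual question answering head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Blip2Config configuration class: Blip2ForConditionalGeneration (BLIP-2 model)
- BlipConfig configuration class: BlipForQuestionAnswering (BLIP model)
- ViltConfig configuration class: ViltForQuestionAnswering (ViLT model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a visual question answering head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples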

>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("dandelin/vilt-b32-finetuned-vqa")
>>> model = AutoModelForVisualQuestionAnswering.from_config(config)

from_pretrained

( *model_args **kwargs )

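Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the *model id* of a pretrained model hosted in a model repository on huggingface.co.
- A path to a *directory* containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g. ./tf_model/model.ckpt.index). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and then loading the PyTorch model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's `__init__()` method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the one loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, check first whether using save_pretrained() and from_pretrained() would be simpler.
cache_dir (str or os.PathLike, optional) — Path to the directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the `pretrained_model_name_or_path` argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding any cached versions.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to look only at local files (e.g. without trying to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, `revision` can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. Set this option to `True` only for repositories you trust and whose code you have read, because it executes code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, `code_revision` can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). The behavior depends on whether a `config` is provided or loaded automatically:
- If a configuration is provided through `config`, `**kwargs` is passed directly to the underlying model's `__init__` method (all relevant updates to the configuration are assumed to have been made already).
- If no configuration is provided, `kwargs` is first passed to the configuration class's initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's `__init__` function.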
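
The proxies dictionary mixes two kinds of keys: a bare protocol ('http') and a protocol plus host ('http://hostname'). One plausible way to read such a dictionary is to prefer the more specific endpoint entry and fall back to the protocol entry. The sketch below assumes that precedence; `pick_proxy` is a hypothetical helper, and the HTTP client actually used by the library decides the real behavior.

```python
from urllib.parse import urlparse


def pick_proxy(url, proxies):
    """Return the proxy for a URL, preferring an endpoint-specific entry."""
    parts = urlparse(url)
    # Assumed precedence: "scheme://host" first, then the bare scheme.
    for key in (f"{parts.scheme}://{parts.hostname}", parts.scheme):
        if key in proxies:
            return proxies[key]
    return None


proxies = {"http": "foo.bar:3128", "http://hostname": "foo.bar:4012"}
print(pick_proxy("http://hostname/model.bin", proxies))      # foo.bar:4012
print(pick_proxy("http://example.org/model.bin", proxies))   # foo.bar:3128
print(pick_proxy("https://example.org/model.bin", proxies))  # None
```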

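Instantiates one of the model classes of the library (with a visual question answering head) from a pretrained model.

blip — BlipForQuestionAnswering (BLIP model)
blip-2 — Blip2ForConditionalGeneration (BLIP-2 model)
vilt — ViltForQuestionAnswering (ViLT model)

The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, first set it back in training mode with model.train().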
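
To make the difference between evaluation and training mode concrete, here is a toy dropout layer written in plain Python. It is a stand-in for illustration only: `ToyDropout` is a hypothetical class, not a PyTorch or Transformers module, and it simply mimics the `eval()` / `train()` switch described above.

```python
import random


class ToyDropout:
    """Inverted dropout over a list of floats, with a training flag."""

    def __init__(self, p=0.5, seed=0):
        self.p = p
        self.training = True
        self._rng = random.Random(seed)

    def eval(self):
        self.training = False
        return self

    def train(self):
        self.training = True
        return self

    def __call__(self, values):
        if not self.training:
            # Evaluation mode: dropout is disabled, inputs pass through unchanged.
            return list(values)
        keep = 1.0 - self.p
        # Training mode: zero each value with probability p, rescale the rest.
        return [v / keep if self._rng.random() < keep else 0.0 for v in values]


layer = ToyDropout(p=0.5).eval()
print(layer([1.0, 2.0, 3.0]))  # [1.0, 2.0, 3.0]

layer.train()
noisy = layer([1.0, 1.0, 1.0, 1.0])  # each entry is either 0.0 or 2.0
```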

Examples

>>> from transformers import AutoConfig, AutoModelForVisualQuestionAnswering

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa")

>>> # Update configuration during loading
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained("dandelin/vilt-b32-finetuned-vqa", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/vilt_tf_model_config.json")
>>> model = AutoModelForVisualQuestionAnswering.from_pretrained(
...     "./tf_model/vilt_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

AutoModelForVision2Seq

class transformers.AutoModelForVision2Seq

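( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- Blip2Config configuration class: Blip2ForConditionalGeneration (BLIP-2 model)
- BlipConfig configuration class: BlipForConditionalGeneration (BLIP model)
- ChameleonConfig configuration class: ChameleonForConditionalGeneration (Chameleon model)
- GitConfig configuration class: GitForCausalLM (GIT model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2 model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3 model)
- InstructBlipConfig configuration class: InstructBlipForConditionalGeneration (InstructBLIP model)
- InstructBlipVideoConfig configuration class: InstructBlipVideoForConditionalGeneration (InstructBlipVideo model)
- Kosmos2Config configuration class: Kosmos2ForConditionalGeneration (KOSMOS-2 model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LLaVa model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LLaVA-NeXT model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3 model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (Mllama model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemma model)
- Pix2StructConfig configuration class: Pix2StructForConditionalGeneration (Pix2Struct model)
- Qwen2VLConfig configuration class: Qwen2VLForConditionalGeneration (Qwen2VL model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLForConditionalGeneration (Qwen2_5_VL model)
- VideoLlavaConfig configuration class: VideoLlavaForConditionalGeneration (VideoLlava model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlava model)
- VisionEncoderDecoderConfig configuration class: VisionEncoderDecoderModel (Vision Encoder decoder model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples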

>>> from transformers import AutoConfig, AutoModelForVision2Seq

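>>> # Download configuration from huggingface.co and cache.
>>> # "google-bert/bert-base-cased" is a placeholder: BertConfig is not among the configuration
>>> # classes listed above, so substitute a checkpoint whose configuration class is listed.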
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForVision2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

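Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the *model id* of a pretrained model hosted in a model repository on huggingface.co.
- A path to a *directory* containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a *tensorflow index checkpoint file* (e.g. ./tf_model/model.ckpt.index). In this case, `from_tf` should be set to `True` and a configuration object should be provided as the `config` argument. This loading path is slower than converting the TensorFlow checkpoint to a PyTorch model with the provided conversion scripts and then loading the PyTorch model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's `__init__()` method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as `pretrained_model_name_or_path` and a configuration JSON file named *config.json* is found in that directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of the one loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In that case, check first whether using save_pretrained() and from_pretrained() would be simpler.
cache_dir (str or os.PathLike, optional) — Path to the directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the `pretrained_model_name_or_path` argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding any cached versions.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to look only at local files (e.g. without trying to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, `revision` can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. Set this option to `True` only for repositories you trust and whose code you have read, because it executes code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, `code_revision` can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. output_attentions=True). The behavior depends on whether a `config` is provided or loaded automatically:
- If a configuration is provided through `config`, `**kwargs` is passed directly to the underlying model's `__init__` method (all relevant updates to the configuration are assumed to have been made already).
- If no configuration is provided, `kwargs` is first passed to the configuration class's initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's `__init__` function.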

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.

blip — BlipForConditionalGeneration (BLIP model)
blip-2 — Blip2ForConditionalGeneration (BLIP-2 model)
chameleon — ChameleonForConditionalGeneration (Chameleon model)
git — GitForCausalLM (GIT model)
idefics2 — Idefics2ForConditionalGeneration (Idefics2 model)
idefics3 — Idefics3ForConditionalGeneration (Idefics3 model)
instructblip — InstructBlipForConditionalGeneration (InstructBLIP model)
instructblipvideo — InstructBlipVideoForConditionalGeneration (InstructBlipVideo model)
kosmos-2 — Kosmos2ForConditionalGeneration (KOSMOS-2 model)
llava — LlavaForConditionalGeneration (LLaVa model)
llava_next — LlavaNextForConditionalGeneration (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
mistral3 — Mistral3ForConditionalGeneration (Mistral3 model)
mllama — MllamaForConditionalGeneration (Mllama model)
paligemma — PaliGemmaForConditionalGeneration (PaliGemma model)
pix2struct — Pix2StructForConditionalGeneration (Pix2Struct model)
qwen2_5_vl — Qwen2_5_VLForConditionalGeneration (Qwen2_5_VL model)
qwen2_vl — Qwen2VLForConditionalGeneration (Qwen2VL model)
video_llava — VideoLlavaForConditionalGeneration (VideoLlava model)
vipllava — VipLlavaForConditionalGeneration (VipLlava model)
vision-encoder-decoder — VisionEncoderDecoderModel (Vision Encoder decoder model)
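
The keys in the list above read as `model_type` values, which a saved configuration records in its config.json. The following sketch shows how such a lookup could work for a local directory. It assumes that interpretation, copies only three entries from the list, and uses a hypothetical helper `resolve_class_name`; it returns class names as strings and does not import or build any model.

```python
import json
import os
import tempfile

# Excerpt of the mapping above, keyed by model_type (names only, no imports).
MODEL_TYPE_TO_CLASS_NAME = {
    "blip": "BlipForConditionalGeneration",
    "git": "GitForCausalLM",
    "vision-encoder-decoder": "VisionEncoderDecoderModel",
}


def resolve_class_name(model_dir):
    """Read model_type from config.json and look up the matching class name."""
    with open(os.path.join(model_dir, "config.json"), encoding="utf-8") as f:
        model_type = json.load(f).get("model_type")
    if model_type not in MODEL_TYPE_TO_CLASS_NAME:
        raise ValueError(f"No vision-to-text class registered for model_type={model_type!r}")
    return MODEL_TYPE_TO_CLASS_NAME[model_type]


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "config.json"), "w", encoding="utf-8") as f:
        json.dump({"model_type": "git"}, f)
    print(resolve_class_name(tmp))  # GitForCausalLM
```

A directory whose config.json names a model type outside the mapping (for example "bert") raises ValueError in this sketch.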

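The model is set in evaluation mode by default using model.eval() (so, for instance, dropout modules are deactivated). To train the model, first set it back in training mode with model.train().

Examples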

>>> from transformers import AutoConfig, AutoModelForVision2Seq

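>>> # Download model and configuration from huggingface.co and cache.
>>> # "google-bert/bert-base-cased" is a placeholder: bert is not among the model types listed
>>> # above, so substitute a checkpoint of a listed model type.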
>>> model = AutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = AutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForVision2Seq.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

TFAutoModelForVision2Seq

class transformers.TFAutoModelForVision2Seq

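( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- BlipConfig configuration class: TFBlipForConditionalGeneration (BLIP model)
- VisionEncoderDecoderConfig configuration class: TFVisionEncoderDecoderModel (Vision Encoder decoder model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples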

>>> from transformers import AutoConfig, TFAutoModelForVision2Seq

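>>> # Download configuration from huggingface.co and cache.
>>> # "google-bert/bert-base-cased" is a placeholder: BertConfig is not among the configuration
>>> # classes listed above, so substitute a checkpoint whose configuration class is listed.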
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = TFAutoModelForVision2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

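Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the *model id* of a pretrained model hosted in a model repository on huggingface.co.
- A path to a *directory* containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g. ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and then loading the TensorFlow model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's `__init__()` method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to the directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the `pretrained_model_name_or_path` argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding any cached versions.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to look only at local files (e.g. without trying to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. Set this option to True only for repositories you trust and whose code you have read, because it executes code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. `output_attentions=True`). The behavior depends on whether a `config` is provided or loaded automatically:
- If a configuration is provided through `config`, `**kwargs` is passed directly to the underlying model's `__init__` method (all relevant updates to the configuration are assumed to have been made already).
- If no configuration is provided, `kwargs` is first passed to the configuration class's initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's `__init__` function.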
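
The interplay of the cache with force_download and local_files_only can be sketched with a dictionary standing in for the on-disk cache. This is an illustration under stated assumptions, not the library's download code: `fetch` and `fake_download` are hypothetical, and letting local_files_only take precedence over force_download is a choice made for this sketch.

```python
def fetch(name, cache, download, force_download=False, local_files_only=False):
    """Return cached content, downloading only when the flags allow it."""
    if local_files_only:
        # Never touch the network: either the cache has it or we fail.
        if name not in cache:
            raise FileNotFoundError(f"{name!r} is not cached and downloads are disabled")
        return cache[name]
    if force_download or name not in cache:
        # A forced download overwrites whatever was cached before.
        cache[name] = download(name)
    return cache[name]


calls = []


def fake_download(name):
    calls.append(name)
    return f"weights-for-{name}-v{len(calls)}"


cache = {}
fetch("demo/model", cache, fake_download)                       # downloads
fetch("demo/model", cache, fake_download)                       # served from cache
fetch("demo/model", cache, fake_download, force_download=True)  # downloads again
print(len(calls))  # 2
```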

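Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.

blip — TFBlipForConditionalGeneration (BLIP model)
vision-encoder-decoder — TFVisionEncoderDecoderModel (Vision Encoder decoder model)

Examples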

>>> from transformers import AutoConfig, TFAutoModelForVision2Seq

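>>> # Download model and configuration from huggingface.co and cache.
>>> # "google-bert/bert-base-cased" is a placeholder: bert is not among the model types listed
>>> # above, so substitute a checkpoint of a listed model type.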
>>> model = TFAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = TFAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = TFAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

FlaxAutoModelForVision2Seq

class transformers.FlaxAutoModelForVision2Seq

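( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a vision-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- VisionEncoderDecoderConfig configuration class: FlaxVisionEncoderDecoderModel (Vision Encoder decoder model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples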

>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq

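>>> # Download configuration from huggingface.co and cache.
>>> # "google-bert/bert-base-cased" is a placeholder: BertConfig is not among the configuration
>>> # classes listed above, so substitute a checkpoint whose configuration class is listed.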
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = FlaxAutoModelForVision2Seq.from_config(config)

from_pretrained

( *model_args **kwargs )

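Parameters

pretrained_model_name_or_path (str or os.PathLike) — Can be one of:
- A string, the *model id* of a pretrained model hosted in a model repository on huggingface.co.
- A path to a *directory* containing model weights saved with save_pretrained(), e.g. ./my_model_directory/.
- A path or URL to a *PyTorch state_dict save file* (e.g. ./pt_model/pytorch_model.bin). In this case, from_pt should be set to True and a configuration object should be provided as the config argument. This loading path is slower than converting the PyTorch model to a TensorFlow model with the provided conversion scripts and then loading the TensorFlow model.
model_args (additional positional arguments, optional) — Passed along to the underlying model's `__init__()` method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be loaded automatically when:
- The model is provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved with save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in that directory.
cache_dir (str or os.PathLike, optional) — Path to the directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_pt (bool, optional, defaults to False) — Load the model weights from a PyTorch checkpoint save file (see the docstring of the `pretrained_model_name_or_path` argument).
force_download (bool, optional, defaults to False) — Whether to force the (re-)download of the model weights and configuration files, overriding any cached versions.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g. {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether to also return a dictionary containing missing keys, unexpected keys, and error messages.
local_files_only (bool, optional, defaults to False) — Whether to look only at local files (e.g. without trying to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether to allow custom models defined on the Hub in their own modeling files. Set this option to True only for repositories you trust and whose code you have read, because it executes code from the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Because models and other artifacts on huggingface.co are stored in a git-based system, code_revision can be any identifier allowed by git.
kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it is loaded) and to initialize the model (e.g. `output_attentions=True`). The behavior depends on whether a `config` is provided or loaded automatically:
- If a configuration is provided through `config`, `**kwargs` is passed directly to the underlying model's `__init__` method (all relevant updates to the configuration are assumed to have been made already).
- If no configuration is provided, `kwargs` is first passed to the configuration class's initialization function (from_pretrained()). Each key of `kwargs` that corresponds to a configuration attribute overrides that attribute with the supplied value. The remaining keys, which do not correspond to any configuration attribute, are passed to the underlying model's `__init__` function.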

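Instantiates one of the model classes of the library (with a vision-to-text modeling head) from a pretrained model.

vision-encoder-decoder — FlaxVisionEncoderDecoderModel (Vision Encoder decoder model)

Examples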

>>> from transformers import AutoConfig, FlaxAutoModelForVision2Seq

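>>> # Download model and configuration from huggingface.co and cache.
>>> # "google-bert/bert-base-cased" is a placeholder: bert is not among the model types listed
>>> # above, so substitute a checkpoint of a listed model type.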
>>> model = FlaxAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased")

>>> # Update configuration during loading
>>> model = FlaxAutoModelForVision2Seq.from_pretrained("google-bert/bert-base-cased", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a PyTorch checkpoint file instead of a TensorFlow model (slower)
>>> config = AutoConfig.from_pretrained("./pt_model/bert_pt_model_config.json")
>>> model = FlaxAutoModelForVision2Seq.from_pretrained(
...     "./pt_model/bert_pytorch_model.bin", from_pt=True, config=config
... )

AutoModelForImageTextToText

class transformers.AutoModelForImageTextToText

( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with an image-text-to-text modeling head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).

from_config

( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- AriaConfig configuration class: AriaForConditionalGeneration (Aria model)
- AyaVisionConfig configuration class: AyaVisionForConditionalGeneration (AyaVision model)
- Blip2Config configuration class: Blip2ForConditionalGeneration (BLIP-2 model)
- BlipConfig configuration class: BlipForConditionalGeneration (BLIP model)
- ChameleonConfig configuration class: ChameleonForConditionalGeneration (Chameleon model)
- Emu3Config configuration class: Emu3ForConditionalGeneration (Emu3 model)
- FuyuConfig configuration class: FuyuForCausalLM (Fuyu model)
- Gemma3Config configuration class: Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
- Gemma3nConfig configuration class: Gemma3nForConditionalGeneration (Gemma3nForConditionalGeneration model)
- GitConfig configuration class: GitForCausalLM (GIT model)
- Glm4vConfig configuration class: Glm4vForConditionalGeneration (GLM4V model)
- GotOcr2Config configuration class: GotOcr2ForConditionalGeneration (GOT-OCR2 model)
- Idefics2Config configuration class: Idefics2ForConditionalGeneration (Idefics2 model)
- Idefics3Config configuration class: Idefics3ForConditionalGeneration (Idefics3 model)
- IdeficsConfig configuration class: IdeficsForVisionText2Text (IDEFICS model)
- InstructBlipConfig configuration class: InstructBlipForConditionalGeneration (InstructBLIP model)
- InternVLConfig configuration class: InternVLForConditionalGeneration (InternVL model)
- JanusConfig configuration class: JanusForConditionalGeneration (Janus model)
- Kosmos2Config configuration class: Kosmos2ForConditionalGeneration (KOSMOS-2 model)
- Llama4Config configuration class: Llama4ForConditionalGeneration (Llama4 model)
- LlavaConfig configuration class: LlavaForConditionalGeneration (LLaVa model)
- LlavaNextConfig configuration class: LlavaNextForConditionalGeneration (LLaVA-NeXT model)
- LlavaNextVideoConfig configuration class: LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
- LlavaOnevisionConfig configuration class: LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
- Mistral3Config configuration class: Mistral3ForConditionalGeneration (Mistral3 model)
- MllamaConfig configuration class: MllamaForConditionalGeneration (Mllama model)
- PaliGemmaConfig configuration class: PaliGemmaForConditionalGeneration (PaliGemma model)
- Pix2StructConfig configuration class: Pix2StructForConditionalGeneration (Pix2Struct model)
- PixtralVisionConfig configuration class: LlavaForConditionalGeneration (Pixtral model)
- Qwen2VLConfig configuration class: Qwen2VLForConditionalGeneration (Qwen2VL model)
- Qwen2_5_VLConfig configuration class: Qwen2_5_VLForConditionalGeneration (Qwen2_5_VL model)
- ShieldGemma2Config configuration class: Gemma3ForConditionalGeneration (Shieldgemma2 model)
- SmolVLMConfig configuration class: SmolVLMForConditionalGeneration (SmolVLM model)
- UdopConfig configuration class: UdopForConditionalGeneration (UDOP model)
- VipLlavaConfig configuration class: VipLlavaForConditionalGeneration (VipLlava model)
- VisionEncoderDecoderConfig configuration class: VisionEncoderDecoderModel (Vision Encoder decoder model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, if available, SDPA will be used for torch>=2.1.1. Otherwise the default is the manual "eager" implementation.

Instantiates one of the model classes of the library (with an image-text-to-text modeling head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples

>>> from transformers import AutoConfig, AutoModelForImageTextToText

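>>> # Download configuration from huggingface.co and cache.
>>> # "google-bert/bert-base-cased" is a placeholder: BertConfig is not among the configuration
>>> # classes listed above, so substitute a checkpoint whose configuration class is listed.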
>>> config = AutoConfig.from_pretrained("google-bert/bert-base-cased")
>>> model = AutoModelForImageTextToText.from_config(config)

from_pretrained

( *model_args **kwargs )

参数

pretrained_model_name_or_path (str or os.PathLike) — 可以是以下之一：
- 一个字符串，即托管在 huggingface.co 上的模型仓库中的预训练模型的*模型 ID*。
- 一个指向使用 save_pretrained() 保存的模型权重*目录*的路径，例如 ./my_model_directory/。
- 一个指向*tensorflow索引检查点文件*的路径或 URL（例如，./tf_model/model.ckpt.index）。在这种情况下，应将 from_tf 设置为 True，并应提供一个配置对象作为 config 参数。这种加载路径比使用提供的转换脚本将 TensorFlow 检查点转换为 PyTorch 模型然后加载 PyTorch 模型要慢。
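model_args (additional positional arguments, optional) — Will be passed along to the underlying model's __init__() method.
config (PretrainedConfig, optional) — Configuration for the model to use instead of an automatically loaded configuration. The configuration can be automatically loaded when:
- The model is a model provided by the library (loaded with the *model id* string of a pretrained model).
- The model was saved using save_pretrained() and is reloaded by supplying the save directory.
- The model is loaded by supplying a local directory as pretrained_model_name_or_path and a configuration JSON file named *config.json* is found in the directory.
state_dict (dict[str, torch.Tensor], optional) — A state dictionary to use instead of a state dictionary loaded from the saved weights file.

This option can be used if you want to create a model from a pretrained configuration but load your own weights. In this case, though, you should check whether using save_pretrained() and from_pretrained() is not a simpler option.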
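cache_dir (str or os.PathLike, optional) — Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
from_tf (bool, optional, defaults to False) — Load the model weights from a TensorFlow checkpoint save file (see the docstring of the pretrained_model_name_or_path argument).
force_download (bool, optional, defaults to False) — Whether or not to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
resume_download — Deprecated and ignored. All downloads are now resumed by default when possible. Will be removed in v5 of Transformers.
proxies (dict[str, str], optional) — A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
output_loading_info (bool, optional, defaults to False) — Whether or not to also return a dictionary containing missing keys, unexpected keys and error messages.
local_files_only (bool, optional, defaults to False) — Whether or not to only look at local files (i.e., do not try to download the model).
revision (str, optional, defaults to "main") — The specific model version to use. It can be a branch name, a tag name, or a commit id. Since models and other artifacts are stored on huggingface.co with a git-based system, revision can be any identifier allowed by git.
trust_remote_code (bool, optional, defaults to False) — Whether or not to allow custom models defined on the Hub in their own modeling files. This option should only be set to True for repositories you trust and in which you have read the code, as it will execute code present on the Hub on your local machine.
code_revision (str, optional, defaults to "main") — The specific revision to use for the code on the Hub, if the code lives in a different repository than the rest of the model. It can be a branch name, a tag name, or a commit id. Since models and other artifacts are stored on huggingface.co with a git-based system, revision can be any identifier allowed by git.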
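kwargs (additional keyword arguments, optional) — Can be used to update the configuration object (after it has been loaded) and to initialize the model (e.g., output_attentions=True). The behavior depends on whether a config is provided or automatically loaded:
- If a configuration is provided with config, kwargs will be passed directly to the underlying model's __init__ method (we assume all relevant updates to the configuration have already been done).
- If a configuration is not provided, kwargs will first be passed to the configuration class initialization function (from_pretrained()). Each key of kwargs that corresponds to a configuration attribute will be used to override that attribute with the supplied kwargs value. Remaining keys that do not correspond to any configuration attribute will be passed to the underlying model's __init__ function.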
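
The second case of the kwargs handling can be illustrated with a small, library-independent sketch. `ToyConfig` and `split_kwargs` are hypothetical names introduced only for this illustration; they are not part of Transformers and only mimic the described behavior (keys that match configuration attributes override them, the remaining keys are forwarded to the model constructor).

```python
from dataclasses import dataclass, fields


@dataclass
class ToyConfig:
    # Hypothetical stand-in for a configuration object.
    output_attentions: bool = False
    hidden_size: int = 64


def split_kwargs(config, kwargs):
    """Apply kwargs that match config attributes; return the rest for the model __init__."""
    config_keys = {f.name for f in fields(config)}
    model_kwargs = {}
    for key, value in kwargs.items():
        if key in config_keys:
            setattr(config, key, value)  # overrides the loaded configuration value
        else:
            model_kwargs[key] = value  # would be forwarded to the model constructor
    return config, model_kwargs


config, model_kwargs = split_kwargs(ToyConfig(), {"output_attentions": True, "foo": 1})
print(config.output_attentions)
print(model_kwargs)
```

When a config object is passed explicitly, no such splitting happens: all kwargs go straight to the model constructor.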

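Instantiates one of the model classes of the library (with an image-text-to-text modeling head) from a pretrained model.

The model class to instantiate is selected based on the model_type property of the configuration object (either passed as an argument or loaded from pretrained_model_name_or_path if possible), or, when it is missing, by falling back to pattern matching on pretrained_model_name_or_path: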

aria — AriaForConditionalGeneration (Aria model)
aya_vision — AyaVisionForConditionalGeneration (AyaVision model)
blip — BlipForConditionalGeneration (BLIP model)
blip-2 — Blip2ForConditionalGeneration (BLIP-2 model)
chameleon — ChameleonForConditionalGeneration (Chameleon model)
emu3 — Emu3ForConditionalGeneration (Emu3 model)
fuyu — FuyuForCausalLM (Fuyu model)
gemma3 — Gemma3ForConditionalGeneration (Gemma3ForConditionalGeneration model)
gemma3n — Gemma3nForConditionalGeneration (Gemma3nForConditionalGeneration model)
git — GitForCausalLM (GIT model)
glm4v — Glm4vForConditionalGeneration (GLM4V model)
got_ocr2 — GotOcr2ForConditionalGeneration (GOT-OCR2 model)
idefics — IdeficsForVisionText2Text (IDEFICS model)
idefics2 — Idefics2ForConditionalGeneration (Idefics2 model)
idefics3 — Idefics3ForConditionalGeneration (Idefics3 model)
instructblip — InstructBlipForConditionalGeneration (InstructBLIP model)
internvl — InternVLForConditionalGeneration (InternVL model)
janus — JanusForConditionalGeneration (Janus model)
kosmos-2 — Kosmos2ForConditionalGeneration (KOSMOS-2 model)
llama4 — Llama4ForConditionalGeneration (Llama4 model)
llava — LlavaForConditionalGeneration (LLaVa model)
llava_next — LlavaNextForConditionalGeneration (LLaVA-NeXT model)
llava_next_video — LlavaNextVideoForConditionalGeneration (LLaVa-NeXT-Video model)
llava_onevision — LlavaOnevisionForConditionalGeneration (LLaVA-Onevision model)
mistral3 — Mistral3ForConditionalGeneration (Mistral3 model)
mllama — MllamaForConditionalGeneration (Mllama model)
paligemma — PaliGemmaForConditionalGeneration (PaliGemma model)
pix2struct — Pix2StructForConditionalGeneration (Pix2Struct model)
pixtral — LlavaForConditionalGeneration (Pixtral model)
qwen2_5_vl — Qwen2_5_VLForConditionalGeneration (Qwen2_5_VL model)
qwen2_vl — Qwen2VLForConditionalGeneration (Qwen2VL model)
shieldgemma2 — Gemma3ForConditionalGeneration (Shieldgemma2 model)
smolvlm — SmolVLMForConditionalGeneration (SmolVLM model)
udop — UdopForConditionalGeneration (UDOP model)
vipllava — VipLlavaForConditionalGeneration (VipLlava model)
vision-encoder-decoder — VisionEncoderDecoderModel (Vision Encoder decoder model)
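
The list above is essentially a lookup table keyed by model_type. The following sketch reproduces a few of its entries as plain strings to show that several model types can resolve to the same class. The helper `resolve_class_name` and the error it raises are hypothetical and only illustrate the lookup; this is not the library's actual implementation.

```python
# Subset of the model_type -> class-name pairs listed above (names only, no imports).
IMAGE_TEXT_TO_TEXT_CLASSES = {
    "llava": "LlavaForConditionalGeneration",
    "pixtral": "LlavaForConditionalGeneration",
    "gemma3": "Gemma3ForConditionalGeneration",
    "shieldgemma2": "Gemma3ForConditionalGeneration",
    "fuyu": "FuyuForCausalLM",
}


def resolve_class_name(model_type):
    """Hypothetical helper: return the class name registered for a model_type."""
    try:
        return IMAGE_TEXT_TO_TEXT_CLASSES[model_type]
    except KeyError:
        # Assumption for this sketch: unknown model types are rejected with ValueError.
        raise ValueError(f"Unsupported model_type: {model_type!r}") from None


print(resolve_class_name("pixtral"))
```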

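By default, the model is set to evaluation mode using model.eval() (so, for instance, dropout modules are deactivated). To train the model, you should first set it back to training mode with model.train().

Examples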

>>> from transformers import AutoConfig, AutoModelForImageTextToText

>>> # Download model and configuration from huggingface.co and cache.
>>> model = AutoModelForImageTextToText.from_pretrained("HuggingFaceTB/SmolVLM-256M-Instruct")

>>> # Update configuration during loading
>>> model = AutoModelForImageTextToText.from_pretrained("HuggingFaceTB/SmolVLM-256M-Instruct", output_attentions=True)
>>> model.config.output_attentions
True

>>> # Loading from a TF checkpoint file instead of a PyTorch model (slower)
>>> config = AutoConfig.from_pretrained("./tf_model/bert_tf_model_config.json")
>>> model = AutoModelForImageTextToText.from_pretrained(
...     "./tf_model/bert_tf_checkpoint.ckpt.index", from_tf=True, config=config
... )

Time Series

AutoModelForTimeSeriesPrediction

class transformers.AutoModelForTimeSeriesPrediction

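( *args **kwargs )

This is a generic model class that will be instantiated as one of the model classes of the library (with a time-series prediction head) when created with the from_pretrained() class method or the from_config() class method.

This class cannot be instantiated directly using __init__() (throws an error).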
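
The pattern described here (a generic class that refuses direct construction and instead dispatches to a concrete class) can be sketched as follows. All class names are hypothetical toys, and the exception types are assumptions made for this illustration rather than a statement about the library's internals.

```python
class ToyTimeSeriesConfig:
    """Hypothetical configuration class."""


class ToyTimeSeriesModel:
    """Hypothetical concrete model class."""

    def __init__(self, config):
        self.config = config


class ToyAutoModel:
    # Maps configuration classes to concrete model classes.
    _mapping = {ToyTimeSeriesConfig: ToyTimeSeriesModel}

    def __init__(self, *args, **kwargs):
        # Direct instantiation is rejected; a factory class method must be used.
        raise OSError("ToyAutoModel must be created with ToyAutoModel.from_config(config).")

    @classmethod
    def from_config(cls, config):
        # Pick the concrete class from the type of the configuration object.
        model_class = cls._mapping.get(type(config))
        if model_class is None:
            raise ValueError(f"Unrecognized configuration class: {type(config).__name__}")
        return model_class(config)


model = ToyAutoModel.from_config(ToyTimeSeriesConfig())
print(type(model).__name__)
```

Note that the object returned by the factory is an instance of the concrete class, not of the generic class itself.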

from_config

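( **kwargs )

Parameters

config (PretrainedConfig) — The model class to instantiate is selected based on the configuration class:
- TimesFmConfig configuration class: TimesFmModelForPrediction (TimesFm model)
attn_implementation (str, optional) — The attention implementation to use in the model (if relevant). Can be any of "eager" (manual implementation of the attention), "sdpa" (using F.scaled_dot_product_attention), or "flash_attention_2" (using Dao-AILab/flash-attention). By default, SDPA is used for torch>=2.1.1 if available. Otherwise the default is the manual "eager" implementation.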

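Instantiates one of the model classes of the library (with a time-series prediction head) from a configuration.

Note: Loading a model from its configuration file does not load the model weights. It only affects the model's configuration. Use from_pretrained() to load the model weights.

Examples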

>>> from transformers import AutoConfig, AutoModelForTimeSeriesPrediction

>>> # Download configuration from huggingface.co and cache.
>>> config = AutoConfig.from_pretrained("google/timesfm-2.0-500m-pytorch")
>>> model = AutoModelForTimeSeriesPrediction.from_config(config)

from_pretrained