Transformers 文档

RoBERTa-PreLayerNorm

Transformers

加入 Hugging Face 社区

并获得增强的文档体验

在模型、数据集和 Spaces 上进行协作

通过加速推理获得更快的示例

切换文档主题

开始使用

RoBERTa-PreLayerNorm

概述

RoBERTa-PreLayerNorm 模型由 Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli 在 fairseq: 用于序列建模的快速可扩展工具包中提出。它与在 fairseq 中使用 --encoder-normalize-before 标志相同。

论文摘要如下：

fairseq 是一个开源的序列建模工具包，允许研究人员和开发人员训练用于翻译、摘要、语言建模和其他文本生成任务的自定义模型。该工具包基于 PyTorch，支持多 GPU 和多机器上的分布式训练。我们还支持在现代 GPU 上进行快速混合精度训练和推理。

此模型由 andreasmadsen 贡献。原始代码可在此处找到。

使用技巧

其实现与 Roberta 相同，不同之处在于它使用 Norm and Add 而不是 Add and Norm。Add 和 Norm 是指 Attention Is All You Need 中描述的 Addition 和 LayerNormalization。
这与在 fairseq 中使用 --encoder-normalize-before 标志相同。

Transformers

RoBERTa-PreLayerNorm

概述

使用技巧

资源

RobertaPreLayerNormConfig

class transformers.RobertaPreLayerNormConfig

RobertaPreLayerNormModel

class transformers.RobertaPreLayerNormModel

前向传播

RobertaPreLayerNormForCausalLM

class transformers.RobertaPreLayerNormForCausalLM

前向传播

RobertaPreLayerNormForMaskedLM

class transformers.RobertaPreLayerNormForMaskedLM

前向传播

RobertaPreLayerNormForSequenceClassification

class transformers.RobertaPreLayerNormForSequenceClassification

前向传播

RobertaPreLayerNormForMultipleChoice

class transformers.RobertaPreLayerNormForMultipleChoice

前向传播

RobertaPreLayerNormForTokenClassification

class transformers.RobertaPreLayerNormForTokenClassification

前向传播

RobertaPreLayerNormForQuestionAnswering

class transformers.RobertaPreLayerNormForQuestionAnswering

前向传播

TFRobertaPreLayerNormModel

class transformers.TFRobertaPreLayerNormModel

调用

TFRobertaPreLayerNormForCausalLM

class transformers.TFRobertaPreLayerNormForCausalLM

调用

TFRobertaPreLayerNormForMaskedLM

class transformers.TFRobertaPreLayerNormForMaskedLM

调用

TFRobertaPreLayerNormForSequenceClassification

class transformers.TFRobertaPreLayerNormForSequenceClassification

调用

TFRobertaPreLayerNormForMultipleChoice

class transformers.TFRobertaPreLayerNormForMultipleChoice

调用

TFRobertaPreLayerNormForTokenClassification

class transformers.TFRobertaPreLayerNormForTokenClassification

调用

TFRobertaPreLayerNormForQuestionAnswering

class transformers.TFRobertaPreLayerNormForQuestionAnswering

调用

FlaxRobertaPreLayerNormModel

class transformers.FlaxRobertaPreLayerNormModel

__call__

FlaxRobertaPreLayerNormForCausalLM

class transformers.FlaxRobertaPreLayerNormForCausalLM

__call__

FlaxRobertaPreLayerNormForMaskedLM

class transformers.FlaxRobertaPreLayerNormForMaskedLM

__call__

FlaxRobertaPreLayerNormForSequenceClassification

class transformers.FlaxRobertaPreLayerNormForSequenceClassification

__call__

FlaxRobertaPreLayerNormForMultipleChoice

class transformers.FlaxRobertaPreLayerNormForMultipleChoice

__call__

FlaxRobertaPreLayerNormForTokenClassification

class transformers.FlaxRobertaPreLayerNormForTokenClassification

__call__

FlaxRobertaPreLayerNormForQuestionAnswering

class transformers.FlaxRobertaPreLayerNormForQuestionAnswering

__call__

call

call

call

call

call

call

call