Transformers

LUKE

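Overview

The LUKE model was proposed in LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention by Ikuya Yamada, Akari Asai, Hiroyuki Shindo, Hideaki Takeda and Yuji Matsumoto. It is based on RoBERTa and adds entity embeddings as well as an entity-aware self-attention mechanism. These additions improve performance on downstream tasks that involve reasoning about entities, such as named entity recognition, extractive and cloze-style question answering, entity typing, and relation classification.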

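The abstract from the paper is the following:

Entity representations are useful in natural language tasks involving entities. In this paper, we propose new pretrained contextualized representations of words and entities based on the bidirectional transformer. The proposed model treats words and entities in a given text as independent tokens, and outputs contextualized representations of them. Our model is trained using a new pretraining task based on the masked language model of BERT. The task involves predicting randomly masked words and entities in a large entity-annotated corpus retrieved from Wikipedia. We also propose an entity-aware self-attention mechanism that is an extension of the self-attention mechanism of the transformer, and considers the types of tokens (words or entities) when computing attention scores. The proposed model achieves impressive empirical performance on a wide range of entity-related tasks. In particular, it obtains state-of-the-art results on five well-known datasets: Open Entity (entity typing), TACRED (relation classification), CoNLL-2003 (named entity recognition), ReCoRD (cloze-style question answering), and SQuAD 1.1 (extractive question answering).

This model was contributed by ikuyamada and nielsr. The original code can be found here.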

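Usage tips

This implementation is the same as RobertaModel, with the addition of entity embeddings and an entity-aware self-attention mechanism, which improves performance on tasks that involve reasoning about entities.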
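To make the idea of entity-aware self-attention more concrete, here is a minimal, dependency-free sketch of a single attention head in which the query projection is chosen from the types of the attending and attended tokens (word or entity), while the key and value projections are shared. It follows the description in the LUKE paper, not the library's actual code: the function names, the dictionary keyed by token-type pairs, the random weights, and the tiny dimensions are all illustrative assumptions.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entity_aware_attention(hidden, is_entity, query_mats, key_mat, value_mat):
    """Single-head attention whose query projection depends on token types.

    hidden:     list of token vectors (words and entities in one sequence)
    is_entity:  list of bools, True where the token is an entity
    query_mats: dict keyed by (attending_is_entity, attended_is_entity),
                i.e. word-to-word, word-to-entity, entity-to-word, entity-to-entity
    key_mat, value_mat: projections shared by all token types
    """
    dim = len(key_mat)
    keys = [matvec(key_mat, h) for h in hidden]
    values = [matvec(value_mat, h) for h in hidden]
    outputs = []
    for i, h_i in enumerate(hidden):
        scores = []
        for j in range(len(hidden)):
            # Pick the query matrix from the (attending, attended) type pair.
            query = matvec(query_mats[(is_entity[i], is_entity[j])], h_i)
            score = sum(q * k for q, k in zip(query, keys[j]))
            scores.append(score / math.sqrt(dim))
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


# Toy setup (hypothetical sizes and random weights, for illustration only).
rng = random.Random(0)
dim = 4


def random_matrix():
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]


query_mats = {(a, b): random_matrix() for a in (False, True) for b in (False, True)}
key_mat, value_mat = random_matrix(), random_matrix()
hidden = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(5)]
is_entity = [False, False, False, True, True]  # three word tokens, two entity tokens

outputs = entity_aware_attention(hidden, is_entity, query_mats, key_mat, value_mat)
print(len(outputs), len(outputs[0]))
```

If all four query matrices are set to the same matrix, the sketch reduces to ordinary self-attention, which is one way to see the mechanism as an extension of it.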
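LUKE treats entities as input tokens; therefore, it takes entity_ids, entity_attention_mask, entity_token_type_ids and entity_position_ids as extra input. You can obtain these using LukeTokenizer.
LukeTokenizer takes entities and entity_spans (character-based start and end positions of the entities in the input text) as extra input. entities typically consist of [MASK] entities or Wikipedia entities. A brief description of each kind of input follows:
- Inputting [MASK] entities to compute entity representations: The [MASK] entity is used to mask entities to be predicted during pretraining. When LUKE receives the [MASK] entity, it tries to predict the original entity by gathering information about the entity from the input text. Therefore, the [MASK] entity can be used to address downstream tasks that require information about entities in the text, such as entity typing, relation classification, and named entity recognition.
- Inputting Wikipedia entities to compute knowledge-enhanced token representations: LUKE learns rich information (or knowledge) about Wikipedia entities during pretraining and stores it in its entity embeddings. When Wikipedia entities are used as input tokens, LUKE outputs token representations enriched by the information stored in the embeddings of these entities. This is particularly effective for tasks that require real-world knowledge, such as question answering.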
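Because entity_spans are character offsets rather than token indices, it can help to compute them from the mention strings instead of counting by hand. The helper below is a hypothetical convenience function (it is not part of LukeTokenizer) that assumes each mention occurs in the text and that the mentions are listed in the order they appear.

```python
def find_entity_spans(text, mentions):
    """Return (start, end) character offsets for each mention.

    Mentions are searched left to right, so they must be listed in the
    order in which they occur in `text`. `end` is exclusive.
    """
    spans = []
    search_from = 0
    for mention in mentions:
        start = text.find(mention, search_from)
        if start == -1:
            raise ValueError(f"mention {mention!r} not found in text")
        end = start + len(mention)
        spans.append((start, end))
        search_from = end  # continue after this mention
    return spans


text = "Beyoncé lives in Los Angeles."
entity_spans = find_entity_spans(text, ["Beyoncé", "Los Angeles"])
print(entity_spans)
```

The resulting list can be passed as the entity_spans argument of the tokenizer, optionally together with entities when Wikipedia entity titles are supplied.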
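There are three head models for the former use case:
- LukeForEntityClassification, for tasks that classify a single entity in an input text, such as entity typing (e.g. the Open Entity dataset). This model places a linear head on top of the output entity representation.
- LukeForEntityPairClassification, for tasks that classify the relationship between two entities, such as relation classification (e.g. the TACRED dataset). This model places a linear head on top of the concatenated output representations of the given entity pair.
- LukeForEntitySpanClassification, for tasks that classify a sequence of entity spans, such as named entity recognition (NER). This model places a linear head on top of the output entity representations. You can address NER with this model by inputting all possible entity spans in the text to the model.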
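For the NER case, "all possible entity spans" can be enumerated from word boundaries. The following is a rough sketch of that enumeration: the function name and the max_words cap are hypothetical, and splitting words with a simple regular expression is a simplifying assumption, not what the library does internally.

```python
import re


def enumerate_candidate_spans(text, max_words=None):
    """List character-based (start, end) spans covering 1..max_words consecutive words.

    With max_words=None every contiguous run of words is returned.
    """
    # Character offsets of each word; punctuation is excluded by \w+.
    words = [(m.start(), m.end()) for m in re.finditer(r"\w+", text)]
    spans = []
    for i, (start, _) in enumerate(words):
        last = len(words) if max_words is None else min(len(words), i + max_words)
        for _, end in words[i:last]:
            spans.append((start, end))
    return spans


text = "Beyoncé lives in Los Angeles."
candidate_spans = enumerate_candidate_spans(text)
short_spans = enumerate_candidate_spans(text, max_words=2)
print(len(candidate_spans), len(short_spans))
```

The number of candidates grows quadratically with the number of words, so capping the span length is a common way to keep the input manageable for longer texts.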
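LukeTokenizer has a task argument, which lets you easily create inputs for these head models by specifying task="entity_classification", task="entity_pair_classification", or task="entity_span_classification". Please refer to the example code of each head model.

Usage example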

>>> from transformers import LukeTokenizer, LukeModel, LukeForEntityPairClassification

>>> model = LukeModel.from_pretrained("studio-ousia/luke-base")
>>> tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")

>>> # Example 1: Computing the contextualized entity representation corresponding to the entity mention "Beyoncé"
>>> text = "Beyoncé lives in Los Angeles."
>>> entity_spans = [(0, 7)]  # character-based entity span corresponding to "Beyoncé"
>>> inputs = tokenizer(text, entity_spans=entity_spans, add_prefix_space=True, return_tensors="pt")
>>> outputs = model(**inputs)
>>> word_last_hidden_state = outputs.last_hidden_state
>>> entity_last_hidden_state = outputs.entity_last_hidden_state

>>> # Example 2: Inputting Wikipedia entities to obtain enriched contextualized representations
>>> entities = [
...     "Beyoncé",
...     "Los Angeles",
... ]  # Wikipedia entity titles corresponding to the entity mentions "Beyoncé" and "Los Angeles"
>>> entity_spans = [(0, 7), (17, 28)]  # character-based entity spans corresponding to "Beyoncé" and "Los Angeles"
>>> inputs = tokenizer(text, entities=entities, entity_spans=entity_spans, add_prefix_space=True, return_tensors="pt")
>>> outputs = model(**inputs)
>>> word_last_hidden_state = outputs.last_hidden_state
>>> entity_last_hidden_state = outputs.entity_last_hidden_state

>>> # Example 3: Classifying the relationship between two entities using the LukeForEntityPairClassification head model
>>> # (reuses the `text` variable defined in Example 1)
>>> model = LukeForEntityPairClassification.from_pretrained("studio-ousia/luke-large-finetuned-tacred")
>>> tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-large-finetuned-tacred")
>>> entity_spans = [(0, 7), (17, 28)]  # character-based entity spans corresponding to "Beyoncé" and "Los Angeles"
>>> inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
>>> outputs = model(**inputs)
>>> logits = outputs.logits
>>> predicted_class_idx = int(logits[0].argmax())
>>> print("Predicted class:", model.config.id2label[predicted_class_idx])
