Lighteval documentation
Contributing to multilingual evaluations
Contributing a small translation
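We define 19 literals: basic keywords or punctuation marks used when evaluation prompts are created automatically, such as yes, no, because, etc.
We welcome translations in your language!
To contribute, you need to
- Open the translation_literals file
- Edit the file to add or expand the literals for your language of interest.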
    Language.ENGLISH: TranslationLiterals(
        language=Language.ENGLISH,
        question_word="question",  # Usage: "Question: How are you?"
        answer="answer",  # Usage: "Answer: I am fine"
        confirmation_word="right",  # Usage: "He is smart, right?"
        yes="yes",  # Usage: "Yes, he is"
        no="no",  # Usage: "No, he is not"
        also="also",  # Usage: "Also, she is smart."
        cause_word="because",  # Usage: "She is smart, because she is tall"
        effect_word="therefore",  # Usage: "He is tall therefore he is smart"
        or_word="or",  # Usage: "He is tall or small"
        true="true",  # Usage: "He is smart, true, false or neither?"
        false="false",  # Usage: "He is smart, true, false or neither?"
        neither="neither",  # Usage: "He is smart, true, false or neither?"
        # Punctuation and spacing: only adjust if your language uses something different than in English
        full_stop=".",
        comma=",",
        question_mark="?",
        exclamation_mark="!",
        word_space=" ",
        sentence_space=" ",
        colon=":",
        # The first characters of your alphabet used in enumerations, if different from English
        indices=["A", "B", "C", ...]
    )
- Open a PR with your modifications, and you are done!
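To make the role of these literals concrete, here is a minimal sketch of how a prompt could be assembled from them. `SketchLiterals` and `build_prompt` are hypothetical names introduced only for this illustration; they are not part of Lighteval, and the Chinese values are assumed examples rather than the entries actually stored in the translation_literals file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchLiterals:
    """Hypothetical stand-in holding a few of the fields shown above."""

    question_word: str
    answer: str
    colon: str = ":"
    word_space: str = " "
    sentence_space: str = " "


def build_prompt(lit: SketchLiterals, question: str) -> str:
    """Assemble '<Question><colon> <text> <Answer><colon>' from the literals."""
    q_anchor = lit.question_word.capitalize() + lit.colon + lit.word_space
    a_anchor = lit.answer.capitalize() + lit.colon
    return q_anchor + question + lit.sentence_space + a_anchor


english = SketchLiterals(question_word="question", answer="answer")

# Assumed values for illustration: full-width colon (U+FF1A) and no spaces.
chinese = SketchLiterals(
    question_word="问题",
    answer="答案",
    colon="\uff1a",
    word_space="",
    sentence_space="",
)

print(build_prompt(english, "How are you?"))
print(build_prompt(chinese, "你好吗"))
```

The same function produces correctly punctuated prompts for both languages because every anchor word, colon, and space comes from the literals rather than being hard-coded.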
Contributing a new multilingual task
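You should first read our guide on adding a custom task to better understand the different parameters we use.
Then take a look at the current multilingual tasks file to see how tasks are defined. For multilingual evaluations, the prompt_function should be implemented by a language-adapted template. The template takes care of correct formatting and of the correct, consistent use of language-adjusted prompt anchors (e.g. Question/Answer) and punctuation.
Browse the list of all templates to see which ones best suit your task.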
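Templates receive their inputs through an adapter, a small function that maps the columns of your dataset to the keys the template expects. The sketch below illustrates the idea in plain Python. The column names (`query`, `option_a`, `option_b`, `label`) and the template keys (`question`, `choices`, `gold_idx`) are assumptions made for this example; check the docstring of the adapter type for the template you choose to find the keys it actually requires.

```python
def mcq_adapter(line: dict) -> dict:
    """Map dataset columns (right) to hypothetical template keys (left)."""
    return {
        "question": line["query"],
        "choices": [line["option_a"], line["option_b"]],
        "gold_idx": int(line["label"]),
    }


def keep_row(line: dict) -> bool:
    """Row filter: keep only samples whose label is usable."""
    return line["label"] in {"0", "1"}


# Two made-up rows: one valid, one with an unusable label.
rows = [
    {"query": "2 + 2 = ?", "option_a": "4", "option_b": "5", "label": "0"},
    {"query": "broken row", "option_a": "x", "option_b": "y", "label": "-1"},
]

adapted = [mcq_adapter(row) for row in rows if keep_row(row)]
print(adapted)
```

Keeping the adapter a pure function of one row makes it easy to test on a handful of samples before wiring it into a task definition.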
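Then, when you are ready to define your own task, you should
- Create a Python file as indicated in the guide mentioned above
- Import the relevant templates for your task type (XNLI, Copa, multiple choice, question answering, etc.)
- Define one task, or a list of tasks, for each relevant language and (for multiple choice) each evaluation formulation, using our parametrizable LightevalTaskConfig class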
    your_tasks = [
        LightevalTaskConfig(
            # Name of your evaluation
            name=f"evalname_{language.value}_{formulation.name.lower()}",
            # The evaluation is community contributed
            suite=["community"],
            # This will automatically get the correct metrics for your chosen formulation
            metric=get_metrics_for_formulation(
                formulation,
                [
                    loglikelihood_acc_metric(normalization=None),
                    loglikelihood_acc_metric(normalization=LogProbTokenNorm()),
                    loglikelihood_acc_metric(normalization=LogProbCharNorm()),
                ],
            ),
            # In this function, you choose which template to follow and for which language and formulation
            prompt_function=get_template_prompt_function(
                language=language,
                # then use the adapter to define the mapping between the
                # keys of the template (left), and the keys of your dataset
                # (right)
                # To know which template keys are required and available,
                # consult the appropriate adapter type and doc-string.
                adapter=lambda line: {
                    "key": line["relevant_key"],
                    ...
                },
                formulation=formulation,
            ),
            # You can also add specific filters to remove irrelevant samples
            hf_filter=lambda line: line["label"] in <condition>,
            # You then select your huggingface dataset as well as
            # the splits available for evaluation
            hf_repo=<dataset>,
            hf_subset=<subset>,
            evaluation_splits=["train"],
            hf_avail_splits=["train"],
        )
        for language in [
            Language.YOUR_LANGUAGE, ...
        ]
        for formulation in [MCFFormulation(), CFFormulation(), HybridFormulation()]
    ]
- You can then go back to the guide to test whether your task is correctly implemented!
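The task list above is a nested comprehension: it yields one configuration per (language, formulation) pair. The sketch below reproduces only the naming logic so you can see how many tasks are generated and what they are called. The `Language` enum and the three formulation classes here are simplified stand-ins defined for this example, and the language values `"eng"` and `"fra"` are assumptions, not necessarily the values Lighteval uses.

```python
from enum import Enum


class Language(Enum):
    """Hypothetical two-member stand-in for the real Language enum."""

    ENGLISH = "eng"
    FRENCH = "fra"


# Stand-in formulation classes; only the `name` attribute matters here.
class MCFFormulation:
    name = "MCF"


class CFFormulation:
    name = "CF"


class HybridFormulation:
    name = "Hybrid"


task_names = [
    f"evalname_{language.value}_{formulation.name.lower()}"
    for language in [Language.ENGLISH, Language.FRENCH]
    for formulation in [MCFFormulation(), CFFormulation(), HybridFormulation()]
]

print(len(task_names))
for task_name in task_names:
    print(task_name)
```

With two languages and three formulations this yields six distinct task names, which is worth checking before opening a PR so that no two tasks collide.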
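All LightevalTaskConfig parameters are strongly typed, including the inputs to the template function. Take advantage of your IDE's type hints and autocompletion to fill in these parameters correctly.
Once everything is ready, open a PR, and we will be happy to review it!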