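🤗 LLM Suggestions in Argilla with Hugging Face Inference Endpoints

Community article, published September 20, 2023

We are excited to showcase suggestions in Argilla using Hugging Face Inference Endpoints! Starting with Argilla v1.13.0, anyone can add suggestions to Feedback Dataset records with just a few lines of code. This shortens the time needed to produce high-quality datasets by turning the annotation task into a quick validate-and-correct process.

Hugging Face Inference Endpoints make it easier than ever to deploy any ML model from the Hub. You only need to select the model to deploy, your preferred cloud provider and region, and the instance type to use. Within a few minutes, you can have an Inference Endpoint up and running.

Thanks to Argilla's integration with the templates available in Hugging Face Spaces (previously announced in 🚀 Launching Argilla on Hugging Face Spaces), you can launch an Argilla instance in just a few clicks. This lets you keep the whole workflow within the Hugging Face ecosystem.

In this post, we show how to set up an Argilla instance in Hugging Face Spaces, deploy a Hugging Face Inference Endpoint serving Llama 2 7B Chat, and integrate it with Argilla to add suggestions to an Argilla dataset.

In fewer than 10 lines of code, you can use Hugging Face Inference Endpoints to automatically add LLM-powered suggestions to the records of an Argilla dataset!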

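🚀 Deploy Argilla in Spaces

You can self-host Argilla using one of the many deployment options, sign up for Argilla Cloud, or launch an Argilla instance on Hugging Face Spaces with this one-click deployment button.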

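🍱 Push the dataset to Argilla

We will use a subset of Alpaca, a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine using the Self-Instruct framework, with some modifications that are described in Alpaca's dataset card.

The Alpaca subset we will use was collected by the Hugging Face H4 team and contains 100 rows per split (train and test), each with a prompt and a completion.

Based on the data we want to annotate, we define the Feedback Dataset to push to Argilla. This means defining the fields of each record, the questions annotators need to answer, and finally the annotation guidelines. For more information, see the Argilla documentation, Create a Feedback Dataset.

The last step is to iterate over the rows of the Alpaca subset and add them to the Feedback Dataset, which is then pushed to Argilla so the annotation process can begin.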

import argilla as rg
from datasets import load_dataset

# Connect to the Argilla instance (replace the placeholders with your own values).
rg.init(api_url="<ARGILLA_API_URL>", api_key="<ARGILLA_API_KEY>")

# Define the fields shown to annotators, the questions to answer, and the guidelines.
dataset = rg.FeedbackDataset(
    fields=[
        rg.TextField(name="prompt"),
        rg.TextField(name="completion"),
    ],
    questions=[
        rg.LabelQuestion(name="prompt-quality", title="Is the prompt clear?", labels=["yes", "no"]),
        rg.LabelQuestion(name="completion-quality", title="Is the completion correct?", labels=["yes", "no"]),
        rg.TextQuestion(
            name="completion-edit",
            title="If you feel like the completion could be improved, provide a new one",
            required=False,
        ),
    ],
    guidelines=(
        "You are asked to evaluate the following prompt-completion pairs quality,"
        " and provide a new completion if applicable."
    ),
)

# Each row provides the "prompt" and "completion" values used as record fields.
alpaca_dataset = load_dataset("HuggingFaceH4/testing_alpaca_small", split="train")
dataset.add_records([rg.FeedbackRecord(fields=row) for row in alpaca_dataset])

# Push the dataset to Argilla so it shows up in the UI.
dataset.push_to_argilla(name="alpaca-small", workspace="admin")
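
If the source rows carry extra columns or empty values, it can help to normalize them before building records. The following is a minimal sketch, assuming each row behaves like a plain dictionary; `row_to_fields` and `rows_to_fields` are hypothetical helper names and not part of Argilla or `datasets`.

```python
from typing import Any, Dict, Iterable, List

# Field names declared in the Feedback Dataset above.
FIELD_NAMES = ("prompt", "completion")


def row_to_fields(row: Dict[str, Any], field_names: Iterable[str] = FIELD_NAMES) -> Dict[str, str]:
    """Keep only the declared fields and fail early if one is missing or empty."""
    fields = {}
    for name in field_names:
        value = row.get(name)
        if value is None or not str(value).strip():
            raise ValueError(f"row is missing a value for field {name!r}")
        fields[name] = str(value)
    return fields


def rows_to_fields(rows: Iterable[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Convert all rows, skipping the ones that fail validation."""
    converted = []
    for row in rows:
        try:
            converted.append(row_to_fields(row))
        except ValueError:
            continue
    return converted
```

With these helpers, the records could be built as `[rg.FeedbackRecord(fields=f) for f in rows_to_fields(alpaca_dataset)]`.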

If we now navigate to our Argilla instance, we will see the following UI.

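🚀 Deploy the Llama 2 Inference Endpoint

Now we can set up the Hugging Face Inference Endpoint. This lets us easily serve any model on dedicated, fully managed infrastructure, while keeping costs low with its secure, compliant, and flexible production solution.

As mentioned above, we will use the 7B-parameter variant of Llama 2 in Hugging Face format, fine-tuned for chat completion. You can find this model at meta-llama/llama-2-7b-chat-hf. Other variants are also available on the Hugging Face Hub at https://huggingface.co/meta-llama.

Note: At the time of writing, to use Llama 2, users need to visit Meta's website and accept its license terms and acceptable use policy before requesting access to the Llama 2 models through Meta's Llama 2 organization on the Hugging Face Hub.

First, we need to make sure the Inference Endpoint is up and running. Once we have its URL, we can start sending requests to it.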
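
One way to confirm the endpoint is reachable is to send a tiny generation request before processing a whole dataset. The sketch below is an assumption about how such a check could look, not an official readiness API; `endpoint_is_ready` is a hypothetical helper that only relies on the client exposing a `text_generation` method, as `InferenceClient` from `huggingface_hub` does.

```python
from typing import Any


def endpoint_is_ready(client: Any, probe: str = "Hello") -> bool:
    """Return True if a one-token generation request succeeds, False otherwise."""
    try:
        output = client.text_generation(probe, max_new_tokens=1)
    except Exception:
        # Covers timeouts and HTTP errors, e.g. while the endpoint is still starting.
        return False
    return isinstance(output, str)


# Usage (requires huggingface_hub and a deployed endpoint):
#   from huggingface_hub import InferenceClient
#   client = InferenceClient("<HF_INFERENCE_ENDPOINT_URL>", token="<HF_TOKEN>")
#   print("ready" if endpoint_is_ready(client) else "not ready yet")
```

Catching every exception is deliberately coarse here; in practice you may want to distinguish authentication errors from an endpoint that is still initializing.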

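✨ Generate suggestions for Argilla

Before sending requests to the Inference Endpoint, we should know which system prompt to use and how to format our prompts. In this case, since we are using meta-llama/llama-2-7b-chat-hf, we need to look up the prompt format used to fine-tune it and replicate that format when sending inference requests. For more information about Llama 2, see the Hugging Face blog post Llama 2 is here - get it on Hugging Face.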

system_prompt = (
  "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible,"
  " while being safe. Your answers should not include any harmful, unethical, racist, sexist,"
  " toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased"
  " and positive in nature.\nIf a question does not make any sense, or is not factually coherent,"
  " explain why instead of answering something not correct. If you don't know the answer to a"
  " question, please don't share false information."
)
base_prompt = "<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"
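
To make the format concrete, here is a small sketch that renders one user turn with the same template as `base_prompt`. The function name `build_llama2_prompt` and the short example system prompt are illustrative assumptions, not part of any library.

```python
# Same template string as `base_prompt` in this post.
LLAMA2_CHAT_TEMPLATE = "<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"


def build_llama2_prompt(prompt: str, system_prompt: str) -> str:
    """Wrap a single user turn in the Llama 2 chat format."""
    return LLAMA2_CHAT_TEMPLATE.format(
        system_prompt=system_prompt.strip(),
        prompt=prompt.strip(),
    )


example = build_llama2_prompt(
    prompt="Name three primary colors.",
    system_prompt="You are a concise assistant.",
)
print(example)
```

The system prompt sits between the `<<SYS>>` and `<</SYS>>` markers, and the user instruction follows it inside the same `[INST] ... [/INST]` pair.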

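With the prompt defined, we can instantiate the InferenceClient from huggingface_hub, which we will later use to send requests to the deployed Inference Endpoint through its text_generation method.

The following snippet shows how to retrieve an existing Feedback Dataset from our Argilla instance and how to use the InferenceClient from huggingface_hub to send requests to the deployed Inference Endpoint, adding suggestions to the records in the dataset.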

import argilla as rg
from huggingface_hub import InferenceClient

# Connect to Argilla and retrieve the dataset that was pushed earlier.
rg.init(api_url="<ARGILLA_SPACE_URL>", api_key="<ARGILLA_OWNER_API_KEY>")
dataset = rg.FeedbackDataset.from_argilla("<ARGILLA_DATASET>", workspace="<ARGILLA_WORKSPACE>")

# Client pointing at the deployed Inference Endpoint.
client = InferenceClient("<HF_INFERENCE_ENDPOINT_URL>", token="<HF_TOKEN>")

system_prompt = (
    "You are a helpful, respectful and honest assistant. Always answer as helpfully as possible,"
    " while being safe. Your answers should not include any harmful, unethical, racist, sexist,"
    " toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased"
    " and positive in nature.\nIf a question does not make any sense, or is not factually coherent,"
    " explain why instead of answering something not correct. If you don't know the answer to a"
    " question, please don't share false information."
)
base_prompt = "<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{prompt} [/INST]"

def generate_response(prompt: str) -> str:
    # Wrap the record's prompt in the Llama 2 chat format before sending it.
    prompt = base_prompt.format(system_prompt=system_prompt, prompt=prompt)
    response = client.text_generation(
        prompt, details=True, max_new_tokens=512, top_k=30, top_p=0.9,
        temperature=0.2, repetition_penalty=1.02, stop_sequences=["</s>"],
    )
    # With details=True the client returns an object; the text is in generated_text.
    return response.generated_text

for record in dataset.records:
    record.update(
        suggestions=[
            {
                # Must match a question defined in the dataset; for the dataset created
                # earlier in this post, the free-text question is "completion-edit".
                "question_name": "completion-edit",
                "value": generate_response(prompt=record.fields["prompt"]),
                "type": "model",
                "agent": "llama-2-7b-chat-hf",
            },
        ],
    )
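
Requests to a remote endpoint can fail transiently, so it may be worth wrapping the generation call in a retry loop and skipping records that still fail. This is a minimal sketch under that assumption; `with_retries` and `build_suggestion` are hypothetical helpers, and the retry counts and delays are arbitrary illustrative defaults.

```python
import time
from typing import Callable, Dict, Optional


def with_retries(
    generate: Callable[[str], str],
    prompt: str,
    attempts: int = 3,
    base_delay: float = 1.0,
) -> Optional[str]:
    """Call generate(prompt), retrying with exponential backoff; None if every attempt fails."""
    for attempt in range(attempts):
        try:
            return generate(prompt)
        except Exception:
            # Wait before the next attempt, but not after the last one.
            if attempt < attempts - 1:
                time.sleep(base_delay * 2 ** attempt)
    return None


def build_suggestion(question_name: str, value: str, agent: str) -> Dict[str, str]:
    """Build a suggestion dictionary with the same keys used in the loop above."""
    return {
        "question_name": question_name,
        "value": value.strip(),
        "type": "model",
        "agent": agent,
    }
```

Inside the loop, one could call `with_retries(generate_response, record.fields["prompt"])` and only update the record when the result is not `None`.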

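Note: The predefined system prompt may not suit every use case, so we can apply prompt engineering techniques to adapt it to our specific needs.

If we go back to the Argilla instance after generating suggestions with the Inference Endpoint, we will see the following in the UI.

Finally, it is time for annotators to review the records in the Argilla dataset, answer the questions, and submit, edit, or discard the suggestions as needed.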

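➡️ Next steps

Injecting machine-generated suggestions into Argilla with Hugging Face Inference Endpoints is quick and easy. You can now experiment with your favorite machine learning framework and generate suggestions tailored to your specific use case!

Suggestions can be generated for any question. You only need to find the model that best fits your use case and the questions defined in your Feedback Dataset in Argilla.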
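
For a label question such as `prompt-quality` from the dataset defined earlier, the model's free-text answer has to be mapped onto one of the allowed labels before it can be used as a suggestion. The sketch below assumes single-word labels and a model that was asked to answer "yes" or "no"; `parse_label` is a hypothetical helper, and how to phrase that instruction to the model is left open.

```python
import re
from typing import Optional, Sequence


def parse_label(generated_text: str, labels: Sequence[str] = ("yes", "no")) -> Optional[str]:
    """Return the first allowed single-word label found in the model output, else None."""
    allowed = {label.lower(): label for label in labels}
    # Compare whole words only, so "know" or "not" are not mistaken for "no".
    for word in re.findall(r"[a-z]+", generated_text.lower()):
        if word in allowed:
            return allowed[word]
    return None
```

Returning `None` for ambiguous output lets the caller skip the suggestion and leave that question entirely to the annotator.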

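Suggestions have many use cases, and we are very excited about the role machine feedback can play in LLM workflows. We would love to hear your ideas! We strongly encourage you to join our Slack community and share your thoughts on this post or anything else you would like to discuss.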
