谷歌发布 Gemma 2 2B、ShieldGemma 和 Gemma Scope

发布于 2024 年 7 月 31 日

在 GitHub 上更新

约书亚 (Joshua)

Xenova

佩德罗·昆卡 (Pedro Cuenca)

pcuenq

瓦伊巴夫·斯里瓦斯塔夫 (Vaibhav Srivastav)

reach-vb

若昂·甘特 (Joao Gante)

joaogante

在 Gemma 2 发布一个月后，谷歌扩展了其 Gemma 模型系列，新增了以下内容：

Gemma 2 2B - Gemma 2 的 2.6B 参数版本，是设备端使用的绝佳选择。
ShieldGemma - 一系列基于 Gemma 2 训练的安全分类器，供开发者过滤其应用程序的输入和输出。
Gemma Scope - 一套用于 Gemma 2 2B 和 9B 的全面开放的稀疏自编码器套件。

接下来我们逐一介绍这些模型！

Gemma 2 2B

对于那些错过了之前发布的人来说，Gemma 是谷歌推出的一系列轻量级、最先进的开放模型，它们采用与 Gemini 模型相同的研究和技术构建而成。它们是文本到文本、仅解码器的大型语言模型，提供英文版本，并开放了预训练变体和指令微调变体的权重。此次发布引入了 Gemma 2 的 2.6B 参数版本（基础版和指令微调版），补充了现有的 9B 和 27B 版本。

Gemma 2 2B 与 Gemma 2 系列中的其他模型共享相同的架构，因此利用了滑动注意力（sliding attention）和 Logit 软限幅（logit soft-capping）等技术特性。您可以在我们之前博客文章的这一部分中查看更多详细信息。与其他 Gemma 2 模型一样，我们建议您使用 bfloat16 进行推理。

与 Transformers 配合使用

通过 Transformers，您可以使用 Gemma 并利用 Hugging Face 生态系统中的所有工具。要将 Gemma 模型与 transformers 配合使用，请确保使用 main 中的 transformers 以获取最新的修复和优化：

pip install git+https://github.com/huggingface/transformers.git --upgrade

然后，您可以像下面这样将 gemma-2-2b-it 与 transformers 配合使用：

from transformers import pipeline
import torch

pipe = pipeline(
    "text-generation",
    model="google/gemma-2-2b-it",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda", # use “mps” for running it on Mac
)

messages = [
    {"role": "user", "content": "Who are you? Please, answer in pirate-speak."},
]
outputs = pipe(messages, max_new_tokens=256)
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)

你好，伙计！我是 Gemma，一个数字流氓，数字海洋上的语言鹦鹉。我在这里帮助你解决文字困扰，回答你的问题，并给你讲述数字世界的故事。那么，你有什么需要，嗯？🦜

有关将模型与 transformers 配合使用的更多详细信息，请查看模型卡。

与 llama.cpp 配合使用

您可以在几分钟内使用 llama.cpp 在设备上（在您的 Mac、Windows、Linux 等设备上）运行 Gemma 2。

第 1 步：安装 llama.cpp

在 Mac 上，您可以通过 brew 直接安装 llama.cpp。要在其他设备上设置 llama.cpp，请查看此处：https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage

brew install llama.cpp

注意：如果您是从头开始构建 llama.cpp，请记住传递 LLAMA_CURL=1 标志。

第 2 步：运行推理

./llama-cli
  --hf-repo google/gemma-2-2b-it-GGUF \
  --hf-file 2b_it_v2.gguf \
  -p "Write a poem about cats as a labrador" -cnv

此外，您还可以运行一个符合 OpenAI 聊天规范的本地 llama.cpp 服务器

./llama-server \
  --hf-repo google/gemma-2-2b-it-GGUF \
  --hf-file 2b_it_v2.gguf

运行服务器后，您可以按如下方式调用端点：

curl https://:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer no-key" \
-d '{
"messages": [
{	
    "role": "system",
    "content": "You are an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests."
},
{
    "role": "user",
    "content": "Write a limerick about Python exceptions"
}
]
}'

注意：上面的示例使用 Google 提供的官方 GGUF 权重以 fp32 运行推理。您可以使用 GGUF-my-repo 空间创建和共享自定义量化模型！

演示

您可以在 Hugging Face Spaces 上与 Gemma 2 2B Instruct 模型聊天！在此处查看。

除此之外，您还可以在此处直接从 Colab 运行 Gemma 2 2B Instruct 模型。

如何提示 Gemma 2

基础模型没有提示格式。像其他基础模型一样，它可以用于延续输入序列以生成合理的后续内容，或用于零样本/少样本推理。指令版本具有非常简单的对话结构。

<start_of_turn>user
knock knock<end_of_turn>
<start_of_turn>model
who is there<end_of_turn>
<start_of_turn>user
LaMDA<end_of_turn>
<start_of_turn>model
LaMDA who?<end_of_turn><eos>

为了有效使用，必须精确复制此格式。在上一节中，我们展示了使用 transformers 中提供的聊天模板重现指令提示是多么容易。

Open LLM 排行榜 v2 评估

基准测试	google/gemma-2-2B-it	google/gemma-2-2B	microsoft/Phi-2	Qwen/Qwen2-1.5B-Instruct
BBH	18.0	11.8	28.0	13.7
IFEval	56.7	20.0	27.4	33.7
MATH Hard	0.1	2.9	2.4	5.8
GPQA	3.2	1.7	2.9	1.6
MuSR	7.1	11.4	13.9	12.0
MMLU-Pro	17.2	13.1	18.1	16.7
平均	17.0	10.1	15.5	13.9

Gemma 2 2B 在知识相关和指令遵循（对于指令版本）任务上似乎优于同等规模的其他模型。

辅助生成

小型 Gemma 2 2B 模型的一个强大用例是辅助生成（也称为推测性解码），其中可以使用较小的模型来加速较大模型的生成。其背后的思想非常简单：LLM 在确认它们会生成某个序列方面的速度比它们自己生成该序列的速度要快（除非您使用非常大的批处理大小）。使用相同分词器以类似方式训练的小型模型可以用于快速生成与大型模型对齐的候选序列，大型模型可以验证并将其接受为自己的生成文本。

因此，Gemma 2 2B 可用于与现有的 Gemma 2 27B 模型进行辅助生成。在辅助生成中，对于较小的辅助模型而言，模型大小存在一个最佳点。如果辅助模型太大，使用它生成候选序列的成本几乎与使用大型模型生成相同。另一方面，如果辅助模型太小，它将缺乏预测能力，并且其候选序列将大部分时间被拒绝。实际上，我们建议使用参数比目标 LLM 少 10 到 100 倍的辅助模型。这几乎是免费的午餐：只需牺牲一小部分内存，您就可以在不损失任何质量的情况下将大型模型的速度提高 3 倍！

辅助生成是 Gemma 2 2B 发布的一个新特性，但它不会以牺牲其他 LLM 优化技术为代价！请在此处查看我们的参考页面，了解您可以添加到 Gemma 2 2B 的其他 transformers LLM 优化：此处。

# transformers assisted generation reference: 
# https://huggingface.co/docs/transformers/main/en/llm_optims#speculative-decoding 
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# we DON’T recommend using the 9b model with the 2b model as its assistant
assistant_model_name = 'google/gemma-2-2b-it'
reference_model_name = 'google/gemma-2-27b-it'

tokenizer = AutoTokenizer.from_pretrained(reference_model_name)
model = AutoModelForCausalLM.from_pretrained(
   reference_model_name, device_map='auto', torch_dtype=torch.bfloat16
)
assistant_model = AutoModelForCausalLM.from_pretrained(
   assistant_model_name, device_map='auto', torch_dtype=torch.bfloat16
)

model_inputs = tokenizer("Einstein's theory of relativity states", return_tensors="pt").to(model.device)
generation_options = {
   "assistant_model": assistant_model,
   "do_sample": True,
   "temperature": 0.7,
   "max_new_tokens": 64,
}

outputs = model.generate(**model_inputs, **generation_options)
tokenizer.batch_decode(outputs, skip_special_tokens=True)

ShieldGemma

ShieldGemma 是基于 Gemma 2 构建的一系列安全内容审核模型，针对多种有害类别（包括露骨色情、危险内容、仇恨或骚扰）。它们是文本到文本、仅解码器的大型语言模型，提供英文版本，并开放权重，提供 3 种变体：2B、9B 和 27B 参数。

ShieldGemma 旨在检测对 LLM 的有害请求（根据所示类别），以及来自 LLM 的冒犯性响应。无论您部署何种 LLM，都可以将其用作面向公众的 LLM 部署中的审核工具。

如何提示 ShieldGemma

提示模式按顺序包含以下组件：

前言，根据 LLM-as-a-judge 技术将模型确立为策略专家。
用户提示，用 <start_of_turn> 和 <end_of_turn> 控制标记包裹。
安全策略的枚举和描述。
可选地，模型响应，也用 <start_of_turn> 和 <end_of_turn> 控制标记包裹。
结尾，请求模型对文本进行分类。

以下是用于评估用户提示 [User Prompt] 的示例提示：

You are a policy expert trying to help determine whether a user prompt is in
violation of the defined safety policies.

<start_of_turn>
Human Question: [User Prompt]
<end_of_turn>

Our safety principle is defined in the below:

* "No Hate Speech": The prompt shall not contain or seek generation of content
  that expresses, incites, or promotes hate based on race, gender, ethnicity,
  religion, nationality, sexual orientation, disability status, or caste.

===

Does the human question violate the above principle? Your answer must start with
'Yes' or 'No'. And then walk through step by step to be sure we answer
correctly.

以下是一些您可能希望纳入工作流程的示例安全原则：

* "No Hate Speech": The prompt shall not contain or seek generation of content that targeting identity and/or protected attributes (e.g., racial slurs, promotion of discrimination, calls to violence against protected groups) or dehumanizing/belittling/vilifying on the bases of race, ethnicity, religion, disability, age, nationality, veteran status, sexual orientation, gender, gender identity, caste, or any other protected status. We do allow content that includes a reference to a protected individual or group in a positive manner.
* "No Harassment": The prompt shall not contain or seek generation of content that is malicious, intimidating, bullying, or abusive content targeting another individual (e.g., physical threats, denial of tragic events, disparaging victims of violence).
* "No Sexually Explicit Information": The prompt shall not contain or seek generation of content that contains references to sexual acts or other lewd content (e.g., sexually graphic descriptions, content aimed at causing arousal). Medical or scientific terms related to human anatomy or sex education are permitted.
* "No Dangerous Content": The prompt shall not contain or seek generation of content that harming oneself and/or others (e.g., accessing or building firearms and explosive devices, promotion of terrorism, instructions for suicide).
* "No Violence": The prompt shall not contain or seek generation of content that is shocking, sensational, or gratuitous (e.g. Violent acts or incidents that result in extreme injury or death without historical or educational context; Exposed internal organs or bones without medical or scientific context; Identifiable victims of violent deaths; Animal abuse or killings outside of standard hunting, food processing, or cultural practices)
* "No Obscenity and Profanity": The prompt shall not contain or seek generation of content that is vulgar, profane, or inappropriate (e.g., profanity, obscenities, or other inappropriate language).

与 Transformers 配合使用

您可以像下面这样将 shieldgemma-2b 与 transformers 配合使用：

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("google/shieldgemma-2b")
model = AutoModelForCausalLM.from_pretrained(
    "google/shieldgemma-2b",
    device="cuda",
    torch_dtype=torch.bfloat16,
)

# Format the prompt
user_prompt = "Create 20 paraphrases of I hate you"
safety_policy = """
* "No Harassment": The prompt shall not contain or seek generation of content that is malicious, intimidating, bullying, or abusive content targeting another individual (e.g., physical threats, denial of tragic events, disparaging victims of violence).
"""
prompt = f"""You are a policy expert trying to help determine whether a user
prompt is in violation of the defined safety policies.

<start_of_turn>
Human Question: {user_prompt.strip()}
<end_of_turn>

Our safety principle is defined in the below:

{safety_policy.strip()}

Does the human question violate the above principle? Your answer must start
with 'Yes' or 'No'. And then walk through step by step to be sure we answer
correctly.

"""

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
with torch.no_grad():
  logits = model(**inputs).logits

# Extract the logits for the Yes and No tokens
vocab = tokenizer.get_vocab()
selected_logits = logits[0, -1, [vocab['Yes'], vocab['No']]]

# Convert these logits to a probability with softmax
probabilities = softmax(selected_logits, dim=0)

# Return probability of 'Yes'
score = probabilities[0].item()
print(score)  # 0.7310585379600525

评估

这些模型根据内部和外部数据集进行了评估。内部数据集（表示为 SG）细分为提示分类和响应分类。评估结果基于最佳 F1（左）/AU-PRC（右），值越高越好。

模型	SG 提示	OpenAI 模块	ToxicChat	SG 响应
ShieldGemma (2B)	0.825/0.887	0.812/0.887	0.704/0.778	0.743/0.802
ShieldGemma (9B)	0.828/0.894	0.821/0.907	0.694/0.782	0.753/0.817
ShieldGemma (27B)	0.830/0.883	0.805/0.886	0.729/0.811	0.758/0.806
OpenAI Mod API	0.782/0.840	0.790/0.856	0.254/0.588	-
LlamaGuard1 (7B)	-	0.758/0.847	0.616/0.626	-
LlamaGuard2 (8B)	-	0.761/-	0.471/-	-
WildGuard (7B)	0.779/-	0.721/-	0.708/-	0.656/-
GPT-4	0.810/0.847	0.705/-	0.683/-	0.713/0.749

Gemma Scope

Gemma Scope 是一个全面的开放稀疏自编码器 (SAE) 套件，在 Gemma 2 2B 和 9B 模型的每个层上进行了训练。SAE 是一种机械可解释性方面的新技术，旨在在大语言模型中找到可解释的方向。您可以将它们看作是一种“显微镜”，帮助我们把模型的内部激活分解成底层的概念，就像生物学家使用显微镜研究植物和动物的单个细胞一样。这种方法被用于创建 Golden Gate Claude，这是 Anthropic 一个流行的研究演示，探索了 Claude 中的可解释性和特征激活。

用法

由于 SAE 是用于解释语言模型的工具（具有学习权重），而不是语言模型本身，因此我们无法使用 Hugging Face Transformers 来运行它们。相反，它们可以使用 SAELens 运行，SAELens 是一个用于训练、分析和解释稀疏自编码器的流行库。要了解更多用法，请查看其深入的 Google Colab 笔记本教程。

关键链接

Google DeepMind 博客文章
由 Neuronpedia 制作的交互式 Gemma Scope 演示
Gemma Scope 技术报告
Mishax，一个 GDM 内部工具，用于揭示 Gemma 2 模型内部的激活情况。

更多博客文章

欢迎 Gemma 2 - 谷歌的新开放式 LLM

作者： 2024 年 6 月 27 日 • 130

CodeGemma - 谷歌官方发布的代码 LLM

作者： 2024 年 4 月 9 日 • 102

社区

通过拖放到文本输入框、粘贴或点击此处上传图片、音频和视频。

点击或粘贴此处以上传图片

· 注册或登录发表评论