Model debugging toolboxes

This page lists all the debugging and model-addition tools used by the library, as well as the utility functions it provides.

Most of these tools are only useful if you are adding a new model to the library.

Model addition debugger

Model addition debugger - a context manager for model adders

This context manager is a power user tool intended for model adders. It tracks all forward calls within a model's forward pass and logs a slice of each input and output in a nested JSON. Notably, this context manager enforces torch.no_grad().

Rationale

When porting a model to Transformers, even from Python to Python, model adders often have to do a lot of manual work, including saving and loading tensors, comparing dtypes, and so on. This small tool aims to save some of that time.

Usage

Add this context manager as shown below to debug a model:

import torch
from PIL import Image
import requests
from transformers import LlavaProcessor, LlavaForConditionalGeneration
from transformers.model_debugging_utils import model_addition_debugger_context
torch.random.manual_seed(673)

# load pretrained model and processor
model_id = "llava-hf/llava-1.5-7b-hf"
processor = LlavaProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# create random image input
random_image = Image.fromarray(torch.randint(0, 256, (224, 224, 3), dtype=torch.uint8).numpy())

# prompt
prompt = "<image>Describe this image."

# process inputs
inputs = processor(text=prompt, images=random_image, return_tensors="pt")

# call forward method (not .generate!)
with model_addition_debugger_context(
    model,
    debug_path="optional_path_to_your_directory",
    do_prune_layers=False # This will output ALL the layers of a model.
):
    output = model.forward(**inputs)

Reading the results

The debugger generates two files from the forward call, both with the same base name but ending in either _SUMMARY.json or _FULL_TENSORS.json.

The first file contains a summary of each module's *input* and *output* tensor shapes and values.

{
  "module_path": "MolmoForConditionalGeneration",
  "inputs": {
    "args": [],
    "kwargs": {
      "input_ids": {
        "shape": "torch.Size([1, 589])",
        "dtype": "torch.int64"
      },
      "attention_mask": {
        "shape": "torch.Size([1, 589])",
        "dtype": "torch.int64"
      },
      "pixel_values": {
        "shape": "torch.Size([1, 5, 576, 588])",
        "dtype": "torch.float32",
        "mean": "tensor(-8.9514e-01, device='cuda:0')",
        "std": "tensor(9.2586e-01, device='cuda:0')",
        "min": "tensor(-1.7923e+00, device='cuda:0')",
        "max": "tensor(1.8899e+00, device='cuda:0')"
      }
    }
  },
  "children": [
    {
      "module_path": "MolmoForConditionalGeneration.language_model.model.embed_tokens",
      "inputs": {
        "args": [
          {
            "shape": "torch.Size([1, 589])",
            "dtype": "torch.int64"
          }
        ]
      },
      "outputs": {
        "shape": "torch.Size([1, 589, 3584])",
        "dtype": "torch.float32",
        "mean": "tensor(6.5460e-06, device='cuda:0')",
        "std": "tensor(2.3807e-02, device='cuda:0')",
        "min": "tensor(-3.3398e-01, device='cuda:0')",
        "max": "tensor(3.9453e-01, device='cuda:0')"
      }
    },
    {
      "module_path": "MolmoForConditionalGeneration.vision_tower",
      "inputs": {
        "args": [
          {
            "shape": "torch.Size([5, 1, 576, 588])",
            "dtype": "torch.float32",
            "mean": "tensor(-8.9514e-01, device='cuda:0')",
            "std": "tensor(9.2586e-01, device='cuda:0')",
            "min": "tensor(-1.7923e+00, device='cuda:0')",
            "max": "tensor(1.8899e+00, device='cuda:0')"
          }
        ],
        "kwargs": {
          "output_hidden_states": "True"
        }
      },
      "children": [
        { ... and so on
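
This summary tree can also be walked programmatically. A minimal sketch, assuming the file path below (the actual base name depends on the traced model class and the debug_path you passed):

import json

# Hypothetical path: adjust to match your debug_path and traced model class.
with open("optional_path_to_your_directory/MolmoForConditionalGeneration_SUMMARY.json") as f:
    summary = json.load(f)

def walk(node, depth=0):
    # Print each traced module together with its output shape, when one was recorded as a dict.
    outputs = node.get("outputs")
    shape = outputs.get("shape", "") if isinstance(outputs, dict) else ""
    print("  " * depth + node["module_path"], shape)
    for child in node.get("children", []):
        walk(child, depth + 1)

walk(summary)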

The _FULL_TENSORS.json file displays a full view of all tensors, which is useful for comparing two files.

      "pixel_values": {
        "shape": "torch.Size([1, 5, 576, 588])",
        "dtype": "torch.float32",
        "value": [
          "tensor([[[[-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          ...,",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00]],",
          "",
          "         [[-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          ...,",
          "          [-1.4857e+00, -1.4820e+00, -1.2100e+00,  ..., -6.0979e-01, -5.9650e-01, -3.8527e-01],",
          "          [-1.6755e+00, -1.7221e+00, -1.4518e+00,  ..., -7.5577e-01, -7.4658e-01, -5.5592e-01],",
          "          [-7.9957e-01, -8.2162e-01, -5.7014e-01,  ..., -1.3689e+00, -1.3169e+00, -1.0678e+00]],",
          "",
          "         [[-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          ...,",
          "          [-3.0322e-01, -5.0645e-01, -5.8436e-01,  ..., -6.2439e-01, -7.9160e-01, -8.1188e-01],",
          "          [-4.4921e-01, -6.5653e-01, -7.2656e-01,  ..., -3.4702e-01, -5.2146e-01, -5.1326e-01],",
          "          [-3.4702e-01, -5.3647e-01, -5.4170e-01,  ..., -1.0915e+00, -1.1968e+00, -1.0252e+00]],",
          "",
          "         [[-1.1207e+00, -1.2718e+00, -1.0678e+00,  ..., 1.2013e-01, -1.3126e-01, -1.7197e-01],",
          "          [-6.9738e-01, -9.1166e-01, -8.5454e-01,  ..., -5.5050e-02, -2.8134e-01, -4.2793e-01],",
          "          [-3.4702e-01, -5.5148e-01, -5.8436e-01,  ..., 1.9312e-01, -8.6235e-02, -2.1463e-01],",
          "          ...,",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00]],",
          "",
          "         [[-1.0039e+00, -9.5669e-01, -6.5546e-01,  ..., -1.4711e+00, -1.4219e+00, -1.1389e+00],",
          "          [-1.0039e+00, -9.5669e-01, -6.5546e-01,  ..., -1.7193e+00, -1.6771e+00, -1.4091e+00],",
          "          [-1.6317e+00, -1.6020e+00, -1.2669e+00,  ..., -1.2667e+00, -1.2268e+00, -8.9720e-01],",
          "          ...,",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00],",
          "          [-1.7923e+00, -1.7521e+00, -1.4802e+00,  ..., -1.7923e+00, -1.7521e+00, -1.4802e+00]]]], device='cuda:0')"
        ],
        "mean": "tensor(-8.9514e-01, device='cuda:0')",
        "std": "tensor(9.2586e-01, device='cuda:0')",
        "min": "tensor(-1.7923e+00, device='cuda:0')",
        "max": "tensor(1.8899e+00, device='cuda:0')"
      },

Saving tensors to disk

Some model adders may benefit from logging full tensor values to disk, for example to support numerical analysis across implementations.

Set use_repr=False to write tensors to disk using SafeTensors.

with model_addition_debugger_context(
    model,
    debug_path="optional_path_to_your_directory",
    do_prune_layers=False,
    use_repr=False,   # Defaults to True
):
    output = model.forward(**inputs)

When use_repr=False is used, tensors are written to the same disk location as the _SUMMARY.json and _FULL_TENSORS.json files. The value property of entries in the _FULL_TENSORS.json file contains a relative path reference to the associated .safetensors file. Each tensor is written to its own file as the data property of a state dict. File names are built from the module_path as a prefix, with a few possible postfixes constructed recursively:

  • Module inputs are denoted with _inputs and outputs with _outputs.
  • list and tuple instances, such as args or function return values, are postfixed with _{index}.
  • dict instances are postfixed with _{key}.
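
A minimal sketch of loading such a file back, assuming hypothetical file names that merely illustrate the naming scheme above; each file stores a single tensor under the "data" key:

import torch
from safetensors.torch import load_file

# Hypothetical file names: module_path prefix + "_inputs"/"_outputs" + index/key postfixes.
ref = load_file("reference_run/MolmoForConditionalGeneration.vision_tower_inputs_0.safetensors")["data"]
new = load_file("ported_run/MolmoForConditionalGeneration.vision_tower_inputs_0.safetensors")["data"]

# Compare the saved tensor against the corresponding one from another implementation.
print(torch.allclose(ref, new, atol=1e-5), (ref - new).abs().max())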

Comparing implementations

Once the debugger has traced the forward passes of two models, the JSON output files can be compared. In the example below, there are slight differences between the key projection layers of the two implementations: the inputs are mostly identical, but not quite. Looking through the file differences makes it easier to pinpoint which layer is wrong.

(Figure: side-by-side diff of the two implementations' JSON traces, highlighting the mismatch in the key projection layer.)
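
A textual diff can also be produced directly in Python. A minimal sketch, assuming hypothetical paths to the two _SUMMARY.json files:

import difflib
from pathlib import Path

# Hypothetical paths to the _SUMMARY.json files produced by tracing each implementation.
ref_lines = Path("reference_run/MolmoForConditionalGeneration_SUMMARY.json").read_text().splitlines()
new_lines = Path("ported_run/MolmoForConditionalGeneration_SUMMARY.json").read_text().splitlines()

# Mismatching shapes, dtypes, or statistics point to the first diverging module.
for line in difflib.unified_diff(ref_lines, new_lines, fromfile="reference", tofile="ported", lineterm=""):
    print(line)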

Limitations and scope

This feature only works for torch-based models; jax-based models, which are usually compiled, would require more work and case-by-case handling. Models that rely heavily on external kernel calls may work, but the trace will probably miss some things. In any case, any Python implementation that aims to mimic another one can be traced once, instead of being re-run N times with breakpoints set.

If you pass do_prune_layers=False to your model debugger, ALL layers will be output to the JSON. Otherwise, only the first and last layers will be shown. This is practical when some layers (typically cross-attention) only appear after the N-th layer.

transformers.model_addition_debugger_context


( model, debug_path: typing.Optional[str] = None, do_prune_layers: typing.Optional[bool] = True, use_repr: typing.Optional[bool] = True )

Model addition debugger - a context manager for model adders

This context manager is a power user tool intended for model adders.

It tracks all forward calls within a model's forward pass and logs a slice of each input and output in a nested JSON file. If use_repr=True (the default), the JSON file records a repr()-ized version of the tensors as a list of strings. If use_repr=False, the full tensors are stored in separate SafeTensors files and the JSON file provides a relative path to them.

Notably, this context manager enforces torch.no_grad().

Usage

Add the context manager to a model to debug:

import torch

from PIL import Image
from transformers import LlavaProcessor, LlavaForConditionalGeneration, model_addition_debugger_context

torch.random.manual_seed(673)

# load pretrained model and processor
model_id = "llava-hf/llava-1.5-7b-hf"
processor = LlavaProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# create random image input
random_image = Image.fromarray(torch.randint(0, 256, (224, 224, 3), dtype=torch.uint8).numpy())

# prompt
prompt = "<image>Describe this image."

# process inputs
inputs = processor(text=prompt, images=random_image, return_tensors="pt")

# call forward method (not .generate!)
with model_addition_debugger_context(model, debug_path="Your_debug_path", do_prune_layers=False):
    output = model.forward(**inputs)