Optimum Inference with Furiosa NPU

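Optimum Furiosa is a utility package for building and running inference on Furiosa NPUs. You can use Optimum to load optimized models from the Hugging Face Hub and create pipelines that run accelerated inference without rewriting your APIs.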

Switching from Transformers to Optimum Furiosa

The optimum.furiosa.FuriosaAIModelForXXX model classes are API-compatible with Hugging Face models. This means you can simply replace your AutoModelForXXX class with the corresponding FuriosaAIModelForXXX class from optimum.furiosa.

You do not need to adapt the rest of your code to make it work with the FuriosaAIModelForXXX classes.
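To make the naming convention concrete, here is a minimal sketch that derives the Furiosa class name from a Transformers auto class name. The helper furiosa_class_name is hypothetical and is not part of optimum.furiosa. It only rewrites the name and does not check that a matching class actually exists for a given task.

```python
def furiosa_class_name(auto_class_name: str) -> str:
    """Map an 'AutoModelForXXX' name to the 'FuriosaAIModelForXXX' name.

    Hypothetical helper for illustration only: it follows the naming
    pattern described above and does not verify that optimum.furiosa
    ships a class for the given task.
    """
    prefix = "AutoModelFor"
    if not auto_class_name.startswith(prefix):
        raise ValueError(
            f"expected a name starting with {prefix!r}, got {auto_class_name!r}"
        )
    # Keep the task suffix (the 'XXX' part) and swap only the prefix.
    return "FuriosaAIModelFor" + auto_class_name[len(prefix):]


print(furiosa_class_name("AutoModelForImageClassification"))
# prints: FuriosaAIModelForImageClassification
```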

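Because the model you want to use might not have been converted to ONNX yet, FuriosaAIModel includes a method to convert vanilla Hugging Face models to ONNX. Simply pass export=True to the from_pretrained method, and your model will be loaded and converted to ONNX on the fly.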
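As a rough sketch of what such a call can look like, the snippet below assembles the keyword arguments for an export. The input_shape_dict and output_shape_dict arguments, and the tensor names "pixel_values" and "logits", are taken from the ResNet-50 example in this guide. The helper names build_export_kwargs and load_exported_classifier are hypothetical, and the default shapes assume a 224x224 RGB input with 1000 output classes.

```python
from typing import Any, Dict


def build_export_kwargs(
    batch_size: int = 1,
    channels: int = 3,
    height: int = 224,
    width: int = 224,
    num_labels: int = 1000,
) -> Dict[str, Any]:
    """Assemble from_pretrained keyword arguments for an ONNX export.

    Hypothetical helper: the defaults mirror the ResNet-50 example in
    this guide and must be adjusted for other models.
    """
    return {
        "export": True,  # convert the vanilla model to ONNX while loading
        "input_shape_dict": {"pixel_values": [batch_size, channels, height, width]},
        "output_shape_dict": {"logits": [batch_size, num_labels]},
    }


def load_exported_classifier(model_id: str, **shape_options: int):
    """Load and export an image classifier (requires optimum.furiosa)."""
    # Imported lazily so that build_export_kwargs can be used and tested
    # on machines where optimum.furiosa is not installed.
    from optimum.furiosa import FuriosaAIModelForImageClassification

    return FuriosaAIModelForImageClassification.from_pretrained(
        model_id, **build_export_kwargs(**shape_options)
    )


kwargs = build_export_kwargs()
print(kwargs["input_shape_dict"])
# prints: {'pixel_values': [1, 3, 224, 224]}
```

Keeping the shapes in one place makes it harder for the input and output batch sizes to drift apart when you change them.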

Loading and running inference with a vanilla Transformers model

import requests
from PIL import Image

- from transformers import AutoModelForImageClassification
+ from optimum.furiosa import FuriosaAIModelForImageClassification
from transformers import AutoFeatureExtractor, pipeline

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

model_id = "microsoft/resnet-50"
- model = AutoModelForImageClassification.from_pretrained(model_id)
+ model = FuriosaAIModelForImageClassification.from_pretrained(
+     model_id,
+     export=True,
+     input_shape_dict={"pixel_values": [1, 3, 224, 224]},
+     output_shape_dict={"logits": [1, 1000]},
+ )
feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
cls_pipe = pipeline("image-classification", model=model, feature_extractor=feature_extractor)
outputs = cls_pipe(image)
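The Transformers image-classification pipeline returns a list of dictionaries, each holding a "label" and a "score". The sketch below shows one way to pick the most likely class from such a list. The helper top_prediction is hypothetical, and the sample values are made up for illustration. They are not real model output.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def top_prediction(outputs: List[Prediction]) -> Prediction:
    """Return the prediction with the highest score."""
    if not outputs:
        raise ValueError("pipeline returned no predictions")
    return max(outputs, key=lambda prediction: prediction["score"])


# Made-up values in the shape the pipeline returns; not real model output.
sample_outputs = [
    {"label": "tabby, tabby cat", "score": 0.61},
    {"label": "tiger cat", "score": 0.22},
    {"label": "Egyptian cat", "score": 0.09},
]

best = top_prediction(sample_outputs)
print(f"{best['label']}: {best['score']:.2f}")
# prints: tabby, tabby cat: 0.61
```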

Pushing the compiled model to the Hugging Face Hub

As with a regular PreTrainedModel, you can also push your FuriosaAIModelForXXX to the Hugging Face Model Hub:

>>> from optimum.furiosa import FuriosaAIModelForImageClassification

>>> # Load the model from the hub
>>> model = FuriosaAIModelForImageClassification.from_pretrained(
...     "microsoft/resnet-50", export=True, input_shape_dict={"pixel_values": [1, 3, 224, 224]}, output_shape_dict={"logits": [1, 1000]},
... )

>>> # Save the converted model
>>> model.save_pretrained("a_local_path_for_compiled_model")

>>> # Push the compiled model to the HF Hub
>>> model.push_to_hub(
...   "a_local_path_for_compiled_model", repository_id="my-furiosa-repo", use_auth_token=True
... )