Inference Providers documentation
HF Inference
All supported HF Inference models can be found here.
HF Inference is the serverless Inference API powered by Hugging Face. This service used to be called "Inference API (serverless)" prior to Inference Providers. If you are interested in deploying models to dedicated and autoscaling infrastructure managed by Hugging Face, check out Inference Endpoints instead.
As of July 2025, hf-inference focuses mostly on CPU inference (e.g. embeddings, text ranking, text classification, or smaller LLMs of historical interest like BERT or GPT-2).
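The OpenAI-compatible examples below select this provider by suffixing the model id with :hf-inference (e.g. HuggingFaceTB/SmolLM3-3B:hf-inference, as accepted by the https://router.huggingface.co/v1 endpoint). As a minimal sketch, a hypothetical helper that builds such a provider-qualified id:

```python
def with_provider(model_id: str, provider: str = "hf-inference") -> str:
    """Append a provider suffix to a Hub model id, producing the
    "model:provider" form used in the chat completion examples below.
    (Hypothetical helper, for illustration only.)"""
    return f"{model_id}:{provider}"

qualified = with_provider("HuggingFaceTB/SmolLM3-3B")
print(qualified)  # HuggingFaceTB/SmolLM3-3B:hf-inference
```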
Supported tasks
Automatic Speech Recognition
Learn more about automatic speech recognition here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
output = client.automatic_speech_recognition("sample1.flac", model="openai/whisper-large-v3")
Chat Completion (LLM)
Learn more about chat completion (LLM) here.
import os
from openai import OpenAI
client = OpenAI(
base_url="https://router.huggingface.co/v1",
api_key=os.environ["HF_TOKEN"],
)
completion = client.chat.completions.create(
model="HuggingFaceTB/SmolLM3-3B:hf-inference",
messages=[
{
"role": "user",
"content": "What is the capital of France?"
}
],
)
print(completion.choices[0].message)
Feature Extraction
Learn more about feature extraction here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.feature_extraction(
"Today is a sunny day and I will get some ice cream.",
model="intfloat/multilingual-e5-large",
)
Fill-Mask
Learn more about fill-mask here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.fill_mask(
"The answer to the universe is [MASK].",
model="google-bert/bert-base-uncased",
)
Image Classification
Learn more about image classification here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
output = client.image_classification("cats.jpg", model="Falconsai/nsfw_image_detection")
Image Segmentation
Learn more about image segmentation here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
output = client.image_segmentation("cats.jpg", model="fashn-ai/fashn-human-parser")
Object Detection
Learn more about object detection here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
output = client.object_detection("cats.jpg", model="facebook/detr-resnet-50")
Question Answering
Learn more about question answering here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
answer = client.question_answering(
question="What is my name?",
context="My name is Clara and I live in Berkeley.",
model="deepset/roberta-base-squad2",
)
Summarization
Learn more about summarization here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.summarization(
"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct.",
model="facebook/bart-large-cnn",
)
Table Question Answering
Learn more about table question answering here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
answer = client.table_question_answering(
query="How many stars does the transformers repository have?",
table={"Repository":["Transformers","Datasets","Tokenizers"],"Stars":["36542","4512","3934"],"Contributors":["651","77","34"],"Programming language":["Python","Python","Rust, Python and NodeJS"]},
model="google/tapas-base-finetuned-wtq",
)
Text Classification
Learn more about text classification here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.text_classification(
"I like you. I love you",
model="BAAI/bge-reranker-v2-m3",
)
Text Generation
Learn more about text generation here.
import os
from openai import OpenAI
client = OpenAI(
base_url="https://router.huggingface.co/hf-inference/models/HuggingFaceTB/SmolLM3-3B",
api_key=os.environ["HF_TOKEN"],
)
completion = client.chat.completions.create(
model="HuggingFaceTB/SmolLM3-3B",
messages=[
{
"role": "user",
"content": "Can you please let us know more details about your "
}
],
)
print(completion.choices[0].message)
Text To Image
Learn more about text to image here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
# output is a PIL.Image object
image = client.text_to_image(
"Astronaut riding a horse",
model="black-forest-labs/FLUX.1-dev",
)
Token Classification
Learn more about token classification here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.token_classification(
"My name is Sarah Jessica Parker but you can call me Jessica",
model="dslim/bert-base-NER",
)
Translation
Learn more about translation here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.translation(
"Меня зовут Вольфганг и я живу в Берлине",
model="google-t5/t5-large",
)
Zero-Shot Classification
Learn more about zero-shot classification here.
import os
import requests
API_URL = "https://router.huggingface.co/hf-inference/models/facebook/bart-large-mnli"
headers = {
"Authorization": f"Bearer {os.environ['HF_TOKEN']}",
}
def query(payload):
response = requests.post(API_URL, headers=headers, json=payload)
return response.json()
output = query({
"inputs": "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!",
"parameters": {"candidate_labels": ["refund", "legal", "faq"]},
})
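The JSON returned by zero-shot classification endpoints typically pairs each candidate label with a score. A minimal, hedged sketch of picking the top label, using a made-up response in the usual labels/scores shape (the values here are illustrative, not real model output):

```python
# A made-up response in the labels/scores shape zero-shot
# classification endpoints commonly return (scores are illustrative).
sample_output = {
    "sequence": "Hi, I recently bought a device from your company...",
    "labels": ["refund", "faq", "legal"],
    "scores": [0.88, 0.09, 0.03],
}

# Pair each label with its score and keep the highest-scoring one.
top_label, top_score = max(
    zip(sample_output["labels"], sample_output["scores"]),
    key=lambda pair: pair[1],
)
print(top_label)  # refund
```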
