Inference Providers documentation
HF Inference
All supported HF Inference models can be found here.
HF Inference is the serverless Inference API powered by Hugging Face. This service used to be called "Inference API (serverless)" prior to Inference Providers. If you are interested in deploying models to dedicated and autoscaling infrastructure managed by Hugging Face, check out Inference Endpoints instead.
As of July 2025, hf-inference focuses mostly on CPU inference (e.g. embeddings, text ranking, text classification, or smaller LLMs of historical interest like BERT or GPT-2).
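The OpenAI-compatible examples below select this provider by suffixing the model id with :hf-inference (e.g. HuggingFaceTB/SmolLM3-3B:hf-inference, as accepted by the https://router.huggingface.co/v1 endpoint). As a minimal sketch, a hypothetical helper that builds such a provider-qualified id:

```python
def with_provider(model_id: str, provider: str = "hf-inference") -> str:
    """Append a provider suffix to a Hub model id, producing the
    "model:provider" form used in the chat completion examples below.
    (Hypothetical helper, for illustration only.)"""
    return f"{model_id}:{provider}"

qualified = with_provider("HuggingFaceTB/SmolLM3-3B")
print(qualified)  # HuggingFaceTB/SmolLM3-3B:hf-inference
```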
Supported tasks
Automatic Speech Recognition
Learn more about automatic speech recognition here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
output = client.automatic_speech_recognition("sample1.flac", model="openai/whisper-large-v3")
Chat Completion (LLM)
Learn more about chat completion (LLM) here.
import os
from openai import OpenAI
client = OpenAI(
base_url="https://router.huggingface.co/v1",
api_key=os.environ["HF_TOKEN"],
)
completion = client.chat.completions.create(
model="HuggingFaceTB/SmolLM3-3B:hf-inference",
messages=[
{
"role": "user",
"content": "What is the capital of France?"
}
],
)
print(completion.choices[0].message)
Feature Extraction
Learn more about feature extraction here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.feature_extraction(
"Today is a sunny day and I will get some ice cream.",
model="intfloat/multilingual-e5-large",
)
Fill-Mask
Learn more about fill-mask here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.fill_mask(
"The answer to the universe is [MASK].",
model="google-bert/bert-base-uncased",
)
Image Classification
Learn more about image classification here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
output = client.image_classification("cats.jpg", model="Falconsai/nsfw_image_detection")
Image Segmentation
Learn more about image segmentation here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
output = client.image_segmentation("cats.jpg", model="fashn-ai/fashn-human-parser")
Object Detection
Learn more about object detection here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
output = client.object_detection("cats.jpg", model="facebook/detr-resnet-50")
Question Answering
Learn more about question answering here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
answer = client.question_answering(
question="What is my name?",
context="My name is Clara and I live in Berkeley.",
model="deepset/roberta-base-squad2",
)
Summarization
Learn more about summarization here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.summarization(
"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct.",
model="facebook/bart-large-cnn",
)
Table Question Answering
Learn more about table question answering here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
answer = client.table_question_answering(
query="How many stars does the transformers repository have?",
table={"Repository":["Transformers","Datasets","Tokenizers"],"Stars":["36542","4512","3934"],"Contributors":["651","77","34"],"Programming language":["Python","Python","Rust, Python and NodeJS"]},
model="google/tapas-base-finetuned-wtq",
)
Text Classification
Learn more about text classification here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.text_classification(
"I like you. I love you",
model="BAAI/bge-reranker-v2-m3",
)
Text Generation
Learn more about text generation here.
import os
from openai import OpenAI
client = OpenAI(
base_url="https://router.huggingface.co/hf-inference/models/HuggingFaceTB/SmolLM3-3B",
api_key=os.environ["HF_TOKEN"],
)
completion = client.chat.completions.create(
model="HuggingFaceTB/SmolLM3-3B",
messages=[
{
"role": "user",
"content": "Can you please let us know more details about your "
}
],
)
print(completion.choices[0].message)
Text To Image
Learn more about text to image here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
# output is a PIL.Image object
image = client.text_to_image(
"Astronaut riding a horse",
model="black-forest-labs/FLUX.1-dev",
)
Token Classification
Learn more about token classification here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.token_classification(
"My name is Sarah Jessica Parker but you can call me Jessica",
model="dslim/bert-base-NER",
)
Translation
Learn more about translation here.
import os
from huggingface_hub import InferenceClient
client = InferenceClient(
provider="hf-inference",
api_key=os.environ["HF_TOKEN"],
)
result = client.translation(
"Меня зовут Вольфганг и я живу в Берлине",
model="google-t5/t5-large",
)
Zero-Shot Classification
Learn more about zero-shot classification here.
import os
import requests
API_URL = "https://router.huggingface.co/hf-inference/models/facebook/bart-large-mnli"
headers = {
"Authorization": f"Bearer {os.environ['HF_TOKEN']}",
}
def query(payload):
response = requests.post(API_URL, headers=headers, json=payload)
return response.json()
output = query({
"inputs": "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!",
"parameters": {"candidate_labels": ["refund", "legal", "faq"]},
})
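The JSON returned by zero-shot classification endpoints typically pairs each candidate label with a score. A minimal, hedged sketch of picking the top label, using a made-up response in the usual labels/scores shape (the values here are illustrative, not real model output):

```python
# A made-up response in the labels/scores shape zero-shot
# classification endpoints commonly return (scores are illustrative).
sample_output = {
    "sequence": "Hi, I recently bought a device from your company...",
    "labels": ["refund", "faq", "legal"],
    "scores": [0.88, 0.09, 0.03],
}

# Pair each label with its score and keep the highest-scoring one.
top_label, top_score = max(
    zip(sample_output["labels"], sample_output["scores"]),
    key=lambda pair: pair[1],
)
print(top_label)  # refund
```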
