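🤗 Hugging Face Inference Endpoints
A TypeScript-powered wrapper for the Hugging Face Inference Endpoints API. Learn more about Inference Endpoints at Hugging Face. It works with both the Inference API (serverless) and Inference Endpoints (dedicated).
You can also try out a live interactive notebook, see some demos on hf.co/huggingfacejs, or watch a Scrimba tutorial that explains how Inference Endpoints work.
Getting Started
Install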
Node
npm install @huggingface/inference
pnpm add @huggingface/inference
yarn add @huggingface/inference
Deno
// esm.sh
import { HfInference } from "https://esm.sh/@huggingface/inference"
// or npm:
import { HfInference } from "npm:@huggingface/inference"
Initialize
import { HfInference } from '@huggingface/inference'
const hf = new HfInference('your access token')
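❗Important note: Using an access token is optional to get started, but you will eventually be rate limited. Join Hugging Face and then visit access tokens to generate your access token for free.
Your access token should be kept private. If you need to protect it in front-end applications, we suggest setting up a proxy server that stores the access token.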
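The proxy idea can be sketched as follows. This is only an illustration under stated assumptions, not part of the library: `buildUpstreamHeaders` is a hypothetical helper, and the token value is a placeholder. The point it shows is that the browser never holds the token; the server discards whatever Authorization header the client sent and attaches its own before forwarding the request.

```typescript
// Hypothetical helper for a proxy server: the token lives only on the server.
// Any Authorization header sent by the browser is dropped and replaced.
function buildUpstreamHeaders(
  clientHeaders: Record<string, string>,
  serverToken: string
): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const name of Object.keys(clientHeaders)) {
    // Header names are case-insensitive, so normalize before comparing.
    const lower = name.toLowerCase();
    if (lower !== "authorization") {
      headers[lower] = clientHeaders[name];
    }
  }
  headers["authorization"] = "Bearer " + serverToken;
  return headers;
}

// Example: the client sends a bogus token; the proxy supplies the real one.
const upstreamHeaders = buildUpstreamHeaders(
  { "Content-Type": "application/json", Authorization: "Bearer from-browser" },
  "hf_server_side_token" // placeholder; in practice read it from server-side configuration
);
```

A real proxy would also restrict which models and routes it forwards, so that it cannot be used as an open relay for your quota.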
Tree-shaking
You can import the functions you need directly from the module, instead of using the HfInference class.
import { textGeneration } from "@huggingface/inference";
await textGeneration({
accessToken: "hf_...",
model: "model_or_endpoint",
inputs: ...,
parameters: ...
})
This will enable tree-shaking by your bundler.
Natural Language Processing
Text Generation
Generates text from an input prompt.
await hf.textGeneration({
model: 'gpt2',
inputs: 'The answer to the universe is'
})
for await (const output of hf.textGenerationStream({
model: "google/flan-t5-xxl",
inputs: 'repeat "one two three four"',
parameters: { max_new_tokens: 250 }
})) {
console.log(output.token.text, output.generated_text);
}
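Text Generation (Chat Completion API Compatible)
Using the chatCompletion method, you can generate text with models that are compatible with the OpenAI Chat Completion API. All models served by TGI on Hugging Face support the Messages API.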
// Non-streaming API
const out = await hf.chatCompletion({
model: "mistralai/Mistral-7B-Instruct-v0.2",
messages: [{ role: "user", content: "Complete this sentence with words: one plus one is equal " }],
max_tokens: 500,
temperature: 0.1,
seed: 0,
});
// Streaming API
let out = "";
for await (const chunk of hf.chatCompletionStream({
model: "mistralai/Mistral-7B-Instruct-v0.2",
messages: [
{ role: "user", content: "Complete the equation 1+1= ,just the answer" },
],
max_tokens: 500,
temperature: 0.1,
seed: 0,
})) {
if (chunk.choices && chunk.choices.length > 0) {
out += chunk.choices[0].delta.content;
}
}
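Some Chat Completion-compatible servers may send chunks whose `delta` carries no `content` (for example a closing chunk). In that case a plain `+=` would append the string "undefined". The sketch below shows one defensive way to accumulate the text. `ChunkLike` is a simplified stand-in for the library's chunk type and `appendDelta` is a hypothetical helper; the chunks are simulated so the snippet runs without a network call.

```typescript
// Simplified stand-in for a streamed chunk; the real type in the library is richer.
interface ChunkLike {
  choices?: { delta: { content?: string | null } }[];
}

// Append the first choice's text to the accumulator, skipping chunks without content.
function appendDelta(acc: string, chunk: ChunkLike): string {
  const first = chunk.choices && chunk.choices.length > 0 ? chunk.choices[0] : undefined;
  const piece = first ? first.delta.content : undefined;
  return typeof piece === "string" ? acc + piece : acc;
}

// Simulated chunks, standing in for what chatCompletionStream would yield.
const simulatedChunks: ChunkLike[] = [
  { choices: [{ delta: { content: "1+1" } }] },
  { choices: [{ delta: {} }] }, // no content: a plain `+=` would append "undefined"
  { choices: [{ delta: { content: "=2" } }] },
  { choices: [] },
];
const assembled = simulatedChunks.reduce(appendDelta, "");
```

Inside a real streaming loop, the same helper could be used as `out = appendDelta(out, chunk)`.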
It is also possible to call Mistral or OpenAI endpoints directly:
const openai = new HfInference(OPENAI_TOKEN).endpoint("https://api.openai.com");
let out = "";
for await (const chunk of openai.chatCompletionStream({
model: "gpt-3.5-turbo",
messages: [
{ role: "user", content: "Complete the equation 1+1= ,just the answer" },
],
max_tokens: 500,
temperature: 0.1,
seed: 0,
})) {
if (chunk.choices && chunk.choices.length > 0) {
out += chunk.choices[0].delta.content;
}
}
// For mistral AI:
// endpointUrl: "https://api.mistral.ai"
// model: "mistral-tiny"
Fill Mask
Tries to fill in a hole with a missing word (a token, to be precise).
await hf.fillMask({
model: 'bert-base-uncased',
inputs: '[MASK] world!'
})
Summarization
Summarizes longer text into shorter text. Note that some models have a maximum input length.
await hf.summarization({
model: 'facebook/bart-large-cnn',
inputs:
'The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930.',
parameters: {
max_length: 100
}
})
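Because some models cap the input length, long documents are often split before being summarized piece by piece. The sketch below is one naive way to do that, assuming a character budget is an acceptable rough proxy for the model's token limit (it is not exact) and that splitting on `.`, `!` and `?` is good enough for your text. `splitIntoChunks` is a hypothetical helper, not a library function.

```typescript
// Hypothetical helper: greedily pack sentences into chunks of at most maxChars.
// A single sentence longer than maxChars is kept whole rather than cut mid-sentence.
function splitIntoChunks(text: string, maxChars: number): string[] {
  // Naive sentence split: a run of non-terminators followed by optional terminators.
  const sentences = (text.match(/[^.!?]+[.!?]*/g) || [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    const candidate = current.length === 0 ? sentence : current + " " + sentence;
    if (candidate.length <= maxChars || current.length === 0) {
      current = candidate; // still fits (or the chunk is empty, so we must accept it)
    } else {
      chunks.push(current); // close the full chunk and start a new one
      current = sentence;
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

const exampleChunks = splitIntoChunks("One two. Three four. Five six.", 20);
```

Each chunk could then be passed as `inputs` to `hf.summarization`, and the partial summaries joined or summarized again.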
Question Answering
Answers questions based on the context you provide.
await hf.questionAnswering({
model: 'deepset/roberta-base-squad2',
inputs: {
question: 'What is the capital of France?',
context: 'The capital of France is Paris.'
}
})
Table Question Answering
await hf.tableQuestionAnswering({
model: 'google/tapas-base-finetuned-wtq',
inputs: {
query: 'How many stars does the transformers repository have?',
table: {
Repository: ['Transformers', 'Datasets', 'Tokenizers'],
Stars: ['36542', '4512', '3934'],
Contributors: ['651', '77', '34'],
'Programming language': ['Python', 'Python', 'Rust, Python and NodeJS']
}
}
})
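The table in the example above is column-oriented: each key is a column name mapped to an array of string cells. If your data starts out as an array of row objects, a small conversion is needed. The sketch below shows one way to do it; `rowsToColumns` and `RowRecord` are hypothetical names, and it assumes that stringifying every value is acceptable for your model.

```typescript
type RowRecord = Record<string, string | number>;

// Hypothetical helper: turn row objects into the column-oriented shape used above.
// Missing cells become empty strings so that every column has the same length.
function rowsToColumns(rows: RowRecord[]): Record<string, string[]> {
  const columns: Record<string, string[]> = {};
  // First pass: collect every column name that appears in any row.
  for (const row of rows) {
    for (const key of Object.keys(row)) {
      if (!Object.prototype.hasOwnProperty.call(columns, key)) columns[key] = [];
    }
  }
  // Second pass: fill each column, one cell per row.
  for (const row of rows) {
    for (const key of Object.keys(columns)) {
      const cell = row[key];
      columns[key].push(cell === undefined ? "" : String(cell));
    }
  }
  return columns;
}

const exampleTable = rowsToColumns([
  { Repository: "Transformers", Stars: 36542 },
  { Repository: "Datasets", Stars: 4512 },
]);
```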
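Text Classification
Often used for sentiment analysis, this method assigns labels to the given text along with a probability score for each label.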
await hf.textClassification({
model: 'distilbert-base-uncased-finetuned-sst-2-english',
inputs: 'I like you. I love you.'
})
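Token Classification
Used for sentence parsing, either grammatical or Named Entity Recognition (NER), to understand the keywords contained in a text.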
await hf.tokenClassification({
model: 'dbmdz/bert-large-cased-finetuned-conll03-english',
inputs: 'My name is Sarah Jessica Parker but you can call me Jessica'
})
Translation
Converts text from one language to another.
await hf.translation({
model: 't5-base',
inputs: 'My name is Wolfgang and I live in Berlin'
})
await hf.translation({
model: 'facebook/mbart-large-50-many-to-many-mmt',
inputs: textToTranslate,
parameters: {
"src_lang": "en_XX",
"tgt_lang": "fr_XX"
}
})
Zero-Shot Classification
Checks how well an input text fits into a set of labels you provide.
await hf.zeroShotClassification({
model: 'facebook/bart-large-mnli',
inputs: [
'Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!'
],
parameters: { candidate_labels: ['refund', 'legal', 'faq'] }
})
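Conversational
This task corresponds to any chatbot-like structure. Models tend to have a shorter max_length, so check carefully whether the model you pick can handle long-range dependencies if you need them.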
await hf.conversational({
model: 'microsoft/DialoGPT-large',
inputs: {
past_user_inputs: ['Which movie is the best ?'],
generated_responses: ['It is Die Hard for sure.'],
text: 'Can you explain why ?'
}
})
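In the example above, `past_user_inputs` and `generated_responses` are parallel arrays, and `text` is the new user message. If you keep the conversation as a single list of turns, it has to be split into that shape. The sketch below shows one way; `Turn` and `toConversationalInputs` are hypothetical names, and it assumes the turns strictly alternate between user and assistant and end with a user message.

```typescript
interface Turn {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical helper: split a chat transcript into the input shape shown above.
function toConversationalInputs(turns: Turn[]) {
  // An odd number of turns means the transcript ends with a user message.
  if (turns.length % 2 === 0) {
    throw new Error("Transcript must be non-empty and end with a user message");
  }
  turns.forEach((turn, i) => {
    const expected = i % 2 === 0 ? "user" : "assistant";
    if (turn.role !== expected) {
      throw new Error("Turn " + i + " should come from the " + expected);
    }
  });

  const past_user_inputs: string[] = [];
  const generated_responses: string[] = [];
  // Walk the completed user/assistant pairs; the last turn is handled separately.
  for (let i = 0; i < turns.length - 1; i += 2) {
    past_user_inputs.push(turns[i].content);
    generated_responses.push(turns[i + 1].content);
  }
  return { past_user_inputs, generated_responses, text: turns[turns.length - 1].content };
}

const conversationalInputs = toConversationalInputs([
  { role: "user", content: "Which movie is the best ?" },
  { role: "assistant", content: "It is Die Hard for sure." },
  { role: "user", content: "Can you explain why ?" },
]);
```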
Sentence Similarity
Calculates the semantic similarity between one text and a list of other sentences.
await hf.sentenceSimilarity({
model: 'sentence-transformers/paraphrase-xlm-r-multilingual-v1',
inputs: {
source_sentence: 'That is a happy person',
sentences: [
'That is a happy dog',
'That is a very happy person',
'Today is a sunny day'
]
}
})
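A common next step is to pick the closest sentence. The sketch below assumes the call resolves to one numeric score per candidate sentence, in the same order as `sentences`. `rankBySimilarity` is a hypothetical helper and the scores are made-up values for illustration, not real model output.

```typescript
// Hypothetical helper: pair each candidate sentence with its score and sort best-first.
function rankBySimilarity(
  sentences: string[],
  scores: number[]
): { sentence: string; score: number }[] {
  if (sentences.length !== scores.length) {
    throw new Error("Expected exactly one score per sentence");
  }
  return sentences
    .map((sentence, i) => ({ sentence, score: scores[i] }))
    .sort((a, b) => b.score - a.score);
}

// Made-up scores for illustration only; real values come from the model.
const ranked = rankBySimilarity(
  ["That is a happy dog", "That is a very happy person", "Today is a sunny day"],
  [0.69, 0.94, 0.26]
);
```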
Audio
Automatic Speech Recognition
Transcribes speech from an audio file.
import { readFileSync } from 'fs' // Node.js only; the file-based examples below rely on it too
await hf.automaticSpeechRecognition({
model: 'facebook/wav2vec2-large-960h-lv60-self',
data: readFileSync('test/sample1.flac')
})
Audio Classification
Assigns labels to the given audio along with a probability score for each label.
await hf.audioClassification({
model: 'superb/hubert-large-superb-er',
data: readFileSync('test/sample1.flac')
})
Text To Speech
Generates natural-sounding speech from text input.
await hf.textToSpeech({
model: 'espnet/kan-bayashi_ljspeech_vits',
inputs: 'Hello world!'
})
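Audio To Audio
Outputs one or more generated audio files from an input audio, commonly used for speech enhancement and source separation.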
await hf.audioToAudio({
model: 'speechbrain/sepformer-wham',
data: readFileSync('test/sample1.flac')
})
Computer Vision
Image Classification
Assigns labels to a given image along with a probability score for each label.
await hf.imageClassification({
data: readFileSync('test/cheetah.png'),
model: 'google/vit-base-patch16-224'
})
Object Detection
Detects objects within an image and returns labels with corresponding bounding boxes and probability scores.
await hf.objectDetection({
data: readFileSync('test/cats.png'),
model: 'facebook/detr-resnet-50'
})
Image Segmentation
Detects segments within an image and returns labels with corresponding masks and probability scores.
await hf.imageSegmentation({
data: readFileSync('test/cats.png'),
model: 'facebook/detr-resnet-50-panoptic'
})
Image To Text
Outputs text from a given image, commonly used for captioning or optical character recognition.
await hf.imageToText({
data: readFileSync('test/cats.png'),
model: 'nlpconnect/vit-gpt2-image-captioning'
})
Text To Image
Creates an image from a text prompt.
await hf.textToImage({
inputs: 'award winning high resolution photo of a giant tortoise/((ladybird)) hybrid, [trending on artstation]',
model: 'stabilityai/stable-diffusion-2',
parameters: {
negative_prompt: 'blurry',
}
})
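Image To Image
Image-to-image is the task of transforming a source image to match the characteristics of a target image or a target image domain.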
await hf.imageToImage({
inputs: new Blob([readFileSync("test/stormtrooper_depth.png")]),
parameters: {
prompt: "elmo's lecture",
},
model: "lllyasviel/sd-controlnet-depth",
});
Zero-Shot Image Classification
Checks how well an input image fits into a set of labels you provide.
await hf.zeroShotImageClassification({
model: 'openai/clip-vit-large-patch14-336',
inputs: {
image: await (await fetch('https://placekitten.com/300/300')).blob()
},
parameters: {
candidate_labels: ['cat', 'dog']
}
})
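Multimodal
Feature Extraction
This task reads some text and outputs raw float values, which are usually consumed as part of a semantic database or semantic search.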
await hf.featureExtraction({
model: "sentence-transformers/distilbert-base-nli-mean-tokens",
inputs: "That is a happy person",
});
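For semantic search, the returned vectors are typically compared with cosine similarity. The sketch below assumes each text yields one flat array of numbers; depending on the model, the output may be nested, in which case it would need pooling or flattening first. `cosineSimilarity` is a hypothetical helper and the vectors are toy values, not real embeddings.

```typescript
// Hypothetical helper: cosine similarity between two embedding vectors.
// Returns a value in [-1, 1]; 1 means the vectors point in the same direction.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors for illustration; real embeddings have hundreds of dimensions.
const sameDirection = cosineSimilarity([1, 2, 3], [2, 4, 6]);
const orthogonal = cosineSimilarity([1, 0], [0, 1]);
```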
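Visual Question Answering
Visual Question Answering is the task of answering open-ended questions based on an image. These models output natural language responses to natural language questions.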
await hf.visualQuestionAnswering({
model: 'dandelin/vilt-b32-finetuned-vqa',
inputs: {
question: 'How many cats are lying down?',
image: await (await fetch('https://placekitten.com/300/300')).blob()
}
})
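Document Question Answering
Document question answering models take a (document, question) pair as input and return an answer in natural language.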
await hf.documentQuestionAnswering({
model: 'impira/layoutlm-document-qa',
inputs: {
question: 'Invoice number?',
image: await (await fetch('https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png')).blob(),
}
})
Tabular
Tabular Regression
Tabular regression is the task of predicting a numerical value given a set of attributes.
await hf.tabularRegression({
model: "scikit-learn/Fish-Weight",
inputs: {
data: {
"Height": ["11.52", "12.48", "12.3778"],
"Length1": ["23.2", "24", "23.9"],
"Length2": ["25.4", "26.3", "26.5"],
"Length3": ["30", "31.2", "31.1"],
"Species": ["Bream", "Bream", "Bream"],
"Width": ["4.02", "4.3056", "4.6961"]
},
},
})
Tabular Classification
Tabular classification is the task of classifying a target category (a group) based on a set of attributes.
await hf.tabularClassification({
model: "vvmnnnkv/wine-quality",
inputs: {
data: {
"fixed_acidity": ["7.4", "7.8", "10.3"],
"volatile_acidity": ["0.7", "0.88", "0.32"],
"citric_acid": ["0", "0", "0.45"],
"residual_sugar": ["1.9", "2.6", "6.4"],
"chlorides": ["0.076", "0.098", "0.073"],
"free_sulfur_dioxide": ["11", "25", "5"],
"total_sulfur_dioxide": ["34", "67", "13"],
"density": ["0.9978", "0.9968", "0.9976"],
"pH": ["3.51", "3.2", "3.23"],
"sulphates": ["0.56", "0.68", "0.82"],
"alcohol": ["9.4", "9.8", "12.6"]
},
},
})
Custom Calls
For models with custom parameters / outputs.
await hf.request({
model: 'my-custom-model',
inputs: 'hello world',
parameters: {
custom_param: 'some magic',
}
})
// Custom streaming call, for models with custom parameters / outputs
for await (const output of hf.streamingRequest({
model: 'my-custom-model',
inputs: 'hello world',
parameters: {
custom_param: 'some magic',
}
})) {
...
}
You can use any Chat Completion API-compatible provider with the chatCompletion method.
// Chat Completion Example
const MISTRAL_KEY = process.env.MISTRAL_KEY;
const hf = new HfInference(MISTRAL_KEY);
const ep = hf.endpoint("https://api.mistral.ai");
const stream = ep.chatCompletionStream({
model: "mistral-tiny",
messages: [{ role: "user", content: "Complete the equation one + one = , just the answer" }],
});
let out = "";
for await (const chunk of stream) {
if (chunk.choices && chunk.choices.length > 0) {
out += chunk.choices[0].delta.content;
console.log(out);
}
}
Custom Inference Endpoints
Learn more about using your own Inference Endpoints in the Hugging Face documentation.
const gpt2 = hf.endpoint('https://xyz.eu-west-1.aws.endpoints.huggingface.cloud/gpt2');
const { generated_text } = await gpt2.textGeneration({inputs: 'The answer to the universe is'});
// Chat Completion Example
const ep = hf.endpoint(
"https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.2"
);
const stream = ep.chatCompletionStream({
model: "tgi",
messages: [{ role: "user", content: "Complete the equation 1+1= ,just the answer" }],
max_tokens: 500,
temperature: 0.1,
seed: 0,
});
let out = "";
for await (const chunk of stream) {
if (chunk.choices && chunk.choices.length > 0) {
out += chunk.choices[0].delta.content;
console.log(out);
}
}
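By default, all calls to the Inference Endpoint wait until the model is loaded. When scale-to-zero is enabled on the endpoint, this can result in a non-trivial waiting time. If you would rather disable this behavior and handle the endpoint's 500 HTTP errors yourself, you can do so like this: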
const gpt2 = hf.endpoint('https://xyz.eu-west-1.aws.endpoints.huggingface.cloud/gpt2');
const { generated_text } = await gpt2.textGeneration(
{inputs: 'The answer to the universe is'},
{retry_on_error: false},
);
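Once automatic waiting is disabled, it is up to you to decide when to try again. One common approach is exponential backoff, sketched below. `backoffDelays` and `withRetry` are hypothetical helpers, not library functions, and the delay values are arbitrary. The wrapper retries on any thrown error; a real application would probably inspect the error before retrying.

```typescript
// Hypothetical helper: delay in ms before each retry, doubling from baseMs and capped at capMs.
function backoffDelays(retries: number, baseMs: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < retries; i++) {
    delays.push(Math.min(capMs, baseMs * Math.pow(2, i)));
  }
  return delays;
}

// Hypothetical wrapper: run `call`; on failure wait for the next delay and try again.
async function withRetry<T>(call: () => Promise<T>, delays: number[]): Promise<T> {
  for (const delay of delays) {
    try {
      return await call();
    } catch {
      // Assumption: every error is worth retrying. Narrow this in real code.
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
  return call(); // final attempt: any error now propagates to the caller
}

const exampleDelays = backoffDelays(5, 1000, 10000);
```

With the endpoint from the example above, usage could look like `await withRetry(() => gpt2.textGeneration({inputs: 'The answer to the universe is'}, {retry_on_error: false}), exampleDelays)`.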
Running tests
HF_TOKEN="your access token" pnpm run test
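Finding appropriate models
We have an informative documentation project called Tasks that lists the available models for each task and explains in detail how each task works.
It also contains demos, example outputs, and other resources should you want to dig deeper into the machine learning (ML) side of things.
Dependencies
@huggingface/tasks: Typings only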