
Sentence Transformers 🤗

SentenceTransformers 🤗 is a Python framework for state-of-the-art sentence, text, and image embeddings. It can be used to compute embeddings with Sentence Transformer models or similarity scores with Cross-Encoder (a.k.a. reranker) models. This unlocks a wide range of applications, including semantic search, semantic textual similarity, and paraphrase mining. Optimum Neuron provides APIs that ease the use of SentenceTransformers on AWS Neuron devices.
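
For context, this is the plain SentenceTransformers workflow on CPU/GPU before any Neuron export comes into play; a minimal sketch, assuming the sentence-transformers package is installed and using the same BAAI/bge-large-en-v1.5 checkpoint as the examples below:

from sentence_transformers import SentenceTransformer, util

# Compute embeddings with plain SentenceTransformers (no Neuron involved yet)
model = SentenceTransformer("BAAI/bge-large-en-v1.5")
embeddings = model.encode(["The weather is lovely today.", "It's so sunny outside!"])

# Cosine similarity between the two sentence embeddings
score = util.cos_sim(embeddings[0], embeddings[1])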

Export to Neuron

Option 1: CLI

  • Example - Text embeddings
optimum-cli export neuron -m BAAI/bge-large-en-v1.5 --sequence_length 384 --batch_size 1 --task feature-extraction bge_emb_neuron/
  • Example - Image search
optimum-cli export neuron -m sentence-transformers/clip-ViT-B-32 --sequence_length 64 --text_batch_size 3 --image_batch_size 1 --num_channels 3 --height 224 --width 224 --task feature-extraction --subfolder 0_CLIPModel clip_emb_neuron/
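
A model exported with the CLI can then be reloaded from its output directory with NeuronModelForSentenceTransformers; a minimal sketch, reusing the text-embedding export above:

from optimum.neuron import NeuronModelForSentenceTransformers

# Load the compiled artifacts produced by the CLI export above
model = NeuronModelForSentenceTransformers.from_pretrained("bge_emb_neuron/")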

Option 2: Python API

  • Example - Text embeddings
from optimum.neuron import NeuronModelForSentenceTransformers

# Configs for compiling the model
input_shapes = {
    "batch_size": 1,
    "sequence_length": 384,
}
compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}

neuron_model = NeuronModelForSentenceTransformers.from_pretrained(
    "BAAI/bge-large-en-v1.5", 
    export=True, 
    **input_shapes,
    **compiler_args,
)

# Save locally
neuron_model.save_pretrained("bge_emb_neuron/")

# Upload to the HuggingFace Hub
neuron_model.push_to_hub(
    "bge_emb_neuron/", repository_id="optimum/bge-base-en-v1.5-neuronx"  # Replace with your HF Hub repo id
)
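
The freshly exported neuron_model can also be used for inference directly; a minimal sketch, assuming a tokenizer loaded from the original checkpoint:

from transformers import AutoTokenizer

# Tokenize with the original checkpoint's tokenizer and run on Neuron
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5")
inputs = tokenizer("Sentence Transformers on AWS Neuron devices.", return_tensors="pt")
outputs = neuron_model(**inputs)
sentence_embedding = outputs.sentence_embedding  # pooled sentence embedding
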
  • Example - Image search
from optimum.neuron import NeuronModelForSentenceTransformers

# Configs for compiling the model
input_shapes = {
    "num_channels": 3,
    "height": 224,
    "width": 224,
    "text_batch_size": 3,
    "image_batch_size": 1,
    "sequence_length": 64,
}
compiler_args = {"auto_cast": "matmul", "auto_cast_type": "bf16"}

neuron_model = NeuronModelForSentenceTransformers.from_pretrained(
    "sentence-transformers/clip-ViT-B-32", 
    subfolder="0_CLIPModel", 
    export=True, 
    dynamic_batch_size=False, 
    **input_shapes,
    **compiler_args,
)

# Save locally
neuron_model.save_pretrained("clip_emb_neuron/")

# Upload to the HuggingFace Hub
neuron_model.push_to_hub(
    "clip_emb_neuron/", repository_id="optimum/clip_vit_emb_neuronx"  # Replace with your HF Hub repo id
)

NeuronModelForSentenceTransformers

class optimum.neuron.NeuronModelForSentenceTransformers

( model: ScriptModule config: PretrainedConfig model_save_dir: str | pathlib.Path | tempfile.TemporaryDirectory | None = None model_file_name: str | None = None preprocessors: list | None = None neuron_config: NeuronDefaultConfig | None = None **kwargs )

Parameters

  • config (transformers.PretrainedConfig) — PretrainedConfig is the model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Check out the optimum.neuron.modeling.NeuronTracedModel.from_pretrained method to load the model weights.
  • model (torch.jit._script.ScriptModule) — torch.jit._script.ScriptModule is the TorchScript module with the embedded NEFF (Neuron Executable File Format) compiled by the neuron(x) compiler.

Neuron Model for Sentence Transformers.

This model inherits from ~neuron.modeling.NeuronTracedModel. Check the superclass documentation for the generic methods the library implements for all its models (such as downloading or saving).

Sentence Transformers model on Neuron devices.

forward

( input_ids: Tensor attention_mask: Tensor pixel_values: torch.Tensor | None = None token_type_ids: torch.Tensor | None = None **kwargs )

Parameters

  • input_ids (torch.Tensor of shape (batch_size, sequence_length)) — Indices of input sequence tokens in the vocabulary. Indices can be obtained using AutoTokenizer. See PreTrainedTokenizer.encode and PreTrainedTokenizer.__call__ for details. What are input IDs?
  • attention_mask (torch.Tensor | None of shape (batch_size, sequence_length), defaults to None) — Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]: 1 for tokens that are not masked, 0 for tokens that are masked. What are attention masks?
  • token_type_ids (torch.Tensor | None of shape (batch_size, sequence_length), defaults to None) — Segment token indices to indicate first and second portions of the inputs. Indices are selected in [0, 1]: 0 corresponds to a sentence A token, 1 corresponds to a sentence B token. What are token type IDs?

The NeuronModelForSentenceTransformers forward method overrides the __call__ special method. It accepts only the inputs traced during the compilation step; any additional inputs provided at inference time will be ignored. To include extra inputs, recompile the model with those inputs.

Text Example

>>> from transformers import AutoTokenizer
>>> from optimum.neuron import NeuronModelForSentenceTransformers

>>> tokenizer = AutoTokenizer.from_pretrained("optimum/bge-base-en-v1.5-neuronx")
>>> model = NeuronModelForSentenceTransformers.from_pretrained("optimum/bge-base-en-v1.5-neuronx")

>>> inputs = tokenizer("In the smouldering promise of the fall of Troy, a mythical world of gods and mortals rises from the ashes.", return_tensors="pt")

>>> outputs = model(**inputs)
>>> token_embeddings = outputs.token_embeddings
>>> sentence_embedding = outputs.sentence_embedding
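
The resulting embeddings can be compared directly; a minimal follow-up sketch, reusing model, tokenizer, and sentence_embedding from above with an illustrative second sentence:

>>> from sentence_transformers import util

>>> query = tokenizer("A legendary tale of gods and mortals in ancient Troy.", return_tensors="pt")
>>> query_embedding = model(**query).sentence_embedding
>>> score = util.cos_sim(sentence_embedding, query_embedding)  # cosine similarity of the two embeddings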

Image Example

>>> from PIL import Image
>>> from transformers import AutoProcessor
>>> from sentence_transformers import util
>>> from optimum.neuron import NeuronModelForSentenceTransformers

>>> processor = AutoProcessor.from_pretrained("optimum/clip_vit_emb_neuronx")
>>> model = NeuronModelForSentenceTransformers.from_pretrained("optimum/clip_vit_emb_neuronx")
>>> util.http_get("https://github.com/UKPLab/sentence-transformers/raw/master/examples/sentence_transformer/applications/image-search/two_dogs_in_snow.jpg", "two_dogs_in_snow.jpg")
>>> inputs = processor(
...     text=["Two dogs in the snow", "A cat on a table", "A picture of London at night"], images=Image.open("two_dogs_in_snow.jpg"), return_tensors="pt", padding=True
... )

>>> outputs = model(**inputs)
>>> cos_scores = util.cos_sim(outputs.image_embeds, outputs.text_embeds)  # Compute cosine similarities
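
The caption that best matches the image can then be read off the similarity matrix; a minimal follow-up sketch:

>>> texts = ["Two dogs in the snow", "A cat on a table", "A picture of London at night"]
>>> best_idx = int(cos_scores.argmax())  # cos_scores has shape (num_images, num_texts)
>>> print(texts[best_idx], float(cos_scores[0][best_idx]))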