Pipelines

Pipelines are an easy way to use models for inference. They are objects that abstract away most of the library's complex code and offer a simple API dedicated to several tasks, including Named Entity Recognition, Masked Language Modeling, Sentiment Analysis, Feature Extraction, and Question Answering. See the task summary for examples of use.

There are two categories of pipeline abstractions to be aware of:

pipeline(), the most powerful object, which encapsulates all the other pipelines.
Task-specific pipelines, which are available for audio, computer vision, natural language processing, and multimodal tasks.

The pipeline abstraction

The pipeline abstraction is a wrapper around all the other available pipelines. It is instantiated like any other pipeline but adds quality-of-life conveniences, such as inferring the task from the model and accepting lists, datasets, or generators as input.

Simple call on one item:

>>> from transformers import pipeline
>>> pipe = pipeline("text-classification")
>>> pipe("This restaurant is awesome")
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]

If you want to use a specific model from the hub, you can omit the task if the model on the hub already defines it:

>>> pipe = pipeline(model="FacebookAI/roberta-large-mnli")
>>> pipe("This restaurant is awesome")
[{'label': 'NEUTRAL', 'score': 0.7313136458396912}]

To run a pipeline on many items, pass it a list:

>>> pipe = pipeline("text-classification")
>>> pipe(["This restaurant is awesome", "This restaurant is awful"])
[{'label': 'POSITIVE', 'score': 0.9998743534088135},
 {'label': 'NEGATIVE', 'score': 0.9996669292449951}]
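
Because a list call returns one result per input, in the same order, it can be convenient to pair each text with its prediction. The following is a minimal sketch: the `results` list is copied from the output above rather than produced by a live model, and `pair_with_inputs` is a hypothetical helper, not part of the library.

```python
from typing import Dict, List, Tuple


def pair_with_inputs(texts: List[str], results: List[Dict]) -> List[Tuple[str, str, float]]:
    """Hypothetical helper: zip each input text with its label and score."""
    if len(texts) != len(results):
        raise ValueError("expected exactly one result per input text")
    return [(text, res["label"], res["score"]) for text, res in zip(texts, results)]


texts = ["This restaurant is awesome", "This restaurant is awful"]

# Copied from the example output above; in practice this would be `pipe(texts)`.
results = [
    {"label": "POSITIVE", "score": 0.9998743534088135},
    {"label": "NEGATIVE", "score": 0.9996669292449951},
]

paired = pair_with_inputs(texts, results)
for text, label, score in paired:
    print(f"{label:<8} {score:.4f}  {text}")
```

The length check is a defensive assumption of this sketch; it simply guards against pairing texts with the wrong results.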

To iterate over a full dataset, it is recommended to pass the dataset directly to the pipeline. This way you don't need to load the whole dataset into memory at once, nor do you need to do the batching yourself. This should run as fast as a custom loop on GPU. If it doesn't, don't hesitate to create an issue.

import datasets
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset
from tqdm.auto import tqdm

pipe = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
dataset = datasets.load_dataset("superb", name="asr", split="test")

# KeyDataset (only *pt*) will simply return the item in the dict returned by the dataset item
# as we're not interested in the *target* part of the dataset. For sentence pair use KeyPairDataset
for out in tqdm(pipe(KeyDataset(dataset, "file"))):
    print(out)
    # {"text": "NUMBER TEN FRESH NELLY IS WAITING ON YOU GOOD NIGHT HUSBAND"}
    # {"text": ....}
    # ....
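
To make the role of `KeyDataset` more concrete, here is a simplified, pure-Python sketch of the idea: a thin wrapper that exposes a single field of each dataset item. `SingleKeyView` and the toy `rows` list are hypothetical names used for illustration only; the real class lives in `transformers.pipelines.pt_utils` and may differ in detail.

```python
from typing import Any, Sequence


class SingleKeyView:
    """Hypothetical stand-in illustrating what KeyDataset does conceptually."""

    def __init__(self, dataset: Sequence[dict], key: str):
        self.dataset = dataset
        self.key = key

    def __len__(self) -> int:
        return len(self.dataset)

    def __getitem__(self, index: int) -> Any:
        # Return only the requested field and drop the rest (e.g. the target).
        return self.dataset[index][self.key]


# Toy rows standing in for dataset items; the field values are made up.
rows = [
    {"file": "a.flac", "text": "HELLO"},
    {"file": "b.flac", "text": "WORLD"},
]

view = SingleKeyView(rows, "file")
files = [view[i] for i in range(len(view))]
print(files)
```

Items are looked up one at a time, so nothing beyond the wrapped dataset needs to be held in memory.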

For ease of use, you can also pass a generator:

from transformers import pipeline

pipe = pipeline("text-classification")


def data():
    while True:
        # This could come from a dataset, a database, a queue or HTTP request
        # in a server
        # Caveat: because this is iterative, you cannot use `num_workers > 1` variable
        # to use multiple threads to preprocess data. You can still have 1 thread that
        # does the preprocessing while the main runs the big inference
        yield "This is a test"


for out in pipe(data()):
    print(out)
    # {"label": ..., "score": ...}
    # {"label": ..., "score": ...}
    # ....
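
The generator above never terminates, so the loop runs until it is interrupted. A minimal sketch of a finite variant follows. The `limit` parameter and the `fake_pipe` stand-in are hypothetical and exist only so the example runs without downloading a model; with a real pipeline, `pipe(data(3))` would take the place of `fake_pipe(data(3))`.

```python
from typing import Iterable, Iterator


def data(limit: int) -> Iterator[str]:
    """Finite variant of the generator: stops after `limit` items."""
    for _ in range(limit):
        # This could come from a dataset, a database, a queue or an HTTP request.
        yield "This is a test"


def fake_pipe(inputs: Iterable[str]) -> Iterator[dict]:
    """Hypothetical stand-in for a pipeline: lazily yields one result per input."""
    for text in inputs:
        yield {"input": text, "n_chars": len(text)}


outputs = list(fake_pipe(data(3)))
for out in outputs:
    print(out)
```

Keeping the bound inside the generator function means the object handed to the pipeline is still a plain generator.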
