Transformers documentation

Pipelines

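Pipelines are a simple way to use models for inference. They are objects that abstract away most of the library's complex code and offer a simple API dedicated to several tasks, including named entity recognition, masked language modeling, sentiment analysis, feature extraction, and question answering. See the task summary for usage examples.

There are two categories of pipeline abstractions to be aware of:

pipeline(), the most powerful object, which wraps all the other pipelines.
Task-specific pipelines, available for audio, computer vision, natural language processing, and multimodal tasks.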

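The pipeline abstraction

The pipeline abstraction is a wrapper around all the other available pipelines. It is instantiated like any other pipeline but provides additional conveniences.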

A simple call on a single item:

>>> from transformers import pipeline
>>> pipe = pipeline("text-classification")
>>> pipe("This restaurant is awesome")
[{'label': 'POSITIVE', 'score': 0.9998743534088135}]

If you want to use a specific model from the Hub, you can omit the task when the model on the Hub already defines it:

>>> pipe = pipeline(model="FacebookAI/roberta-large-mnli")
>>> pipe("This restaurant is awesome")
[{'label': 'NEUTRAL', 'score': 0.7313136458396912}]

To call a pipeline on multiple items, pass them as a list.

>>> pipe = pipeline("text-classification")
>>> pipe(["This restaurant is awesome", "This restaurant is awful"])
[{'label': 'POSITIVE', 'score': 0.9998743534088135},
 {'label': 'NEGATIVE', 'score': 0.9996669292449951}]
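
Because the pipeline returns one result per input, in input order, results can be paired back with the original texts. The following is a minimal sketch that reuses the output printed above as literal data instead of calling a model; the `threshold` value and the variable names are illustrative assumptions, not part of the library.

```python
# Inputs and the predictions shown above, copied as literal data so the
# sketch runs without downloading a model.
texts = ["This restaurant is awesome", "This restaurant is awful"]
predictions = [
    {"label": "POSITIVE", "score": 0.9998743534088135},
    {"label": "NEGATIVE", "score": 0.9996669292449951},
]

# Hypothetical cutoff: keep only predictions with a high score.
threshold = 0.99

# One result per input, in input order, so zip() pairs them up.
confident = [
    (text, pred["label"])
    for text, pred in zip(texts, predictions)
    if pred["score"] >= threshold
]

for text, label in confident:
    print(f"{label:8s} {text}")
```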

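To iterate over a full dataset, it is recommended to pass the dataset directly. This way you do not need to load the whole dataset into memory at once, and you do not need to handle batching yourself. This should run as fast as a custom loop on a GPU. If it does not, feel free to open an issue.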

import datasets
from transformers import pipeline
from transformers.pipelines.pt_utils import KeyDataset
from tqdm.auto import tqdm

pipe = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h", device=0)
dataset = datasets.load_dataset("superb", name="asr", split="test")

# KeyDataset (only *pt*) will simply return the item in the dict returned by the dataset item
# as we're not interested in the *target* part of the dataset. For sentence pair use KeyPairDataset
for out in tqdm(pipe(KeyDataset(dataset, "file"))):
    print(out)
    # {"text": "NUMBER TEN FRESH NELLY IS WAITING ON YOU GOOD NIGHT HUSBAND"}
    # {"text": ....}
    # ....
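
To make the role of KeyDataset in the example above concrete, here is a rough sketch of the idea in plain Python: a thin view that returns a single field from each record. The class name `KeyOnlyView` and the toy records are hypothetical and only illustrate the concept; this is not the library's implementation.

```python
class KeyOnlyView:
    """Illustrative stand-in: exposes one field of each record in a dataset."""

    def __init__(self, dataset, key):
        self.dataset = dataset
        self.key = key

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, index):
        # Look up the full record, then return only the requested field.
        return self.dataset[index][self.key]


# Toy records shaped like a speech dataset; the paths and texts are made up.
records = [
    {"file": "audio/0001.flac", "text": "HELLO"},
    {"file": "audio/0002.flac", "text": "WORLD"},
]

files = KeyOnlyView(records, "file")
print(len(files))  # number of records in the view
print(files[1])    # only the "file" field of the second record
```

The consumer sees only the field it needs (here the audio path), while the target field stays out of the way.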

For convenience, a generator can also be used:

from transformers import pipeline

pipe = pipeline("text-classification")


def data():
    while True:
        # This could come from a dataset, a database, a queue, or an HTTP request
        # in a server.
        # Caveat: because this is iterative, you cannot set `num_workers > 1`
        # to preprocess data with multiple threads. You can still have one thread
        # that does the preprocessing while the main thread runs the inference.
        yield "This is a test"


for out in pipe(data()):
    print(out)
    # e.g. {"label": ..., "score": ...}
    # {"label": ...., "score": ....}
    # ....
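
Note that the `data()` generator above never stops, so the loop over its results runs until interrupted. One way to bound it is sketched below, assuming a fixed number of items is wanted. `limited` and `fake_pipe` are hypothetical helpers: `fake_pipe` stands in for a pipeline so that the sketch runs without loading a model.

```python
from itertools import islice


def data():
    # Same shape as the generator above: yields inputs forever.
    while True:
        yield "This is a test"


def limited(source, n):
    # Hypothetical helper: a generator function that stops after n items,
    # so the result is still an ordinary Python generator.
    yield from islice(source, n)


def fake_pipe(inputs):
    # Hypothetical stand-in for a pipeline: consumes inputs lazily and
    # yields one result per input.
    for text in inputs:
        yield {"input": text, "length": len(text)}


# Only three items are ever pulled from data(), so the loop terminates.
results = list(fake_pipe(limited(data(), 3)))
print(results)
```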
