
Text Classification

Text Classification is the task of assigning a label or class to a given text. Some use cases are sentiment analysis, natural language inference, and assessing grammatical correctness.

Example

Input: I love Hugging Face!

Output of a text classification model:
    POSITIVE: 0.900
    NEUTRAL: 0.100
    NEGATIVE: 0.000
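The per-class scores in the example above sum to 1. A minimal sketch of how such scores can arise, assuming the model emits one raw logit per class and normalizes them with a softmax (common for single-label classifiers, but not confirmed here for any particular model). The label order and logit values are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical label order, for illustration only.
LABELS = ["POSITIVE", "NEUTRAL", "NEGATIVE"]


def softmax(logits):
    """Convert raw per-class logits into probabilities that sum to 1."""
    # Subtract the largest logit first for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, labels=LABELS):
    """Return (label, score) pairs sorted from most to least likely."""
    scores = softmax(logits)
    return sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical logits; a real model would produce these from the input text.
ranked = rank_labels([4.0, 1.8, -3.0])
for label, score in ranked:
    print(f"{label}: {score:.3f}")
```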

About Text Classification

Use Cases

Sentiment Analysis on Customer Reviews

You can track customer sentiment in product reviews using sentiment analysis models. Grouping reviews by sentiment helps you understand churn and retention: you can then analyze the text within each group and base strategic decisions on what you find.
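The grouping step described above could be sketched as follows. This is a minimal illustration, not a production workflow: `group_by_sentiment` and `keyword_classifier` are hypothetical names, and the keyword-based classifier is only a stand-in so the sketch runs without downloading a model. It assumes the classifier returns one `{"label": ..., "score": ...}` dict per input, which is the shape a Transformers text-classification pipeline returns for a list of strings.

```python
from collections import defaultdict


def group_by_sentiment(reviews, classifier):
    """Group review texts by the label the classifier predicts for each one.

    `classifier` is any callable that maps a list of strings to a list of
    {"label": ..., "score": ...} dicts, one per input.
    """
    groups = defaultdict(list)
    for review, prediction in zip(reviews, classifier(reviews)):
        groups[prediction["label"]].append(review)
    return dict(groups)


def keyword_classifier(texts):
    """Hypothetical stand-in for a real model, based on a few keywords."""
    negative_words = {"broke", "refund", "slow"}
    results = []
    for text in texts:
        words = set(text.lower().replace(".", "").split())
        label = "NEGATIVE" if words & negative_words else "POSITIVE"
        results.append({"label": label, "score": 1.0})
    return results


reviews = [
    "Arrived quickly and works great.",
    "It broke after two days.",
    "Support was slow and I want a refund.",
]
grouped = group_by_sentiment(reviews, keyword_classifier)
```

To use a real model, you would pass a `pipeline("sentiment-analysis")` object in place of `keyword_classifier`.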

Task Variants

Natural Language Inference (NLI)

In NLI the model determines the relationship between two given texts. Concretely, the model takes a premise and a hypothesis and returns a class that can either be:

  • entailment, which means the hypothesis is true given the premise.
  • contradiction, which means the hypothesis is false given the premise.
  • neutral, which means the premise neither supports nor contradicts the hypothesis.

A common benchmark for this task is GLUE (General Language Understanding Evaluation), which includes several NLI variants such as Multi-Genre NLI, Question NLI and Winograd NLI.

Multi-Genre NLI (MNLI)

MNLI is used for general NLI. Here are some examples:

Example 1:
    Premise: A man inspects the uniform of a figure in some East Asian country.
    Hypothesis: The man is sleeping.
    Label: Contradiction

Example 2:
    Premise: Soccer game with multiple males playing.
    Hypothesis: Some men are playing a sport.
    Label: Entailment

Inference

You can use the 🤗 Transformers library's text-classification pipeline to run inference with NLI models.

from transformers import pipeline

classifier = pipeline("text-classification", model = "roberta-large-mnli")
classifier("A soccer game with multiple males playing. Some men are playing a sport.")
## [{'label': 'ENTAILMENT', 'score': 0.98}]
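The example above joins the premise and hypothesis into one string. The text-classification pipeline also accepts a dictionary with `text` and `text_pair` keys, which keeps the two sentences separate for the tokenizer. The following is a minimal sketch of preparing such inputs; `to_pair_inputs` and `classify_pairs` are hypothetical helper names, and the pipeline call itself is left as a comment so the sketch runs without downloading a model.

```python
def to_pair_inputs(pairs):
    """Turn (premise, hypothesis) tuples into {"text", "text_pair"} dicts."""
    return [
        {"text": premise, "text_pair": hypothesis}
        for premise, hypothesis in pairs
    ]


def classify_pairs(classifier, pairs):
    """Run a pipeline-like callable over premise/hypothesis pairs."""
    return classifier(to_pair_inputs(pairs))


pairs = [
    ("Soccer game with multiple males playing.",
     "Some men are playing a sport."),
    ("A man inspects the uniform of a figure in some East Asian country.",
     "The man is sleeping."),
]
inputs = to_pair_inputs(pairs)

# With Transformers installed, usage would look like this:
#   from transformers import pipeline
#   classifier = pipeline("text-classification", model="roberta-large-mnli")
#   classify_pairs(classifier, pairs)
```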

Question Natural Language Inference (QNLI)

QNLI is the task of determining whether the answer to a given question can be found in a given document. If the answer can be found, the label is “entailment”. If it cannot be found, the label is “not entailment”.

Question: What percentage of marine life died during the extinction?
Sentence: It is also known as the “Great Dying” because it is considered the largest mass extinction in the Earth’s history.
Label: not entailment

Question: Who was the London Weekend Television’s Managing Director?
Sentence: The managing director of London Weekend Television (LWT), Greg Dyke, met with the representatives of the "big five" football clubs in England in 1990.
Label: entailment

Inference

You can use the 🤗 Transformers library's text-classification pipeline to run inference with QNLI models. The model returns the label and a confidence score.

from transformers import pipeline

classifier = pipeline("text-classification", model = "cross-encoder/qnli-electra-base")
classifier("Where is the capital of France?, Paris is the capital of France.")
## [{'label': 'entailment', 'score': 0.997}]

Sentiment Analysis

In Sentiment Analysis, the classes can be polarities like positive, negative, neutral, or sentiments such as happiness or anger.

Inference

You can use the 🤗 Transformers library's sentiment-analysis pipeline to run inference with Sentiment Analysis models. The model returns the label and its score.

from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I loved Star Wars so much!")
## [{'label': 'POSITIVE', 'score': 0.99}]

Quora Question Pairs

Quora Question Pairs models assess whether two provided questions are paraphrases of each other. The model takes two questions and returns a binary value, with 0 mapped to “not paraphrase” and 1 to “paraphrase”. The benchmark dataset is Quora Question Pairs (QQP), part of the GLUE benchmark. The dataset consists of question pairs and their labels.

Question1: “How can I increase the speed of my internet connection while using a VPN?”
Question2: “How can Internet speed be increased by hacking through DNS?”
Label: Not paraphrase

Question1: “What can make Physics easy to learn?”
Question2: “How can you make physics easy to learn?”
Label: Paraphrase
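The mapping from the binary output to a readable label can be sketched as below. This assumes the model produces one logit per class and that the predicted class is the one with the largest logit. The `decode_prediction` helper and the logit values are hypothetical, used only to illustrate the 0/1 mapping described above.

```python
# Mapping stated in the text: 0 -> not paraphrase, 1 -> paraphrase.
ID2LABEL = {0: "not paraphrase", 1: "paraphrase"}


def decode_prediction(logits):
    """Pick the class with the largest logit and return its readable label."""
    class_id = max(range(len(logits)), key=lambda i: logits[i])
    return ID2LABEL[class_id]


# Hypothetical logits for two question pairs, for illustration only.
first = decode_prediction([2.1, -1.3])
second = decode_prediction([-0.7, 3.4])
```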

Inference

You can use the 🤗 Transformers library's text-classification pipeline to run inference with QQP models.

from transformers import pipeline

classifier = pipeline("text-classification", model = "textattack/bert-base-uncased-QQP")
classifier("Which city is the capital of France?, Where is the capital of France?")
## [{'label': 'paraphrase', 'score': 0.998}]

You can use huggingface.js to run inference with text classification models on the Hugging Face Hub.

import { HfInference } from "@huggingface/inference";

// HF_TOKEN is assumed to hold your Hugging Face access token.
const inference = new HfInference(HF_TOKEN);
await inference.textClassification({
    model: "distilbert-base-uncased-finetuned-sst-2-english",
    inputs: "I love this movie!",
});

Grammatical Correctness

Linguistic Acceptability is the task of assessing the grammatical acceptability of a sentence. The classes in this task are “acceptable” and “unacceptable”. The benchmark dataset used for this task is the Corpus of Linguistic Acceptability (CoLA). The dataset consists of texts and their labels.

Example: Books were sent to each other by the students.
Label: Unacceptable

Example: She voted for herself.
Label: Acceptable

Inference

from transformers import pipeline

classifier = pipeline("text-classification", model = "textattack/distilbert-base-uncased-CoLA")
classifier("I will walk to home when I went through the bus.")
##  [{'label': 'unacceptable', 'score': 0.95}]

Metrics for Text Classification
accuracy
Accuracy is the proportion of correct predictions among the total number of cases processed. It can be computed with:

    Accuracy = (TP + TN) / (TP + TN + FP + FN)

where TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives.

recall
Recall is the fraction of the positive examples that were correctly labeled by the model as positive. It can be computed with:

    Recall = TP / (TP + FN)

where TP is the number of true positives and FN is the number of false negatives.

precision
Precision is the fraction of correctly labeled positive examples out of all of the examples that were labeled as positive. It can be computed with:

    Precision = TP / (TP + FP)

where TP is the number of true positives (i.e. the examples correctly labeled as positive) and FP is the number of false positives (i.e. the examples incorrectly labeled as positive).

f1
The F1 metric is the harmonic mean of the precision and recall. It can be calculated as:

    F1 = 2 * (precision * recall) / (precision + recall)
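
The four formulas above can be sketched in plain Python for a binary task. This is a minimal illustration: the function names and the toy labels are hypothetical, and returning 0.0 when a denominator is zero is one possible convention, not one prescribed by the definitions above.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 from the confusion counts."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + tn + fp + fn
    # Assumption: fall back to 0.0 whenever a denominator is zero.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * (precision * recall) / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration: 1 is the positive class.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

With these toy labels there are 3 true positives, 3 true negatives, 1 false positive and 1 false negative, so all four metrics come out to 0.75.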