metadata

language: ISO 639-1 code for your language, or `multilingual`
thumbnail: url to a thumbnail used in social sharing
tags:
  - array
  - of
  - tags
license: any valid license identifier
datasets:
  - array of dataset identifiers
metrics:
  - array of metric identifiers
widget:
  - text: >-
      Plagiarism is the representation of another author's writing, thoughts,
      ideas, or expressions as one's own work.

Longformer-base for Machine-Paraphrased Plagiarism Detection

This is the checkpoint for Longformer-base after being trained on the Machine-Paraphrased Plagiarism Dataset:

Additional information about this model:

The model can be loaded to perform plagiarism detection like so:

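import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the fine-tuned classifier and its matching tokenizer
model = AutoModelForSequenceClassification.from_pretrained("jpelhaw/longformer-base-plagiarism-detection")
tokenizer = AutoTokenizer.from_pretrained("jpelhaw/longformer-base-plagiarism-detection")

text = "Plagiarism is the representation of another author's writing, thoughts, ideas, or expressions as one's own work."

# Encode the text as PyTorch tensors (input_ids, attention_mask)
example = tokenizer(text, return_tensors="pt", truncation=True)

# Run inference without gradient tracking and pick the highest-scoring class
with torch.no_grad():
    logits = model(**example).logits
answer = model.config.id2label[logits.argmax(dim=-1).item()]

# e.g. "plagiarised" (the exact label string depends on the checkpoint's id2label mapping)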
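
A sequence classification model returns one raw score (logit) per class rather than a label string, so the output still has to be decoded. The following is a minimal sketch of that step in plain Python. The label order in `ID2LABEL` and the example logits are assumptions made for illustration; the real mapping should be read from `model.config.id2label`.

```python
import math

# Hypothetical label order; check model.config.id2label for the real mapping.
ID2LABEL = {0: "original", 1: "plagiarised"}


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for numerical stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits, id2label=ID2LABEL):
    """Return the most probable label and its probability for one example."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits for illustration, not real model output.
label, confidence = decode([-1.2, 2.3])
print(label, round(confidence, 3))
```

With a loaded model, the list passed to `decode` would come from the first row of the output logits, for example `logits[0].tolist()`.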