Model Card for TrOCR_Math_handwritten
Model Details
A TrOCR model fine-tuned on part of the MathWriting dataset, whose InkML files were converted into images. The underlying TrOCR model was introduced in the paper TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models by Li et al. and first released in this repository.
- Developed by: [More Information Needed]
- Model type: Transformer OCR
- License: afl-3.0
- Finetuned from model: TrOCR_large_stage1
Uses
Here is how to use this model in PyTorch:
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
import requests

# Placeholder: replace with the URL of an image that contains a single formula
url = "path/to/image"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Load the processor and the fine-tuned model from the Hugging Face Hub
processor = TrOCRProcessor.from_pretrained('fhswf/TrOCR_Math_handwritten')
model = VisionEncoderDecoderModel.from_pretrained('fhswf/TrOCR_Math_handwritten')

# Preprocess the image, generate token IDs, and decode them into text
pixel_values = processor(images=image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
Bias, Risks, and Limitations
The raw model is intended for optical character recognition (OCR) on images that contain exactly one mathematical formula.
Training Details
Training Data
This model was fine-tuned on part of the MathWriting dataset, whose InkML files were converted into images.
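The card does not describe how the InkML files were converted, so the following is only a minimal sketch of one possible approach. It assumes each InkML trace element holds comma-separated samples whose first two fields are x and y; the function names, the 384-pixel canvas, the margin, and the stroke width are all hypothetical choices, not the settings used to train this model.

```python
import xml.etree.ElementTree as ET


def parse_inkml_traces(inkml_text):
    """Return a list of strokes, each a list of (x, y) float pairs."""
    root = ET.fromstring(inkml_text)
    strokes = []
    for element in root.iter():
        # Compare the local tag name so namespaced and plain files both work.
        if element.tag.split("}")[-1] != "trace":
            continue
        points = []
        for sample in (element.text or "").split(","):
            fields = sample.split()
            if len(fields) >= 2:  # ignore extra fields such as timestamps
                points.append((float(fields[0]), float(fields[1])))
        if points:
            strokes.append(points)
    return strokes


def fit_to_canvas(strokes, size=384, margin=16):
    """Scale and shift strokes into a size x size canvas, keeping aspect ratio."""
    xs = [x for stroke in strokes for x, _ in stroke]
    ys = [y for stroke in strokes for _, y in stroke]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0  # avoid division by zero
    scale = (size - 2 * margin) / span
    return [
        [(margin + (x - min_x) * scale, margin + (y - min_y) * scale) for x, y in stroke]
        for stroke in strokes
    ]


def render_strokes(strokes, size=384, margin=16, width=3):
    """Draw the strokes in black on a white RGB image (requires Pillow)."""
    from PIL import Image, ImageDraw  # imported here so parsing works without Pillow

    image = Image.new("RGB", (size, size), "white")
    draw = ImageDraw.Draw(image)
    for stroke in fit_to_canvas(strokes, size, margin):
        if len(stroke) == 1:
            draw.point(stroke, fill="black")
        else:
            draw.line(stroke, fill="black", width=width)
    return image
```

An image produced by `render_strokes(parse_inkml_traces(text))` could then be passed to the processor in place of the downloaded image in the usage snippet.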
Evaluation
Percentage of exactly correct recognitions: 77.8%
Percentage of recognitions with at most one error: 85.7%
Percentage of recognitions with at most two errors: 89.9%
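The card does not say how an "error" was counted. As an illustration only, the sketch below assumes one error equals one character-level edit (insertion, deletion, or substitution) and reports the share of predictions within a given error budget. The function names `edit_distance` and `accuracy_within` are hypothetical and may not match the procedure behind the figures above.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # deletion
                current[j - 1] + 1,          # insertion
                previous[j - 1] + (x != y),  # substitution, free on a match
            ))
        previous = current
    return previous[-1]


def accuracy_within(predictions, references, max_errors):
    """Percentage of predictions at most max_errors edits away from their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one example is required")
    hits = sum(
        edit_distance(p, r) <= max_errors
        for p, r in zip(predictions, references)
    )
    return 100.0 * hits / len(references)
```

Calling `accuracy_within(predictions, references, 0)` gives the exact-match rate, while budgets of 1 and 2 would correspond to the other two rows under this assumed definition. Passing lists of tokens instead of strings counts errors per token.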
BibTeX:
@misc{li2021trocr,
title={TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models},
author={Minghao Li and Tengchao Lv and Lei Cui and Yijuan Lu and Dinei Florencio and Cha Zhang and Zhoujun Li and Furu Wei},
year={2021},
eprint={2109.10282},
archivePrefix={arXiv},
primaryClass={cs.CL}
}