Commit 0212f78 (committed by OleehyO)
1 parent: ae6fc38

Upload ./README_zh.md with huggingface_hub

Files changed (1)
  1. README_zh.md +18 -4
README_zh.md CHANGED
@@ -1,7 +1,21 @@
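- # About TexTeller

- TexTeller is a ViT-based, end-to-end formula recognition model that converts images into the corresponding LaTeX formulas.

- TexTeller was trained on 550K image-formula pairs (the dataset is available [here](https://huggingface.co/datasets/OleehyO/latex-formulas)). Compared with [LaTeX-OCR](https://github.com/lukas-blecher/LaTeX-OCR), which used a 100K dataset, TexTeller has **stronger generalization ability** and **higher accuracy**, and can **cover most of your use cases**.

- > For details, see the [TexTeller GitHub repository](https://github.com/OleehyO/TexTeller?tab=readme-ov-file).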
+ ---
+ license: mit
+ datasets:
+ - OleehyO/latex-formulas
+ metrics:
+ - bleu
+ pipeline_tag: image-to-text
+ ---

+ [中文版本](./README_zh.md)

+ # About TexTeller

+ * 📮[2024-03-25] TexTeller 2.0 released! The training data for TexTeller 2.0 has been increased to 7.5M image-formula pairs (about **15 times more** than TexTeller 1.0, with improved data quality). TexTeller 2.0 demonstrated **superior performance** on the test set, especially in recognizing rare symbols, complex multi-line formulas, and matrices.
+ > More test images, along with a side-by-side comparison of recognition models from different companies, are available [here](https://github.com/OleehyO/TexTeller/blob/main/assets/test.pdf).
+
+ TexTeller is a ViT-based model designed for end-to-end formula recognition. It can recognize formulas in natural images and convert them into LaTeX-style formulas.
+
+ TexTeller is trained on a larger dataset of image-formula pairs (a 550K dataset available [here](https://huggingface.co/datasets/OleehyO/latex-formulas)) and **exhibits superior generalization ability and higher accuracy compared to [LaTeX-OCR](https://github.com/lukas-blecher/LaTeX-OCR)**, which uses approximately 100K data points. This larger dataset enables TexTeller to cover most usage scenarios more effectively.
+
+ > For more details, please refer to the 𝐓𝐞𝐱𝐓𝐞𝐥𝐥𝐞𝐫 [GitHub repository](https://github.com/OleehyO/TexTeller?tab=readme-ov-file).