jadechoghari/textnet-base

jadechoghari committed on Dec 21, 2024

Commit af495e2 · verified · 1 Parent(s): e382b82

Create README.md

Files changed (1)

README.md +53 -0

README.md ADDED

	@@ -0,0 +1,53 @@

+## TextNet-T/S/B: Efficient Text Detection Models
+### **Overview**
+TextNet is a lightweight, efficient architecture designed specifically for text detection, offering better text detection performance than traditional models such as MobileNetV3. It comes in three variants, **TextNet-T**, **TextNet-S**, and **TextNet-B** (6.8M, 8.0M, and 8.9M parameters respectively), which balance accuracy against inference speed.
+### **Performance**
+TextNet achieves state-of-the-art results in text detection, outperforming hand-crafted models in both accuracy and speed. Its architecture is highly efficient, making it ideal for GPU-based applications.
+### **How to use**
+#### Transformers
+```bash
+pip install transformers
+```
+```python
+import torch
+import requests
+from PIL import Image
+from transformers import AutoImageProcessor, AutoBackbone
+
+# Download a sample image from the COCO validation set
+url = "http://images.cocodataset.org/val2017/000000039769.jpg"
+image = Image.open(requests.get(url, stream=True).raw)
+
+# Load the image processor and the TextNet backbone from the Hub
+processor = AutoImageProcessor.from_pretrained("jadechoghari/textnet-base")
+model = AutoBackbone.from_pretrained("jadechoghari/textnet-base")
+
+# Preprocess the image and run a forward pass without tracking gradients
+inputs = processor(image, return_tensors="pt")
+with torch.no_grad():
+    outputs = model(**inputs)
+```
+### **Training**
+We first compare TextNet with representative hand-crafted backbones,
+such as ResNets and VGG16. For a fair comparison,
+all models are first pre-trained on IC17-MLT [52] and then
+fine-tuned on Total-Text. The proposed
+TextNet models achieve a significantly better trade-off between accuracy
+and inference speed than previous hand-crafted models.
+Notably, TextNet-T, -S, and
+-B have only 6.8M, 8.0M, and 8.9M parameters respectively,
+making them more parameter-efficient than ResNets and VGG16.
+These results demonstrate that TextNet models are effective for
+text detection on GPU devices.
+### **Applications**
+TextNet is well suited to real-world text detection tasks, including:
+- Natural scene text recognition
+- Multi-lingual and multi-oriented text detection
+- Document text region analysis
+### **Contribution**
+This model was contributed by [Raghavan](https://huggingface.co/Raghavan),
+[jadechoghari](https://huggingface.co/jadechoghari)
+and [nielsr](https://huggingface.co/nielsr).
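
The parameter counts quoted in the README (6.8M, 8.0M, and 8.9M) can be sanity-checked on a loaded checkpoint. The helper below is a minimal sketch and not part of the model card: `count_parameters` and `format_millions` are hypothetical names, and `model` is assumed to be a `torch.nn.Module` such as the backbone returned by `AutoBackbone.from_pretrained`. A backbone-only checkpoint may not match the quoted figures exactly.

```python
def count_parameters(model, trainable_only=False):
    """Return the total number of parameters in a PyTorch-style module.

    `model` is assumed to expose `.parameters()`, as `torch.nn.Module` does.
    Set `trainable_only=True` to skip parameters with `requires_grad=False`.
    """
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )


def format_millions(count):
    """Format a raw parameter count in millions with one decimal place."""
    return f"{count / 1e6:.1f}M"
```

With the model loaded as in the README snippet, `format_millions(count_parameters(model))` returns the size of the loaded backbone as a short string that can be compared against the figures above.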