tobiadefami committed
Commit fae9a16 · verified · 1 Parent(s): 0e20f15

Update README.md

Files changed (1)
  1. README.md +44 -3
README.md CHANGED

---
license: apache-2.0
---

### Model Description
DocModel is a document understanding model built on the RoBERTa architecture. It captures both textual content and 2D spatial relationships, making it well suited to tasks that require processing complex document layouts, such as forms, tables, and scanned documents.

- Developed by: Oluwatobi Adefami, Madison May
- Model type: Document Understanding (Information Extraction)
- License: Apache-2.0

### Model Sources
- Repository: https://github.com/Tobiadefami/docmodel
- Model Hub: https://huggingface.co/tobiadefami/docmodel-base

### Uses
DocModel can be used directly for document processing, form understanding, and entity extraction from structured and semi-structured documents.

### Out-of-Scope Use
Not recommended for tasks that involve purely textual data without layout components, or for heavily distorted document scans.

### Bias, Risks, and Limitations
DocModel's performance may degrade on highly noisy or poorly structured documents, such as extreme distortions or low-resolution scans.

### Recommendations
Users should be mindful of the model's limitations, particularly in handling documents with severe layout inconsistencies.

### How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("tobiadefami/docmodel-base")
model = AutoModelForTokenClassification.from_pretrained("tobiadefami/docmodel-base")

# Example usage: tokenize a document's text and run a forward pass
inputs = tokenizer("Your document text here...", return_tensors="pt")
outputs = model(**inputs)
```
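The forward pass above returns raw logits. As a minimal follow-up sketch (generic `transformers`/PyTorch token-classification decoding, not anything DocModel-specific), the logits can be mapped to per-token labels as shown below, reusing `tokenizer`, `model`, `inputs`, and `outputs` from the snippet above. Meaningful labels assume the checkpoint carries a fine-tuned token-classification head; otherwise `model.config.id2label` only contains placeholder names.

```python
# Generic decoding of token-classification logits (continues the snippet above).
# Assumes a fine-tuned head; an untrained head yields arbitrary labels.
predicted_ids = outputs.logits.argmax(dim=-1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
labels = [model.config.id2label[int(i)] for i in predicted_ids]

for token, label in zip(tokens, labels):
    print(f"{token:>15}  {label}")
```

Note that this plain-text example does not exercise the 2D layout signal described above.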
### Evaluation

#### Metrics

- Eval loss: 1.36752
- F1 score: 0.84126

### Results

DocModel has been evaluated on the FUNSD dataset for information extraction tasks, demonstrating competitive performance in both loss and F1 score.
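For context on how an F1 number like this is typically obtained, entity-level F1 over FUNSD-style BIO tags is commonly computed with the `seqeval` library. The snippet below is only an illustration with toy label sequences (using FUNSD's HEADER/QUESTION/ANSWER entity types); it is not the evaluation script behind the figures above.

```python
from seqeval.metrics import classification_report, f1_score

# Toy gold and predicted BIO sequences over FUNSD-style entity types; illustrative only.
y_true = [["B-QUESTION", "I-QUESTION", "B-ANSWER", "O"],
          ["B-HEADER", "I-HEADER", "O", "O"]]
y_pred = [["B-QUESTION", "I-QUESTION", "B-ANSWER", "O"],
          ["B-HEADER", "O", "O", "O"]]

print(f"Entity-level F1: {f1_score(y_true, y_pred):.5f}")
print(classification_report(y_true, y_pred))
```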