gabriel-p committed on
Commit
91c98bb
1 Parent(s): c9e8865

Update README.md

Files changed (1)
  1. README.md +34 -0
README.md CHANGED
@@ -1,3 +1,37 @@
1
  ---
2
  license: openrail
3
  ---
4
+
5
+
6
+ <h3 align="center">PDF Paragraphs Extraction</h3>
7
+ <p align="center">A model for extracting paragraphs from PDFs</p>
8
+
9
+ This model uses features of a PDF to extract its text and split it into paragraphs. It can be run as a service.
10
+
11
+ Each extracted paragraph includes its page number, its position on the page, its size, and its text.
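
To make the shape of that output concrete, here is a minimal sketch of how a client might hold the extracted paragraphs. The README lists only what each paragraph carries, so the class name `Paragraph`, every field name, and the helper `paragraphs_on_page` are hypothetical and may not match the keys the service actually returns.

```python
from dataclasses import dataclass


@dataclass
class Paragraph:
    # All field names are assumptions; the README only says a paragraph has
    # a page number, a position on the page, a size, and the text.
    page_number: int
    left: float
    top: float
    width: float
    height: float
    text: str


def paragraphs_on_page(paragraphs, page_number):
    """Return the paragraphs of one page, ordered from top to bottom."""
    on_page = [p for p in paragraphs if p.page_number == page_number]
    return sorted(on_page, key=lambda p: p.top)
```

Sorting by the top coordinate assumes the origin is at the top of the page; if the service reports positions from the bottom, the order would need to be reversed.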
12
+
13
+
14
+ ## Quick Start
15
+
16
+ Download the service that uses the model:
17
+
18
+ git clone https://github.com/huridocs/pdf_paragraphs_extraction.git
19
+ cd pdf_paragraphs_extraction
20
+
21
+ Start the service:
22
+
23
+ ./run start
24
+
25
+ Get the paragraphs from a PDF:
26
+
27
+ curl -X GET -F 'file=@/PATH/TO/PDF/pdf_name.pdf' localhost:5051
28
+
29
+ Stop the service:
30
+
31
+ ./run stop
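
The curl request from the Quick Start can also be sent from Python. The following is a sketch under stated assumptions, not code from the service: the function names `build_multipart` and `extract_paragraphs` are hypothetical, and the request simply mirrors the curl call (same method, same `file` form field, same `localhost:5051` address). It uses only the standard library and returns the raw response bytes, because the README does not specify the response format.

```python
import os
import urllib.request
import uuid


def build_multipart(field_name, filename, payload):
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def extract_paragraphs(pdf_path, url="http://localhost:5051"):
    """Send a PDF to the running service and return the raw response bytes."""
    with open(pdf_path, "rb") as pdf_file:
        body, content_type = build_multipart(
            "file", os.path.basename(pdf_path), pdf_file.read()
        )
    # method="GET" mirrors the `curl -X GET -F ...` call shown in the README.
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="GET"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

With the service started, `extract_paragraphs("/PATH/TO/PDF/pdf_name.pdf")` would send the same request as the curl command; decoding the returned bytes is left to the caller.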
32
+
33
+
34
+ ## Performance
35
+
36
+ Accuracy: 93.9%
37
+ Speed: TBD