---
license: mit
language:
- en
tags:
- finance
- ContextNER
- language models
datasets:
- him1411/EDGAR10-Q
metrics:
- rouge
---

EDGAR-T5-Large
=============

A T5-Large model fine-tuned on the [EDGAR10-Q dataset](https://huggingface.co/datasets/him1411/EDGAR10-Q).

You may also want to check out:
* Our paper: [CONTEXT-NER: Contextual Phrase Generation at Scale](https://arxiv.org/abs/2109.08079/)
* GitHub: [him1411/edgar10q-dataset](https://github.com/him1411/edgar10q-dataset)

Direct Use
=============

This model can be used to generate text, which is useful for experimentation and for understanding its capabilities. **It should not be directly used for production or work that may directly impact people.**
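
For quick experimentation, the high-level `pipeline` API also works. This is a minimal sketch, not part of the original card; the task name and decoding defaults are assumptions:

```python
from transformers import pipeline

# Seq2seq generation via the high-level pipeline API (task name assumed)
generator = pipeline("text2text-generation", model="him1411/EDGAR-T5-Large")
result = generator("The definite lived intangible assets had estimated weighted average useful lives of 5.9 years and 14.5 years at acquisition.")
print(result[0]["generated_text"])
```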

How to Use
=============

You can load the model with the Transformers library instead of downloading it manually. The [T5-Large model](https://huggingface.co/t5-large) is the backbone of our model. Here is how to load it in PyTorch:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("him1411/EDGAR-T5-Large")
model = AutoModelForSeq2SeqLM.from_pretrained("him1411/EDGAR-T5-Large")
```
Or just clone the model repo:
```
git lfs install
git clone https://huggingface.co/him1411/EDGAR-T5-Large
```
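
After cloning, the same classes can load the checkpoint from the local directory. A minimal sketch, assuming the clone lives at `./EDGAR-T5-Large`:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load from the local clone instead of the Hub (local path is an assumption)
tokenizer = AutoTokenizer.from_pretrained("./EDGAR-T5-Large")
model = AutoModelForSeq2SeqLM.from_pretrained("./EDGAR-T5-Large")
```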

Inference Example
=============

Here we provide an example for the ContextNER task, using one instance from the EDGAR10-Q dataset.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("him1411/EDGAR-T5-Large")
model = AutoModelForSeq2SeqLM.from_pretrained("him1411/EDGAR-T5-Large")

# One instance from the EDGAR10-Q dataset
input_text = "14.5 years . The definite lived intangible assets related to the contracts and trade names had estimated weighted average useful lives of 5.9 years and 14.5 years, respectively, at acquisition."
inputs = tokenizer(input_text, return_tensors="pt")

# The ideal output for this input is 'Definite lived intangible assets weighted average remaining useful life'
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
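
Since the card lists ROUGE as the evaluation metric, generated phrases can be scored against the gold phrase with the `evaluate` library. This is a minimal sketch, not part of the original card; the package choice and installation are assumptions:

```python
# Requires: pip install evaluate rouge_score  (assumed dependencies)
import evaluate

rouge = evaluate.load("rouge")
prediction = "definite lived intangible assets weighted average remaining useful life"
reference = "Definite lived intangible assets weighted average remaining useful life"
print(rouge.compute(predictions=[prediction], references=[reference]))
```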

BibTeX Entry and Citation Info
===============

If you use our model, please cite our paper:

```bibtex
@article{gupta2021context,
  title={Context-NER: Contextual Phrase Generation at Scale},
  author={Gupta, Himanshu and Verma, Shreyas and Kumar, Tarun and Mishra, Swaroop and Agrawal, Tamanna and Badugu, Amogh and Bhatt, Himanshu Sharad},
  journal={arXiv preprint arXiv:2109.08079},
  year={2021}
}
```