habiakl committed on
Commit 3cbe94e
1 Parent(s): eaf800d

Add model weights

README.md ADDED
@@ -0,0 +1,32 @@
1
+ # Financial Relation Extraction
2
+
3
+ ## Process
4
+
5
+ The model detects whether a relationship exists between financial terms in a sentence and, if one exists, classifies its type. Examples:
6
+
7
+ * An A-B trust is a joint trust created by a married couple for the purpose of minimizing estate taxes. (<em>Relationship **exists**, type: **is**</em>)
8
+ * There are no withdrawal penalties. (<em>Relationship **does not exist**, type: **x**</em>)
9
+
10
+ ## Data
11
+ The data consists of definitions of financial indicators collected from several sources (Wikimedia, IFRS, Investopedia). Each definition was split into sentences, and the term relationships in each sentence were extracted using the [Stanford Open Information Extraction](https://nlp.stanford.edu/software/openie.html) module.
12
+ A typical row in the dataset consists of a definition sentence and its corresponding relationship label.
13
+ The labels were restricted to the five most frequently identified relationships: **x** (no relationship), **has**, **is in**, **is**, and **are**.
14
+
15
+
16
+ ## Model
17
+ The model is a standard DistilBERT-base transformer (`distilbert-base-uncased`) from the Hugging Face library. See the [Hugging Face DistilBERT base model](https://huggingface.co/distilbert-base-uncased) card for more details about the model.
18
+ In addition, the model has been pretrained to initialize weights that would otherwise be unused if loaded from an existing pretrained stock model.
19
+
20
+ ## Metrics
21
+ The evaluation metrics are precision, recall, and F1-score. The classification report on the test set follows.
22
+
23
+ | relation | precision | recall | f1-score | support |
24
+ | ------------- |:-------------:|:-------------:|:-------------:| -----:|
25
+ | has | 0.7416 | 0.9674 | 0.8396 | 2362 |
26
+ | is in | 0.7813 | 0.7925 | 0.7869 | 2362 |
27
+ | is | 0.8650 | 0.6863 | 0.7653 | 2362 |
28
+ | are | 0.8365 | 0.8493 | 0.8429 | 2362 |
29
+ | x | 0.9515 | 0.8302 | 0.8867 | 2362 |
30
+ | | | | | |
31
+ | macro avg | 0.8352 | 0.8251 | 0.8243 | 11810 |
32
+ | weighted avg | 0.8352 | 0.8251 | 0.8243 | 11810 |
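As a sanity check on the classification report in the README, the per-class rows can be used to recompute the F1-scores and the two average rows. The following is a minimal sketch, not part of the commit: the numbers are copied from the table, and the helper names (`f1_score`, `macro_average`, `weighted_average`) are illustrative assumptions rather than anything the author used.

```python
# Per-class rows copied from the classification report:
# (precision, recall, f1, support).
report = {
    "has":   (0.7416, 0.9674, 0.8396, 2362),
    "is in": (0.7813, 0.7925, 0.7869, 2362),
    "is":    (0.8650, 0.6863, 0.7653, 2362),
    "are":   (0.8365, 0.8493, 0.8429, 2362),
    "x":     (0.9515, 0.8302, 0.8867, 2362),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(rows, column):
    """Unweighted mean of one metric column across classes."""
    return sum(row[column] for row in rows) / len(rows)


def weighted_average(rows, column):
    """Mean of one metric column, weighted by each class's support."""
    total = sum(row[3] for row in rows)
    return sum(row[column] * row[3] for row in rows) / total


rows = list(report.values())
macro = tuple(macro_average(rows, c) for c in range(3))
weighted = tuple(weighted_average(rows, c) for c in range(3))
total_support = sum(row[3] for row in rows)

for label, (p, r, f1, _) in report.items():
    print(f"{label:>5}: reported F1 {f1:.4f}, recomputed {f1_score(p, r):.4f}")
print("macro avg:   ", [round(v, 4) for v in macro])
print("weighted avg:", [round(v, 4) for v in weighted])
print("total support:", total_support)
```

Because every class has the same support (2362), the macro and weighted averages coincide, which is why the last two rows of the table are identical. Recomputed F1 values can differ from the reported ones in the fourth decimal place, since the inputs are already rounded.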
config.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:07f7615afabda7ff754ea77e3a06a2d218132bfcc3aa42e22f22ac1585bd7718
3
+ size 774
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ff72da51e34eb2d892c303c6f5f7beed57b13965a4c9a0e1379fb95654da8d30
3
+ size 267872407
special_tokens_map.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:303df45a03609e4ead04bc3dc1536d0ab19b5358db685b6f3da123d05ec200e3
3
+ size 112
tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ecafc709dc78a0d00e3bc20477606e97ccfd239fdfddd7e53fcb24300ba0bc13
3
+ size 466247
tokenizer_config.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:87fab29eb94840215d6b277994841550362ceff337f16bbf44e9af30fd2fb62d
3
+ size 291
training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e2ab6d3f261a834531ac404acb765265a52a8016338c732b78daa1f299bf6002
3
+ size 2415
vocab.txt ADDED
The diff for this file is too large to render. See raw diff
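The three-line hunks above (`version`, `oid`, `size`) are Git LFS pointer files: the repository stores only these small text stubs, and the actual file contents are fetched by their SHA-256 digest. A minimal sketch of reading such a pointer is shown below. The function name `parse_lfs_pointer` and the returned dictionary keys are illustrative assumptions, and the sample text is the `pytorch_model.bin` pointer from this commit.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is a key, a single space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form '<hash algorithm>:<hex digest>'.
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents for pytorch_model.bin, copied from the diff above.
pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ff72da51e34eb2d892c303c6f5f7beed57b13965a4c9a0e1379fb95654da8d30
size 267872407
"""

pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```

The `size` field is the byte length of the real file, so it can be compared against a downloaded copy before computing the digest.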