nie3e committed on
Commit
5b407d9
1 Parent(s): eeeff76

Adds evaluation on another dataset.

Files changed (1)
  1. README.md +56 -2
README.md CHANGED
@@ -5,7 +5,15 @@ metrics:
5
  - accuracy
6
  model-index:
7
  - name: sentiment-polish-gpt2-small
8
-   results: []
9
  license: mit
10
  language:
11
  - pl
@@ -39,7 +47,12 @@ Train/test split: 80%/20%
39
  Datacollator:
40
  ```py
41
  from transformers import DataCollatorWithPadding
42
- data_collator = DataCollatorWithPadding(tokenizer=tokenizer, padding="longest", max_length=128, pad_to_multiple_of=8)
43
  ```
44
 
45
  ## Training procedure
@@ -77,6 +90,47 @@ The following hyperparameters were used during training:
77
  | 0.0069 | 9.0 | 29557 | 0.4529 | 0.9622 |
78
  | 0.0059 | 10.0 | 32840 | 0.4659 | 0.9627 |
79
 
80
 
81
  ### Framework versions
82
 
 
5
  - accuracy
6
  model-index:
7
  - name: sentiment-polish-gpt2-small
8
+   results:
9
+   - task:
10
+       type: text-classification
11
+     dataset:
12
+       type: allegro/klej-polemo2-out
13
+       name: klej-polemo2-out
14
+     metrics:
15
+     - type: accuracy
16
+       value: 98.38%
17
  license: mit
18
  language:
19
  - pl
 
47
  Datacollator:
48
  ```py
49
  from transformers import DataCollatorWithPadding
50
+ data_collator = DataCollatorWithPadding(
51
+     tokenizer=tokenizer,
52
+     padding="longest",
53
+     max_length=128,
54
+     pad_to_multiple_of=8
55
+ )
56
  ```
57
 
58
  ## Training procedure
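Note on the data collator hunk above: the change only reflows the call across several lines, and the arguments are unchanged. As a rough illustration of what `padding="longest"` combined with `pad_to_multiple_of=8` means, the following pure-Python sketch pads a batch to its longest sequence and rounds that length up to a multiple of 8. The helper name `pad_batch`, the pad token id `0`, and right-side padding are assumptions made for illustration; this is not the library's implementation, and it ignores `max_length` and truncation.

```python
def pad_batch(batch, pad_token_id=0, pad_to_multiple_of=8):
    """Pad every sequence to the longest one in the batch, rounded up to a multiple.

    Hypothetical helper: pad_token_id=0 and right-side padding are assumptions.
    """
    longest = max(len(seq) for seq in batch)
    if pad_to_multiple_of:
        # Ceiling division: round the target length up, e.g. 13 -> 16 for a multiple of 8.
        target = -(-longest // pad_to_multiple_of) * pad_to_multiple_of
    else:
        target = longest
    input_ids = [seq + [pad_token_id] * (target - len(seq)) for seq in batch]
    # 1 marks real tokens, 0 marks padding positions.
    attention_mask = [[1] * len(seq) + [0] * (target - len(seq)) for seq in batch]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Toy batch with sequence lengths 3 and 13 (token ids are made up).
batch = [[11, 12, 13], list(range(21, 34))]
padded = pad_batch(batch)
print([len(ids) for ids in padded["input_ids"]])  # prints [16, 16]
```

Padding to a multiple of 8 is commonly chosen because it tends to suit mixed-precision kernels on recent GPUs; whether it helps in this setup is not stated in the commit.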
 
90
  | 0.0069 | 9.0 | 29557 | 0.4529 | 0.9622 |
91
  | 0.0059 | 10.0 | 32840 | 0.4659 | 0.9627 |
92
 
93
+ ### Evaluation
94
+
95
+ Evaluated on the test split of [allegro/klej-polemo2-out](https://huggingface.co/datasets/allegro/klej-polemo2-out).
96
+ ```py
97
+ from datasets import load_dataset
98
+ from evaluate import evaluator
99
+
100
+ data = load_dataset("allegro/klej-polemo2-out", split="test").shuffle(seed=42)
101
+ task_evaluator = evaluator("text-classification")
102
+
103
+ # fix labels
104
+ l = {
105
+     "__label__meta_zero": 0,
106
+     "__label__meta_minus_m": 1,
107
+     "__label__meta_plus_m": 2,
108
+     "__label__meta_amb": 3
109
+ }
110
+ def fix_labels(examples):
111
+     examples["target"] = l[examples["target"]]
112
+     return examples
113
+ data = data.map(fix_labels)
114
+
115
+ eval_results = task_evaluator.compute(
116
+     model_or_pipeline="nie3e/sentiment-polish-gpt2-small",
117
+     data=data,
118
+     label_mapping={"NEUTRAL": 0, "NEGATIVE": 1, "POSITIVE": 2, "AMBIGUOUS": 3},
119
+     input_column="sentence",
120
+     label_column="target"
121
+ )
122
+
123
+ print(eval_results)
124
+ ```
125
+
126
+ ```json
127
+ {
128
+     "accuracy": 0.9838056680161943,
129
+     "total_time_in_seconds": 5.2441766999982065,
130
+     "samples_per_second": 94.1997244296076,
131
+     "latency_in_seconds": 0.010615742307688678
132
+ }
133
+ ```
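Note on the reported numbers: the JSON above does not state how many test examples were scored, but the figures appear to be mutually consistent. A small sketch of the arithmetic, assuming throughput is simply samples divided by total time and that accuracy is correct predictions divided by samples (the variable names are illustrative only):

```python
# Values copied from the reported evaluation output.
report = {
    "accuracy": 0.9838056680161943,
    "total_time_in_seconds": 5.2441766999982065,
    "samples_per_second": 94.1997244296076,
    "latency_in_seconds": 0.010615742307688678,
}

# Assumption: samples_per_second = n_samples / total_time_in_seconds.
n_samples = round(report["total_time_in_seconds"] * report["samples_per_second"])
# Assumption: accuracy = n_correct / n_samples.
n_correct = round(report["accuracy"] * n_samples)
n_wrong = n_samples - n_correct

print(n_samples, n_correct, n_wrong)  # prints 494 486 8
```

Under these assumptions the run scored 494 examples with 8 misclassified, which matches the `98.38%` entered in the model-index metadata after rounding.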
134
 
135
  ### Framework versions
136
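Note on the label handling in the evaluation snippet: the dataset stores targets as strings such as `__label__meta_zero`, while the model's pipeline emits names such as `NEUTRAL`, so both sides are mapped onto the same integer ids before accuracy is computed. The sketch below reproduces that alignment in plain Python on made-up data. The two dictionaries are copied from the snippet; the `accuracy` helper and the toy targets and predictions are hypothetical and do not come from the actual evaluation run.

```python
# Dataset-side mapping (the dict named `l` in the snippet).
dataset_label_to_id = {
    "__label__meta_zero": 0,
    "__label__meta_minus_m": 1,
    "__label__meta_plus_m": 2,
    "__label__meta_amb": 3,
}
# Model-side mapping (the `label_mapping` argument in the snippet).
pipeline_label_to_id = {"NEUTRAL": 0, "NEGATIVE": 1, "POSITIVE": 2, "AMBIGUOUS": 3}


def accuracy(references, predictions):
    """Fraction of positions where the predicted id equals the reference id."""
    matches = sum(r == p for r, p in zip(references, predictions))
    return matches / len(references)


# Hypothetical toy data: four targets and four pipeline outputs.
targets = ["__label__meta_zero", "__label__meta_plus_m", "__label__meta_minus_m", "__label__meta_amb"]
predicted = ["NEUTRAL", "POSITIVE", "NEGATIVE", "POSITIVE"]

references = [dataset_label_to_id[t] for t in targets]
predictions = [pipeline_label_to_id[p] for p in predicted]
score = accuracy(references, predictions)
print(score)  # prints 0.75
```

If the two mappings disagreed on an id (for example, swapping 1 and 2 on one side only), every negative and positive example would be counted as wrong, so the pairing of names to ids matters more than the particular integers chosen.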