---
tags:
- generated_from_trainer
model-index:
- name: ruBert-base-finetuned-pos
  results: []
license: mit
datasets:
- disk0dancer/ru_sentances_pos
language:
- ru
metrics:
- accuracy
- f1
pipeline_tag: token-classification
library_name: transformers
---

# ruBert-base-finetuned-pos

This model was fine-tuned from [ai-forever/ruBert-base](https://huggingface.co/ai-forever/ruBert-base) on the [disk0dancer/ru_sentances_pos](https://hf.co/datasets/disk0dancer/ru_sentances_pos) dataset.
All docs and code can be found on [GitHub](https://github.com/disk0Dancer/rubert-finetuned-pos).
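
A minimal usage sketch with the `transformers` token-classification pipeline follows. The repository id `disk0dancer/ruBert-base-finetuned-pos` is an assumption pieced together from the author's namespace and the model name (it is not stated in this card), and the helper names `tag_sentence` and `to_pairs` are purely illustrative.

```python
from typing import Dict, List, Tuple

# Assumption: the Hub repository id is not given in this card; adjust it if needed.
MODEL_ID = "disk0dancer/ruBert-base-finetuned-pos"


def tag_sentence(text: str, model_id: str = MODEL_ID) -> List[Dict]:
    """Run the token-classification pipeline and return its raw per-token output."""
    # Imported lazily so that to_pairs() can be used without transformers installed.
    from transformers import pipeline

    tagger = pipeline("token-classification", model=model_id)
    return tagger(text)


def to_pairs(predictions: List[Dict]) -> List[Tuple[str, str]]:
    """Reduce pipeline output to (token, tag) pairs."""
    return [(p["word"], p["entity"]) for p in predictions]


if __name__ == "__main__":
    # Downloads the model on first use, so this needs network access.
    for token, tag in to_pairs(tag_sentence("Мама мыла раму.")):
        print(token, tag)
```

Without an aggregation strategy the pipeline returns one prediction per word piece, so a single word may appear as several tokens, each with its own tag.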

It achieves the following results on the evaluation set:
- eval_loss: 0.1544
- eval_precision: 0.8561
- eval_recall: 0.8723
- eval_f1: 0.8642
- eval_accuracy: 0.8822
- eval_runtime: 0.2476
- eval_samples_per_second: 80.775
- eval_steps_per_second: 8.078
- step: 0
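
As a quick sanity check, the reported F1 can be compared with the harmonic mean of the reported precision and recall. This sketch assumes that `eval_f1` was computed that way from the same aggregate counts, which the card does not state. The function name `f1_from_precision_recall` is illustrative.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation results listed above.
reported_precision = 0.8561
reported_recall = 0.8723
reported_f1 = 0.8642

recomputed_f1 = f1_from_precision_recall(reported_precision, reported_recall)

# The inputs are rounded to four decimals, so only approximate agreement is expected.
print(f"recomputed F1: {recomputed_f1:.4f}, reported F1: {reported_f1:.4f}")
```

The two values agree to within rounding of the published four-decimal figures.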

## Model description

BERT + Dense + Softmax + Dropout

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f73a86f9678931cad645df/fnHI0M7WAQ1AkgfXOTIx6.png)


## Training and evaluation data

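The model was trained for token classification (part-of-speech tagging) on the [disk0dancer/ru_sentances_pos](https://hf.co/datasets/disk0dancer/ru_sentances_pos) dataset.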

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
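
To illustrate what `lr_scheduler_type: linear` means for the learning rate above, here is a small pure-Python sketch of a linear decay schedule. It assumes no warmup (the card lists no warmup setting). The total step count of 1000 is a hypothetical value, since the card does not give the dataset size, and `linear_lr` is an illustrative helper rather than a library function.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 2e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at the end of warmup down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total = 1000  # hypothetical; in practice epochs * batches per epoch
    for step in (0, 250, 500, 750, 1000):
        print(step, linear_lr(step, total))
```

With these assumptions the rate starts at 2e-05, is halved at the midpoint of training, and reaches zero on the final step.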

### Framework versions

- Transformers 4.39.0
- PyTorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2


## Cite

```bibtex
@misc{churakov2024postagginghighlightskeletalstructure,
      title={POS-tagging to highlight the skeletal structure of sentences}, 
      author={Grigorii Churakov},
      year={2024},
      eprint={2411.14393},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2411.14393}, 
}
```