---
license: apache-2.0
language:
- hi
- en
base_model: teknium/OpenHermes-2.5
---


This model was fine-tuned from teknium/OpenHermes-2.5 on Hindi and English data.

Try it out: https://colab.research.google.com/drive/1A_hbsq1vrCeAh3dEMvtwxxNxcNZ1BUyW?usp=sharing

For sample responses to different prompts, check out: https://github.com/manishiitg/hi-llm-eval
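
Below is a minimal inference sketch using `transformers`. The repo id (`manishiitg/open-aditi-hi-v2`) and the ChatML prompt format (inherited from the teknium/OpenHermes-2.5 base) are assumptions, not confirmed by this card; adjust them if this repository differs.

```python
# Minimal inference sketch; the repo id and chat format below are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "manishiitg/open-aditi-hi-v2"  # assumed repo id; replace if different

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# OpenHermes-2.5 models use ChatML; apply_chat_template reads the template
# stored with the tokenizer.
messages = [{"role": "user", "content": "भारत की राजधानी क्या है?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```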


#### Language: Hindi (hi)

| Model | implicit_hate | flores | indicwikibio | hellaswag-indic | truthfulqa-hi | boolq-hi | indicheadline | indic-arc-easy | indicqa | indic-arc-challenge | indicsentiment | xlsum-hi | indicxparaphrase | mmlu_hi |  
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | 
| open-aditi-hi-v2 |  11.5021 | 43.6822 | 0.4846 | 0.2404 | 0.6934 | 0.8541 | 0.4565 | 0.4979 | 0.0795 | 0.4462 | 0.9729 | 0.4213 | 0.6838 | 0.3253 |
| OpenHermes-2.5-Mistral-7B |  0.2068 | 30.3465 | 0.3332 | 0.2485 | 0.3234 | 0.5979 | 0.1996 | 0.3523 | 0.2721 | 0.3396 | 0.9048 | 0.1774 | 0.8766 | 0.2769 |
| open-aditi-hi-v1 |  8.6105 | 40.2376 | 0.4104 | 0.0848 | 0.4230 | 0.3758 | 0.4248 | 0.3889 | 0.1306 | 0.3558 | 0.8798 | 0.4212 | 0.5939 | 0.1398 |
| Airavata |  0.0663 | 58.0555 | 0.0637 | 0.0254 | 0.2122 | 0.0373 | 0.4346 | 0.1128 | 0.1008 | 0.0836 | 0.8437 | 0.4650 | 0.3277 | 0.1336 |

#### Language: English (en)

| Model | boolq | hellaswag | mmlu | truthfulqa | xlsum | arc-easy-exact | arc-challenge |  
| --- | --- | --- | --- | --- | --- | --- | --- | 
| OpenHermes-2.5-Mistral-7B |  0.4061 | 0.7999 | 0.5991 | 0.2081 | 0.4328 | 0.8687 | 0.7790 |
| open-aditi-hi-v2 |  0.3982 | 0.4738 | 0.5544 | 0.2999 | 0.4349 | 0.8388 | 0.7235 |
| open-aditi-hi-v1 |  0.0434 | 0.3509 | 0.2597 | 0.3317 | 0.4288 | 0.7588 | 0.6271 |
| Airavata |  0.0437 | 0.0277 | 0.1165 | 0.3586 | 0.4393 | 0.2534 | 0.1630 |

#### Metrics

| Task | Metric |
| --- | --- |
| flores | chrf |
| implicit_hate | chrf |
| indicsentiment | accuracy |
| indicxparaphrase | accuracy |
| boolq-hi | accuracy |
| truthfulqa-hi | accuracy |
| indic-arc-easy | accuracy |
| indicwikibio | bleurt |
| xlsum-hi | bleurt |
| indicheadline | bleurt |
| indic-arc-challenge | accuracy |
| mmlu_hi | average_acc |
| indicqa | accuracy |
| hellaswag-indic | accuracy |
| arc-easy-exact | accuracy |
| hellaswag | accuracy |
| arc-challenge | accuracy |
| mmlu | average_acc |
| xlsum | bleurt |
| boolq | accuracy |
| truthfulqa | accuracy |
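
For reference, a chrf score like the flores numbers above can be computed with `sacrebleu`. This is a sketch assuming standard corpus-level chrf; the actual harness lives in the hi-llm-eval repo and may differ in detail.

```python
# Sketch of a corpus-level chrf computation with sacrebleu; the actual
# evaluation code is in the hi-llm-eval repo and may differ.
from sacrebleu.metrics import CHRF

predictions = ["यह एक उदाहरण वाक्य है।"]   # hypothetical model outputs
references = [["यह एक उदाहरण वाक्य है।"]]  # one reference stream, aligned with predictions

score = CHRF().corpus_score(predictions, references)
print(round(score.score, 4))  # 0-100 scale, matching the tables above
```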

Model evaluation on the Open LLM Leaderboard:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/5dfae476da6d0311fd3d5432/ENzZwV2Z98uNlpyUz3Blp.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/5dfae476da6d0311fd3d5432/SpSiu5lzA6JKJx8ICX_zd.png)