phmartins committed
Commit
010b4f8
1 Parent(s): 98c0afb

Update README.md

Files changed (1)
  1. README.md +22 -13
README.md CHANGED
@@ -37,6 +37,10 @@ language:
- ar
- hi
---
+
+ ## *Model updated on September 24*
+
+
# Model Card for EuroLLM-1.7B-Instruct


@@ -105,28 +109,30 @@ Here is a summary of the model hyper-parameters:
We evaluate EuroLLM-1.7B-Instruct on several machine translation benchmarks: FLORES-200, WMT-23, and WMT-24 comparing it with [Gemma-2B](https://huggingface.co/google/gemma-2b) and [Gemma-7B](https://huggingface.co/google/gemma-7b) (also instruction tuned on EuroBlocks).
The results show that EuroLLM-1.7B is substantially better than Gemma-2B in Machine Translation and competitive with Gemma-7B.

+
#### Flores-200
| Model | AVG | AVG en-xx | AVG xx-en | en-ar | en-bg | en-ca | en-cs | en-da | en-de | en-el | en-es-latam | en-et | en-fi | en-fr | en-ga | en-gl | en-hi | en-hr | en-hu | en-it | en-ja | en-ko | en-lt | en-lv | en-mt | en-nl | en-no | en-pl | en-pt-br | en-ro | en-ru | en-sk | en-sl | en-sv | en-tr | en-uk | en-zh-cn | ar-en | bg-en | ca-en | cs-en | da-en | de-en | el-en | es-latam-en | et-en | fi-en | fr-en | ga-en | gl-en | hi-en | hr-en | hu-en | it-en | ja-en | ko-en | lt-en | lv-en | mt-en | nl-en | no-en | pl-en | pt-br-en | ro-en | ru-en | sk-en | sl-en | sv-en | tr-en | uk-en | zh-cn-en |
|--------------------------------|------|-----------|-----------|-------|-------|-------|-------|-------|-------|-------|--------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|----------|-------|-------|-------|-------|-------|-------|-------|----------|-------|-------|-------|-------|-------|-------|-------|--------------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|----------|-------|-------|-------|-------|-------|-------|-------|----------|
- | EuroLLM-1.7B-Instruct | 86.75| 86.49| 87.01|85.00| 89.37| 84.65| 89.23| 89.54| 87.00| 87.68| 86.43| 88.66| 89.18| 87.65| 74.66| 86.68| 76.94| 84.86| 86.43| 88.19| 89.45| 87.33| 87.73| 87.74| 67.66| 87.16| 90.08| 88.10| 89.39| 89.39| 88.11| 88.13| 87.08| 89.67| 87.21| 87.76| 86.45| 86.15| 87.46| 87.49| 87.97| 89.65| 88.80| 86.88| 86.70| 88.19| 88.66| 88.93| 81.95| 87.38| 87.97| 86.69| 87.15| 87.69| 86.78| 86.76| 85.39| 86.23| 76.93| 87.02| 90.24| 85.54| 89.22| 88.81| 86.02| 87.24| 86.81| 89.55| 87.76| 86.37| 86.00 |
- | Gemma-2B-EuroBlocks | 81.56| 78.93 | 84.18 | 75.25 | 82.46 | 83.17 | 82.17 | 84.40 | 83.20 | 79.63 | 84.15 | 72.63 | 81.00 | 85.12 | 38.79 | 82.00 | 67.00 | 81.18 | 78.24 | 84.80 | 87.08 | 82.04 | 73.02 | 68.41 | 56.67 | 83.30 | 86.69 | 83.07 | 86.82 | 84.00 | 84.55 | 77.93 | 76.19 | 80.77 | 79.76 | 84.19 | 84.10 | 83.67 | 85.73 | 86.89 | 86.38 | 88.39 | 88.11 | 84.68 | 86.11 | 83.45 | 86.45 | 88.22 | 50.88 | 86.44 | 85.87 | 85.33 | 85.16 | 86.75 | 85.62 | 85.00 | 81.55 | 81.45 | 67.90 | 85.95 | 89.05 | 84.18 | 88.27 | 87.38 | 85.13 | 85.22 | 83.86 | 87.83 | 84.96 | 85.15 | 85.10 |
- | Gemma-7B-EuroBlocks | 86.16| 85.49 | 86.82 | 83.39 | 88.32 | 85.82 | 88.88 | 89.01 | 86.96 | 86.62 | 86.31 | 84.42 | 88.11 | 87.46 | 61.85 | 86.10 | 77.91 | 87.01 | 85.81 | 87.57 | 89.88 | 87.24 | 84.47 | 83.15 | 67.13 | 86.50 | 90.44 | 87.57 | 89.22 | 89.13 | 88.58 | 86.73 | 84.68 | 88.16 | 86.87 | 88.40 | 87.11 | 86.65 | 87.25 | 88.17 | 87.47 | 89.59 | 88.44 | 86.76 | 86.66 | 87.55 | 88.88 | 88.86 | 73.46 | 87.63 | 88.43 | 87.12 | 87.31 | 87.49 | 87.20 | 87.15 | 85.16 | 85.96 | 78.39 | 86.73 | 90.52 | 85.38 | 89.17 | 88.75 | 86.35 | 86.82 | 86.21 | 89.39 | 88.20 | 86.45 | 86.28 |
+ | EuroLLM-1.7B-Instruct |86.89 | 86.53 | 87.25 | 85.17 | 89.42 | 84.72 | 89.13 | 89.47 | 86.90 | 87.60 | 86.29 | 88.95 | 89.40 | 87.69 | 74.89 | 86.41 | 76.92 | 84.79 | 86.78 | 88.17 | 89.76 | 87.70 | 87.27 | 87.62 | 67.84 | 87.10 | 90.00 | 88.18 | 89.29 | 89.49 | 88.32 | 88.18 | 86.85 | 90.00 | 87.31 | 87.89 | 86.60 | 86.34 | 87.45 | 87.57 | 87.95 | 89.72 | 88.80 | 87.00 | 86.77 | 88.34 | 89.09 | 88.95 | 82.69 | 87.80 | 88.37 | 86.71 | 87.20 | 87.81 | 86.79 | 86.79 | 85.62 | 86.48 | 81.10 | 86.97 | 90.25 | 85.75 | 89.20 | 88.88 | 86.00 | 87.38 | 86.76 | 89.61 | 87.94 |
+ | Gemma-2B-EuroBlocks | 81.59 | 78.97 | 84.21 | 76.68 | 82.73 | 83.14 | 81.63 | 84.63 | 83.15 | 79.42 | 84.05 | 72.58 | 79.73 | 84.97 | 40.50 | 82.13 | 67.79 | 80.53 | 78.36 | 84.90 | 87.43 | 82.98 | 72.29 | 68.68 | 58.55 | 83.13 | 86.15 | 82.78 | 86.79 | 83.14 | 84.61 | 78.18 | 75.37 | 80.89 | 78.38 | 84.38 | 84.35 | 83.88 | 85.77 | 86.85 | 86.31 | 88.24 | 88.12 | 84.79 | 84.90 | 82.51 | 86.32 | 88.29 | 54.78 | 86.53 | 85.83 | 85.41 | 85.18 | 86.77 | 85.78 | 84.99 | 81.65 | 81.78 | 67.27 | 85.92 | 89.07 | 84.14 | 88.07 | 87.17 | 85.23 | 85.09 | 83.95 | 87.57 | 84.77 |
+ | Gemma-7B-EuroBlocks |85.27 | 83.90 | 86.64 | 86.38 | 87.87 | 85.74 | 84.25 | 85.69 | 81.49 | 85.52 | 86.93 | 62.83 | 84.96 | 75.34 | 84.93 | 83.91 | 86.92 | 88.19 | 86.11 | 81.73 | 80.55 | 66.85 | 85.31 | 89.36 | 85.87 | 88.62 | 88.06 | 86.67 | 84.79 | 82.71 | 86.45 | 85.19 | 86.67 | 85.77 | 86.36 | 87.21 | 88.09 | 87.17 | 89.40 | 88.26 | 86.74 | 86.73 | 87.25 | 88.87 | 88.81 | 72.45 | 87.62 | 87.86 | 87.08 | 87.01 | 87.58 | 86.92 | 86.70 | 85.10 | 85.74 | 77.81 | 86.83 | 90.40 | 85.41 | 89.04 | 88.77 | 86.13 | 86.67 | 86.32 | 89.27 | 87.92 |


#### WMT-23
| Model | AVG | AVG en-xx | AVG xx-en | AVG xx-xx | en-de | en-cs | en-uk | en-ru | en-zh-cn | de-en | uk-en | ru-en | zh-cn-en | cs-uk |
|--------------------------------|------|-----------|-----------|-----------|-------|-------|-------|-------|----------|-------|-------|-------|----------|-------|
- | EuroLLM-1.7B-Instruct | 83.13 | 82.91 | 82.48 | 86.87| 81.33| 85.42| 81.61| 82.57| 83.62| 84.24| 85.36| 81.56| 78.76| 86.87 |
- | Gemma-2B-EuroBlocks | 79.86| 78.35 | 81.32 | 81.56 | 76.54 | 76.35 | 77.62 | 78.88 | 82.36 | 82.85 | 83.83 | 80.17 | 78.42 | 81.56 |
- | Gemma-7B-EuroBlocks | 83.90| 83.70 | 83.21 | 87.61 | 82.15 | 84.68 | 83.05 | 83.85 | 84.79 | 84.40 | 85.86 | 82.55 | 80.01 | 87.61 |
+ | EuroLLM-1.7B-Instruct | 82.91 | 83.20 | 81.77 | 86.82 | 81.56 | 85.23 | 81.30 | 82.47 | 83.61 | 85.03 | 84.06 | 85.25 | 81.31 | 78.83 | 79.42 | 86.82 |
+ | Gemma-2B-EuroBlocks | 79.96 | 79.01 | 80.86 | 81.15 | 76.82 | 76.05 | 77.92 | 78.98 | 81.58 | 82.73 | 82.71 | 83.99 | 80.35 | 78.27 | 78.99 | 81.15 |
+ | Gemma-7B-EuroBlocks | 82.76 | 82.26 | 82.70 | 85.98 | 81.37 | 82.42 | 81.54 | 82.18 | 82.90 | 83.17 | 84.29 | 85.70 | 82.46 | 79.73 | 81.33 | 85.98 |
+


#### WMT-24
- | Model | AVG | AVG en-xx | AVG xx-xx | en-es-latam | en-cs | en-ru | en-uk | en-ja | en-zh-cn | en-hi | cs-uk | ja-zh-cn |
- |---------|------|------|-------|-------|-------|-------|--------|--------|-------|-------|-------|-----|
- | EuroLLM-1.7B-Instruct|79.35|79.45|78.96|79.20|81.17|80.82|79.00|80.54|82.39|80.80|71.69|83.16|74.76|
- |Gemma-2B-EuroBlocks| 74.71|74.25|76.57|75.21|78.84|70.40|74.44|75.55|78.32|78.70|62.51|79.97|73.17|
- |Gemma-7B-EuroBlocks| 80.88|80.45|82.60|80.43|81.91|80.14|80.32|82.17|84.08|81.86|72.71|85.55|79.65|
+ | Model | AVG | AVG en-xx | AVG xx-xx | en-de | en-es-latam | en-cs | en-ru | en-uk | en-ja | en-zh-cn | en-hi | cs-uk | ja-zh-cn |
+ |---------|------|------|-------|----|---|-------|-------|--------|--------|-------|-------|-------|-----|
+ | EuroLLM-1.7B-Instruct|79.32 | 79.32 | 79.34 | 79.42 | 80.67 | 80.55 | 78.65 | 80.12 | 82.96 | 80.60 | 71.59 | 83.48 | 75.20 |
+ |Gemma-2B-EuroBlocks| 74.72 | 74.41 | 75.97 | 74.93 | 78.81 | 70.54 | 74.90 | 75.84 | 79.48 | 78.06 | 62.70 | 79.87 | 72.07 |
+ |Gemma-7B-EuroBlocks| 78.67 | 78.34 | 80.00 | 78.88 | 80.47 | 78.55 | 78.55 | 80.12 | 80.55 | 78.90 | 70.71 | 84.33 | 75.66 |


### General Benchmarks
@@ -137,16 +143,19 @@ Results show that EuroLLM-1.7B is superior to TinyLlama-v1.1 and similar to Gemm
#### Arc Challenge
| Model | Average | English | German | Spanish | French | Italian | Portuguese | Chinese | Russian | Dutch | Arabic | Swedish | Hindi | Hungarian | Romanian | Ukrainian | Danish | Catalan |
|--------------------|---------|---------|--------|---------|--------|---------|------------|---------|---------|-------|--------|---------|--------|-----------|----------|-----------|--------|---------|
- | EuroLLM-1.7B | 0.3268 | 0.4070 | 0.3293 | 0.3521 | 0.3370 | 0.3422 | 0.3496 | 0.3060 | 0.3122 | 0.3174 | 0.2866 | 0.3373 | 0.2817 | 0.3031 | 0.3179 | 0.3199 | 0.3248 | 0.3310 |
+ | EuroLLM-1.7B | 0.3496 | 0.4061 | 0.3464 | 0.3684 | 0.3627 | 0.3738 | 0.3855 | 0.3521 | 0.3208 | 0.3507 | 0.3045 | 0.3605 | 0.2928 | 0.3271 | 0.3488 | 0.3516 | 0.3513 | 0.3396 |
| TinyLlama-v1.1 | 0.2650 | 0.3712 | 0.2524 | 0.2795 | 0.2883 | 0.2652 | 0.2906 | 0.2410 | 0.2669 | 0.2404 | 0.2310 | 0.2687 | 0.2354 | 0.2449 | 0.2476 | 0.2524 | 0.2494 | 0.2796 |
| Gemma-2B | 0.3617 | 0.4846 | 0.3755 | 0.3940 | 0.4080 | 0.3687 | 0.3872 | 0.3726 | 0.3456 | 0.3328 | 0.3122 | 0.3519 | 0.2851 | 0.3039 | 0.3590 | 0.3601 | 0.3565 | 0.3516 |
+
+
#### Hellaswag
| Model | Average | English | German | Spanish | French | Italian | Portuguese | Russian | Dutch | Arabic | Swedish | Hindi | Hungarian | Romanian | Ukrainian | Danish | Catalan |
|--------------------|---------|---------|--------|---------|--------|---------|------------|---------|--------|--------|---------|--------|-----------|----------|-----------|--------|---------|
- | EuroLLM-1.7B | 0.4744 | 0.6084 | 0.4772 | 0.5310 | 0.5260 | 0.5067 | 0.5206 | 0.4674 | 0.4893 | 0.4075 | 0.4813 | 0.3605 | 0.4067 | 0.4598 | 0.4368 | 0.4700 | 0.4405 |
+ | EuroLLM-1.7B | 0.4744 | 0.4760 | 0.6057 | 0.4793 | 0.5337 | 0.5298 | 0.5085 | 0.5224 | 0.4654 | 0.4949 | 0.4104 | 0.4800 | 0.3655 | 0.4097 | 0.4606 | 0.436 | 0.4702 | 0.4445 |
| TinyLlama-v1.1 |0.3674 | 0.6248 | 0.3650 | 0.4137 | 0.4010 | 0.3780 | 0.3892 | 0.3494 | 0.3588 | 0.2880 | 0.3561 | 0.2841 | 0.3073 | 0.3267 | 0.3349 | 0.3408 | 0.3613 |
| Gemma-2B |0.4666 | 0.7165 | 0.4756 | 0.5414 | 0.5180 | 0.4841 | 0.5081 | 0.4664 | 0.4655 | 0.3868 | 0.4383 | 0.3413 | 0.3710 | 0.4316 | 0.4291 | 0.4471 | 0.4448 |

+
## Bias, Risks, and Limitations

EuroLLM-1.7B-Instruct has not been aligned to human preferences, so the model may generate problematic outputs (e.g., hallucinations, harmful content, or false statements).
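
As context for the translation tables above, here is a minimal sketch of how a single translation request could be sent to the model with the `transformers` library. This is not the evaluation harness behind the reported numbers; the Hub repo id `utter-project/EuroLLM-1.7B-Instruct`, the use of a chat template, and the prompt wording are assumptions not taken from this commit, which only updates the benchmark scores.

```python
# Minimal sketch, assuming the Hub repo id below and that the model ships
# a chat template usable via tokenizer.apply_chat_template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "utter-project/EuroLLM-1.7B-Instruct"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A single en-de request, mirroring one direction from the FLORES/WMT tables.
messages = [{
    "role": "user",
    "content": (
        "Translate the following English source text to German:\n"
        "English: The weather is nice today.\n"
        "German:"
    ),
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Greedy decoding is used here for simplicity; the decoding settings and scoring metric behind the reported numbers are not specified in this diff.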