aashish1904 committed
Commit 5c629e9
1 Parent(s): b5b91a5

Upload README.md with huggingface_hub

Files changed (1): README.md (+270 -0)

---
language:
- fr
- en
license: mit
library_name: transformers
tags:
- french
- chocolatine
datasets:
- jpacifico/french-orca-dpo-pairs-revised
pipeline_tag: text-generation
model-index:
- name: Chocolatine-14B-Instruct-DPO-v1.2
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 68.52
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jpacifico/Chocolatine-14B-Instruct-DPO-v1.2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 49.85
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jpacifico/Chocolatine-14B-Instruct-DPO-v1.2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 17.98
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jpacifico/Chocolatine-14B-Instruct-DPO-v1.2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 10.07
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jpacifico/Chocolatine-14B-Instruct-DPO-v1.2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 12.35
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jpacifico/Chocolatine-14B-Instruct-DPO-v1.2
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 41.07
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jpacifico/Chocolatine-14B-Instruct-DPO-v1.2
      name: Open LLM Leaderboard
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/Chocolatine-14B-Instruct-DPO-v1.2-GGUF
This is a quantized version of [jpacifico/Chocolatine-14B-Instruct-DPO-v1.2](https://huggingface.co/jpacifico/Chocolatine-14B-Instruct-DPO-v1.2) created using llama.cpp.
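
To run one of these GGUF files locally, the following is a minimal sketch using `llama-cpp-python`; it is not part of the original card, and the quant filename pattern, context size, and sampling settings are assumptions you should adapt to your setup.

```python
# Hypothetical example: load a GGUF quant of this repo with llama-cpp-python
# (pip install llama-cpp-python huggingface_hub). The filename pattern is an
# assumption; pick an actual .gguf file from the repository's file list.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="QuantFactory/Chocolatine-14B-Instruct-DPO-v1.2-GGUF",
    filename="*Q4_K_M.gguf",  # assumed quant level
    n_ctx=4096,               # matches the model's 4k context window
    n_gpu_layers=-1,          # offload all layers if a GPU build is installed
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant chatbot."},
        {"role": "user", "content": "What is a Large Language Model?"},
    ],
    max_tokens=200,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```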

# Original Model Card

### Chocolatine-14B-Instruct-DPO-v1.2

A DPO fine-tune of [microsoft/Phi-3-medium-4k-instruct](https://huggingface.co/microsoft/Phi-3-medium-4k-instruct) (14B parameters)
using the [jpacifico/french-orca-dpo-pairs-revised](https://huggingface.co/datasets/jpacifico/french-orca-dpo-pairs-revised) RLHF dataset.
Training in French also improves the model in English, surpassing the performance of its base model.
Context window: 4k tokens.

* **4-bit quantized version** available here: [jpacifico/Chocolatine-14B-Instruct-DPO-v1.2-Q4_K_M-GGUF](https://huggingface.co/jpacifico/Chocolatine-14B-Instruct-DPO-v1.2-Q4_K_M-GGUF)
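
The DPO stage described above can be sketched with TRL's `DPOTrainer`. The snippet below is hypothetical, not the author's training code: the column mapping, hyperparameters, and full-precision model load are assumptions (a real 14B run would typically add LoRA/QLoRA and multi-GPU settings), and it assumes a recent TRL release.

```python
# Hypothetical sketch of a DPO fine-tune with TRL (not the author's actual code).
# Assumes the dataset exposes question/chosen/rejected columns like the original
# orca_dpo_pairs format -- check the dataset card before running.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_model = "microsoft/Phi-3-medium-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

dataset = load_dataset("jpacifico/french-orca-dpo-pairs-revised", split="train")
dataset = dataset.rename_column("question", "prompt")  # DPOTrainer expects prompt/chosen/rejected

config = DPOConfig(
    output_dir="chocolatine-14b-dpo",
    beta=0.1,                        # placeholder DPO temperature
    learning_rate=5e-6,              # placeholder
    per_device_train_batch_size=1,   # placeholder
    gradient_accumulation_steps=8,   # placeholder
    max_prompt_length=512,
    max_length=1024,
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,      # named `tokenizer=` in older TRL releases
)
trainer.train()
```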

### OpenLLM Leaderboard

Chocolatine is the best-performing model in its size category (13B) on the [OpenLLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) (last updated: 2024/10/18).

![image/png](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Assets/chocolatine_14B_leaderboard_20240901.png?raw=false)

| Metric     | Value     |
|------------|----------:|
| **Avg.**   | **33.30** |
| IFEval     | 68.52     |
| BBH        | 49.85     |
| MATH Lvl 5 | 17.98     |
| GPQA       | 10.07     |
| MuSR       | 12.35     |
| MMLU-PRO   | 41.07     |

### MT-Bench-French

Chocolatine-14B-Instruct-DPO-v1.2 outperforms its previous versions and its base model Phi-3-medium-4k-instruct on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french), evaluated with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench) and GPT-4-Turbo as the LLM judge.

```
########## First turn ##########
model                                 turn  score
gpt-4o-mini                           1     9.2875
Chocolatine-14B-Instruct-4k-DPO       1     8.6375
Chocolatine-14B-Instruct-DPO-v1.2     1     8.6125
Phi-3.5-mini-instruct                 1     8.5250
Chocolatine-3B-Instruct-DPO-v1.2      1     8.3750
Phi-3-medium-4k-instruct              1     8.2250
gpt-3.5-turbo                         1     8.1375
Chocolatine-3B-Instruct-DPO-Revised   1     7.9875
Daredevil-8B                          1     7.8875
Meta-Llama-3.1-8B-Instruct            1     7.0500
vigostral-7b-chat                     1     6.7875
Mistral-7B-Instruct-v0.3              1     6.7500
gemma-2-2b-it                         1     6.4500
French-Alpaca-7B-Instruct_beta        1     5.6875
vigogne-2-7b-chat                     1     5.6625

########## Second turn ##########
model                                 turn  score
gpt-4o-mini                           2     8.912500
Chocolatine-14B-Instruct-DPO-v1.2     2     8.337500
Chocolatine-3B-Instruct-DPO-Revised   2     7.937500
Chocolatine-3B-Instruct-DPO-v1.2      2     7.862500
Phi-3-medium-4k-instruct              2     7.750000
Chocolatine-14B-Instruct-4k-DPO       2     7.737500
gpt-3.5-turbo                         2     7.679167
Phi-3.5-mini-instruct                 2     7.575000
Daredevil-8B                          2     7.087500
Meta-Llama-3.1-8B-Instruct            2     6.787500
Mistral-7B-Instruct-v0.3              2     6.500000
vigostral-7b-chat                     2     6.162500
gemma-2-2b-it                         2     6.100000
French-Alpaca-7B-Instruct_beta        2     5.487395
vigogne-2-7b-chat                     2     2.775000

########## Average ##########
model                                 score
gpt-4o-mini                           9.100000
Chocolatine-14B-Instruct-DPO-v1.2     8.475000
Chocolatine-14B-Instruct-4k-DPO       8.187500
Chocolatine-3B-Instruct-DPO-v1.2      8.118750
Phi-3.5-mini-instruct                 8.050000
Phi-3-medium-4k-instruct              7.987500
Chocolatine-3B-Instruct-DPO-Revised   7.962500
gpt-3.5-turbo                         7.908333
Daredevil-8B                          7.487500
Meta-Llama-3.1-8B-Instruct            6.918750
Mistral-7B-Instruct-v0.3              6.625000
vigostral-7b-chat                     6.475000
gemma-2-2b-it                         6.275000
French-Alpaca-7B-Instruct_beta        5.587866
vigogne-2-7b-chat                     4.218750
```

### Usage

You can run this model using my [Colab notebook](https://github.com/jpacifico/Chocolatine-LLM/blob/main/Chocolatine_14B_inference_test_colab.ipynb).

You can also run Chocolatine using the following code:

```python
import transformers
from transformers import AutoTokenizer

model_name = "jpacifico/Chocolatine-14B-Instruct-DPO-v1.2"

# Format the prompt with the model's chat template
message = [
    {"role": "system", "content": "You are a helpful assistant chatbot."},
    {"role": "user", "content": "What is a Large Language Model?"}
]
tokenizer = AutoTokenizer.from_pretrained(model_name)
prompt = tokenizer.apply_chat_template(message, add_generation_prompt=True, tokenize=False)

# Create the text-generation pipeline
pipeline = transformers.pipeline(
    "text-generation",
    model=model_name,
    tokenizer=tokenizer
)

# Generate text
sequences = pipeline(
    prompt,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=1,
    max_new_tokens=200,
)
print(sequences[0]['generated_text'])
```
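
Since the full-precision 14B checkpoint is large, a common alternative is to load it in 4-bit with bitsandbytes. The sketch below is not from the original card: it assumes a CUDA GPU with the `bitsandbytes` and `accelerate` packages installed, and the quantization settings are illustrative rather than the author's.

```python
# Hypothetical 4-bit loading sketch (not from the original card).
# Requires a CUDA GPU plus the bitsandbytes and accelerate packages.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "jpacifico/Chocolatine-14B-Instruct-DPO-v1.2"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # illustrative settings
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

messages = [{"role": "user", "content": "What is a Large Language Model?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```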

### Limitations

The Chocolatine model is a quick demonstration that a base model can be easily fine-tuned to achieve compelling performance.
It does not have any moderation mechanism.

- **Developed by:** Jonathan Pacifico, 2024
- **Model type:** LLM
- **Language(s) (NLP):** French, English
- **License:** MIT

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_jpacifico__Chocolatine-14B-Instruct-DPO-v1.2).

| Metric              | Value |
|---------------------|------:|
| Avg.                | 33.30 |
| IFEval (0-Shot)     | 68.52 |
| BBH (3-Shot)        | 49.85 |
| MATH Lvl 5 (4-Shot) | 17.98 |
| GPQA (0-shot)       | 10.07 |
| MuSR (0-shot)       | 12.35 |
| MMLU-PRO (5-shot)   | 41.07 |