rhaymison committed on
Commit
8ebebd2
·
verified ·
1 Parent(s): d93999b

Update README.md

  1. README.md +84 -173
README.md CHANGED
@@ -150,200 +150,97 @@ model-index:
150
  name: Open Portuguese LLM Leaderboard
151
  ---
152
 
153
- # Model Card for Model ID
154
 
155
- <!-- Provide a quick summary of what the model is/does. -->
 
 
156
 
157
 
 
 
158
 
159
- ## Model Details
160
 
161
- ### Model Description
 
162
 
163
- <!-- Provide a longer summary of what this model is. -->
164
 
165
- This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
 
 
166
 
167
- - **Developed by:** [More Information Needed]
168
- - **Funded by [optional]:** [More Information Needed]
169
- - **Shared by [optional]:** [More Information Needed]
170
- - **Model type:** [More Information Needed]
171
- - **Language(s) (NLP):** [More Information Needed]
172
- - **License:** [More Information Needed]
173
- - **Finetuned from model [optional]:** [More Information Needed]
174
 
175
- ### Model Sources [optional]
 
 
 
176
 
177
- <!-- Provide the basic links for the model. -->
 
 
 
178
 
179
- - **Repository:** [More Information Needed]
180
- - **Paper [optional]:** [More Information Needed]
181
- - **Demo [optional]:** [More Information Needed]
182
 
183
- ## Uses
 
184
 
185
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 
 
 
 
 
 
 
 
 
 
 
 
186
 
187
- ### Direct Use
188
 
189
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
 
 
 
 
 
 
 
190
 
191
- [More Information Needed]
 
 
192
 
193
- ### Downstream Use [optional]
 
 
194
 
195
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
196
 
197
- [More Information Needed]
 
 
 
 
 
 
 
 
198
 
199
- ### Out-of-Scope Use
 
 
 
 
200
 
201
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
202
-
203
- [More Information Needed]
204
-
205
- ## Bias, Risks, and Limitations
206
-
207
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
208
-
209
- [More Information Needed]
210
-
211
- ### Recommendations
212
-
213
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
214
-
215
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
216
-
217
- ## How to Get Started with the Model
218
-
219
- Use the code below to get started with the model.
220
-
221
- [More Information Needed]
222
-
223
- ## Training Details
224
-
225
- ### Training Data
226
-
227
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
228
-
229
- [More Information Needed]
230
-
231
- ### Training Procedure
232
-
233
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
234
-
235
- #### Preprocessing [optional]
236
-
237
- [More Information Needed]
238
-
239
-
240
- #### Training Hyperparameters
241
-
242
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
243
-
244
- #### Speeds, Sizes, Times [optional]
245
-
246
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
247
-
248
- [More Information Needed]
249
-
250
- ## Evaluation
251
-
252
- <!-- This section describes the evaluation protocols and provides the results. -->
253
-
254
- ### Testing Data, Factors & Metrics
255
-
256
- #### Testing Data
257
-
258
- <!-- This should link to a Dataset Card if possible. -->
259
-
260
- [More Information Needed]
261
-
262
- #### Factors
263
-
264
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
265
-
266
- [More Information Needed]
267
-
268
- #### Metrics
269
-
270
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
271
-
272
- [More Information Needed]
273
-
274
- ### Results
275
-
276
- [More Information Needed]
277
-
278
- #### Summary
279
-
280
-
281
-
282
- ## Model Examination [optional]
283
-
284
- <!-- Relevant interpretability work for the model goes here -->
285
-
286
- [More Information Needed]
287
-
288
- ## Environmental Impact
289
-
290
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
291
-
292
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
293
-
294
- - **Hardware Type:** [More Information Needed]
295
- - **Hours used:** [More Information Needed]
296
- - **Cloud Provider:** [More Information Needed]
297
- - **Compute Region:** [More Information Needed]
298
- - **Carbon Emitted:** [More Information Needed]
299
-
300
- ## Technical Specifications [optional]
301
-
302
- ### Model Architecture and Objective
303
-
304
- [More Information Needed]
305
-
306
- ### Compute Infrastructure
307
-
308
- [More Information Needed]
309
-
310
- #### Hardware
311
-
312
- [More Information Needed]
313
-
314
- #### Software
315
-
316
- [More Information Needed]
317
-
318
- ## Citation [optional]
319
-
320
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
321
-
322
- **BibTeX:**
323
-
324
- [More Information Needed]
325
-
326
- **APA:**
327
-
328
- [More Information Needed]
329
-
330
- ## Glossary [optional]
331
-
332
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
333
-
334
- [More Information Needed]
335
-
336
- ## More Information [optional]
337
-
338
- [More Information Needed]
339
-
340
- ## Model Card Authors [optional]
341
-
342
- [More Information Needed]
343
-
344
- ## Model Card Contact
345
-
346
- [More Information Needed]
347
 
348
 
349
  # Open Portuguese LLM Leaderboard Evaluation Results
@@ -363,3 +260,17 @@ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-
363
  |PT Hate Speech Binary | 65.76|
364
  |tweetSentBR | 53.32|
365
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
150
  name: Open Portuguese LLM Leaderboard
151
  ---
152
 
153
+ # Phi-3-portuguese-tom-cat-128k-instruct
154
 
155
+ <p align="center">
156
+ <img src="https://raw.githubusercontent.com/rhaymisonbetini/huggphotos/main/tom-cat.webp" width="50%" style="margin-left:auto; margin-right:auto; display:block"/>
157
+ </p>
158
 
159
 
160
+ This model was trained on a superset of 300,000 instructions in Portuguese.
161
+ It aims to help fill the gap in Portuguese-language models. It was fine-tuned from microsoft/Phi-3-mini-4k.
162
 
163
+ # How to use
164
 
169
 
170
+ ### FULL MODEL: A100
171
+ ### HALF MODEL: L4
172
+ ### 8-bit or 4-bit: T4 or V100
173
 
174
+ You can run the model at full precision or quantized down to 4 bits. Both approaches are shown below.
175
+ Remember that verbs matter in your prompt: tell the model how to act or behave so that you guide it toward the response you want.
176
+ Details like these help models (even smaller ones, around 4B parameters) perform much better.
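To illustrate the point about verbs, here is a minimal sketch that wraps two versions of the same request in the chat markers this README uses further down. The `build_prompt` helper, the `SYSTEM` text, and both example requests are hypothetical and only serve the illustration; they are not part of the model's API.

```python
def build_prompt(instruction: str, system_prompt: str) -> str:
    """Wrap an instruction in the system/user/assistant markers used in this README."""
    return (
        "<s><|system|>\n"
        f"{system_prompt}\n"
        "<|user|>\n"
        f"{instruction}\n"
        "<|assistant|>\n"
    )


# Hypothetical system prompt that tells the model how to behave.
# English: "You are a geography assistant. Answer objectively."
SYSTEM = "Você é um assistente de geografia. Responda de forma objetiva."

# Vague request: no verb, so the model has to guess what is wanted.
vague = build_prompt("Carro Estados Unidos Japão", SYSTEM)

# Directive request: the verb "Explique" (explain) and a length limit guide the answer.
directive = build_prompt(
    "Explique, em até três frases, se é possível ir de carro dos Estados Unidos até o Japão.",
    SYSTEM,
)

print(directive)
```

Both strings have the same structure; only the instruction differs, which makes it easy to compare the two responses side by side.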
 
 
 
 
177
 
178
+ ```python
+ # Notebook (Colab/Jupyter) shell commands; drop the leading "!" in a regular terminal.
+ !pip install -q -U transformers
+ !pip install -q -U accelerate
+ !pip install -q -U bitsandbytes
+
+ from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
+
+ # device_map={"": 0} places the whole model on GPU 0.
+ model = AutoModelForCausalLM.from_pretrained("rhaymison/phi-3-portuguese-tom-cat-4k-instruct", device_map={"": 0})
+ tokenizer = AutoTokenizer.from_pretrained("rhaymison/phi-3-portuguese-tom-cat-4k-instruct")
+ model.eval()  # inference mode (disables dropout)
+ ```
 
 
189
 
190
+ You can also use the model with a `pipeline`.
191
+ ```python
+ from transformers import pipeline
+
+ # Reuses the `model` and `tokenizer` loaded above.
+ pipe = pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     do_sample=True,
+     max_new_tokens=512,
+     num_beams=2,
+     temperature=0.3,
+     top_k=50,
+     top_p=0.95,
+     early_stopping=True,
+     pad_token_id=tokenizer.eos_token_id,
+ )
+
+
+ def format_template(question: str) -> str:
+     # English: "Below is an instruction that describes a task, together with an input
+     # that provides more context. Write a response that appropriately completes the request."
+     system_prompt = "Abaixo está uma instrução que descreve uma tarefa, juntamente com uma entrada que fornece mais contexto. Escreva uma resposta que complete adequadamente o pedido."
+     return (
+         "<s><|system|>\n"
+         f"{system_prompt}\n"
+         "<|user|>\n"
+         f"{question}\n"
+         "<|assistant|>\n"
+     )
+
+
+ # English: "Is it possible to drive from the United States to Japan?"
+ question = format_template("É possível ir de carro dos Estados Unidos até o Japão?")
+ pipe(question)
+ ```
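A text-generation pipeline returns a list of dictionaries with a `generated_text` key, and by default that text includes the prompt. The sketch below shows one way to keep only the model's answer, assuming the output still contains the `<|assistant|>` marker from the template. The `extract_answer` helper and the sample output are hypothetical, not produced by the model.

```python
ASSISTANT_MARKER = "<|assistant|>"


def extract_answer(generated_text: str) -> str:
    """Return the text after the last assistant marker, without surrounding whitespace."""
    _, found, answer = generated_text.rpartition(ASSISTANT_MARKER)
    # If the marker is missing, fall back to the full text.
    return (answer if found else generated_text).strip()


# Hypothetical pipeline output, shaped like [{"generated_text": "..."}].
fake_output = [
    {
        "generated_text": "<s><|system|>\nsistema\n<|user|>\npergunta\n<|assistant|>\nresposta de exemplo\n"
    }
]

answer = extract_answer(fake_output[0]["generated_text"])
print(answer)
```

Alternatively, passing `return_full_text=False` when calling the pipeline should make it return only the newly generated text, which avoids the string handling altogether.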
220
 
221
+ If you run into memory errors such as "CUDA out of memory", use 4-bit or 8-bit quantization.
222
+ Running the full model in Colab requires an A100.
223
+ With 4-bit or 8-bit quantization, a T4 or L4 is sufficient.
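As a rough guide to why quantization helps, the sketch below estimates how much memory the weights alone occupy at different precisions. The parameter count of 3.8 billion is an assumption for a Phi-3-mini-sized model, and the estimate ignores activations, the KV cache, and CUDA overhead, so real usage will be higher.

```python
# Assumed parameter count for a Phi-3-mini-sized model (not stated in this README).
PARAMS = 3_800_000_000


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for label, bits in [("fp32", 32), ("fp16/bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>9}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

Each halving of the bits per parameter halves the weight footprint, which is why smaller GPUs become usable once the model is quantized.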
224
 
225
+ # 4-bit example
226
 
227
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+
+ # Same checkpoint as in the full-precision example above.
+ base_model = "rhaymison/phi-3-portuguese-tom-cat-4k-instruct"
+
+ # NF4 quantization with double quantization; computation runs in bfloat16.
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+     bnb_4bit_use_double_quant=True,
+ )
+
+ model = AutoModelForCausalLM.from_pretrained(
+     base_model,
+     quantization_config=bnb_config,
+     device_map={"": 0},  # place the whole model on GPU 0
+ )
+ ```
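The README mentions 8-bit quantization but only shows a 4-bit configuration. The sketch below builds the keyword arguments for either mode so they can be passed to `BitsAndBytesConfig`. The `quantization_kwargs` helper is hypothetical; the keys themselves are existing `BitsAndBytesConfig` parameters, and the 4-bit values mirror the example above (the compute dtype is left out so the helper does not depend on `torch`).

```python
def quantization_kwargs(bits: int) -> dict:
    """Return BitsAndBytesConfig keyword arguments for 8-bit or 4-bit loading."""
    if bits == 8:
        return {"load_in_8bit": True}
    if bits == 4:
        return {
            "load_in_4bit": True,
            "bnb_4bit_quant_type": "nf4",
            "bnb_4bit_use_double_quant": True,
        }
    raise ValueError(f"Unsupported bit width: {bits} (expected 4 or 8)")


print(quantization_kwargs(8))

# Usage sketch (needs transformers, bitsandbytes, and a CUDA GPU):
# from transformers import BitsAndBytesConfig
# bnb_config = BitsAndBytesConfig(**quantization_kwargs(8))
```

Keeping the choice in one function makes it simple to switch between 8-bit and 4-bit loading depending on the GPU at hand.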
244
 
245
 
246
  # Open Portuguese LLM Leaderboard Evaluation Results
 
260
  |PT Hate Speech Binary | 65.76|
261
  |tweetSentBR | 53.32|
262
 
263
+
264
+ ### Comments
265
+
266
+ Ideas, help, and bug reports are always welcome.
267
+
268
269
+
270
+ <div style="display:flex; flex-direction:row; justify-content:left">
271
+ <a href="https://www.linkedin.com/in/rhaymison-cristian-betini-2b3016175/" target="_blank">
272
+ <img src="https://img.shields.io/badge/LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white">
273
+ </a>
274
+ <a href="https://github.com/rhaymisonbetini" target="_blank">
275
+ <img src="https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white">
276
+ </a>