Pham Van Ngoan committed on
Commit
6682139
1 Parent(s): c4401b6

Update README.md

Files changed (1)
  1. README.md +6 -19
README.md CHANGED
@@ -19,31 +19,18 @@ tags:
19
  This model is intended for researchers, developers, and enthusiasts who are interested in understanding the performance of the Llama 2 model on Vietnamese. It can be used for generating Vietnamese text based on given instructions or for any other task that requires a Vietnamese language model.
20
 
21
  ## Limitations
22
- Data Size: The model was fine-tuned on a relatively small dataset of 20,000 instruction samples, which might not capture the full complexity and nuances of the Vietnamese language.
23
- Preliminary Model: This is an initial experiment with the Llama 2 architecture on Vietnamese. More refined versions and evaluations will be available soon.
24
- Performance
25
  Specific performance metrics on this fine-tuned model will be provided in the upcoming comprehensive evaluation.
26
 
27
  ## Ethical Considerations
28
- Bias and Fairness: Like any other machine learning model, there is a possibility that this model might reproduce or amplify biases present in the training data.
29
- Use in Critical Systems: As this is a preliminary model, it is recommended not to use it for mission-critical applications without proper validation.
30
- Fine-tuning Data
31
  The model was fine-tuned on a custom dataset of 20,000 instruction samples in Vietnamese. More details about the composition and source of this dataset will be provided in the detailed evaluation report.
32
 
33
- ## Usage
34
- To use this model via the Hugging Face API:
35
 
36
- ```python
37
- from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
38
-
39
- tokenizer = AutoTokenizer.from_pretrained("ngoantech/Llama-2-7b-vi-sample")
40
- model = AutoModelForSeq2SeqLM.from_pretrained("ngoantech/Llama-2-7b-vi-sample")
41
-
42
- inputs = tokenizer.encode("YOUR INSTRUCTION HERE", return_tensors="pt")
43
- outputs = model.generate(inputs)
44
- decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
45
- print(decoded)
46
- ```
47
 
48
  ## Credits
49
  I would like to express our gratitude to the creators of the Llama 2 architecture and the Hugging Face community for their tools and resources.
 
19
  This model is intended for researchers, developers, and enthusiasts who are interested in understanding the performance of the Llama 2 model on Vietnamese. It can be used for generating Vietnamese text based on given instructions or for any other task that requires a Vietnamese language model.
20
 
21
  ## Limitations
22
+ - Data Size: The model was fine-tuned on a relatively small dataset of 20,000 instruction samples, which might not capture the full complexity and nuances of the Vietnamese language.
23
+ - Preliminary Model: This is an initial experiment with the Llama 2 architecture on Vietnamese. More refined versions and evaluations will be available soon.
24
+ - Performance:
25
  Specific performance metrics on this fine-tuned model will be provided in the upcoming comprehensive evaluation.
26
 
27
  ## Ethical Considerations
28
+ - Bias and Fairness: Like any other machine learning model, there is a possibility that this model might reproduce or amplify biases present in the training data.
29
+ - Use in Critical Systems: As this is a preliminary model, it is recommended not to use it for mission-critical applications without proper validation.
30
+ - Fine-tuning Data:
31
  The model was fine-tuned on a custom dataset of 20,000 instruction samples in Vietnamese. More details about the composition and source of this dataset will be provided in the detailed evaluation report.
32
 
33
 
34
 
35
  ## Credits
36
  I would like to express our gratitude to the creators of the Llama 2 architecture and the Hugging Face community for their tools and resources.