---
library_name: transformers
license: llama3
datasets:
- VTSNLP/vietnamese_curated_dataset
language:
- vi
- en
base_model:
- meta-llama/Meta-Llama-3-8B
pipeline_tag: text-generation
---

# Model Information

## Model Details

### Model Description

Llama3-ViettelSolutions-8B is a variant of the Meta Llama-3-8B model. It was produced by continued pre-training on the [Vietnamese curated dataset](https://huggingface.co/datasets/VTSNLP/vietnamese_curated_dataset), followed by supervised fine-tuning on 5 million samples of Vietnamese instruction data.

- **Developed by:** Viettel Solutions
- **Funded by:** NVIDIA
- **Model type:** Autoregressive transformer model
- **Language(s) (NLP):** Vietnamese, English
- **License:** Llama 3 Community License
- **Finetuned from model:** meta-llama/Meta-Llama-3-8B

## Uses

Example snippet for usage with Transformers:

```python
import torch
import transformers

model_id = "VTSNLP/Llama3-ViettelSolutions-8B"

# Load the model in bfloat16 and let Accelerate place it on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# "Xin chào!" is Vietnamese for "Hello!".
outputs = pipeline("Xin chào!")
print(outputs[0]["generated_text"])
```

## Training Details

### Training Data

- Dataset for continued pre-training: [Vietnamese curated dataset](https://huggingface.co/datasets/VTSNLP/vietnamese_curated_dataset)
- Dataset for supervised fine-tuning: [Instruct general dataset](https://huggingface.co/datasets/VTSNLP/instruct_general_dataset)

### Training Procedure

#### Preprocessing

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** bf16 mixed precision
- **Data sequence length:** 8192
- **Tensor model parallel size:** 4
- **Pipeline model parallel size:** 1
- **Context parallel size:** 1
- **Micro batch size:** 1
- **Global batch size:** 512

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

[More Information Needed]

## Technical Specifications

- Compute Infrastructure: NVIDIA DGX
- Hardware: 4 x A100 80GB
- Software: [NeMo Framework](https://github.com/NVIDIA/NeMo)

## Citation

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## More Information

[More Information Needed]

## Model Card Authors

[More Information Needed]

## Model Card Contact

[More Information Needed]
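
## Appendix: Batch Size Arithmetic

The training hyperparameters and hardware listed above imply a data-parallel size and a number of gradient accumulation steps, which the card does not state. The sketch below derives them using the usual Megatron-style relationships (data-parallel size = GPUs / (tensor x pipeline x context parallel size); accumulation steps = global batch / (micro batch x data-parallel size)). It assumes the run used exactly the 4 GPUs listed under Hardware and that NeMo applied these standard relationships; the `ParallelConfig` class and its field names are hypothetical and are not part of NeMo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelConfig:
    """Hypothetical helper holding the values reported in this model card."""

    world_size: int          # total GPUs; ASSUMED to be the 4 x A100 listed under Hardware
    tensor_parallel: int     # tensor model parallel size
    pipeline_parallel: int   # pipeline model parallel size
    context_parallel: int    # context parallel size
    micro_batch_size: int
    global_batch_size: int
    seq_length: int

    @property
    def data_parallel(self) -> int:
        # GPUs consumed by one model replica.
        model_parallel = (
            self.tensor_parallel * self.pipeline_parallel * self.context_parallel
        )
        if self.world_size % model_parallel != 0:
            raise ValueError("world_size must be divisible by TP * PP * CP")
        return self.world_size // model_parallel

    @property
    def grad_accum_steps(self) -> int:
        # Samples processed per forward/backward pass across all replicas.
        samples_per_pass = self.micro_batch_size * self.data_parallel
        if self.global_batch_size % samples_per_pass != 0:
            raise ValueError("global batch must be divisible by micro batch * DP")
        return self.global_batch_size // samples_per_pass

    @property
    def max_tokens_per_step(self) -> int:
        # Upper bound: every sample filled to the full sequence length.
        return self.global_batch_size * self.seq_length


cfg = ParallelConfig(
    world_size=4,
    tensor_parallel=4,
    pipeline_parallel=1,
    context_parallel=1,
    micro_batch_size=1,
    global_batch_size=512,
    seq_length=8192,
)

print("data-parallel size:", cfg.data_parallel)
print("gradient accumulation steps:", cfg.grad_accum_steps)
print("max tokens per optimizer step:", cfg.max_tokens_per_step)
```

Under these assumptions, all 4 GPUs hold a single tensor-parallel replica (data-parallel size 1), so each optimizer step accumulates 512 micro-batches and covers at most 512 x 8192 = 4,194,304 tokens. If the actual run used more GPUs than the card lists, the data-parallel size would be larger and the accumulation count correspondingly smaller.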