---
library_name: transformers
license: apache-2.0
datasets:
- abideen/Cosmopedia-100k-pretrain
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---
|
# BitNet-Llama3 (from 8B to 2B) Transformation & Training
|
|
|
This project transforms an 8B-parameter Llama3 model into a 2B-parameter BitNet architecture by replacing its linear layers with BitLinear layers. The transformed model is then trained on the abideen/Cosmopedia-100k-pretrain dataset and uploaded to Hugging Face for future use.
|
|
|
--- |
|
### Model Description |
|
|
|
|
|
|
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.
|
|
|
- **Developed by:** [email protected] |
|
- **Funded by [optional]:** ITCL |
|
- **Shared by [optional]:** [More Information Needed] |
|
- **Model type:** Llama3 8B transformed to BitNet
|
- **Language(s) (NLP):** English
|
- **License:** apache-2.0

- **Finetuned from model [optional]:** meta-llama/Llama-3.1-8B-Instruct
|
|
|
### Model Sources [optional] |
|
|
|
|
|
|
- **Repository:** ejbejaranos/Bitnet-Llama3-from8BM-now2B |
|
|
|
## Description
|
|
|
This repository includes scripts to:

1. Transform a Llama3 model to a BitNet architecture.

2. Train the model using Hugging Face and Weights & Biases.

3. Upload the transformed and trained model to Hugging Face for inference and future use.
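
To give a rough idea of what a BitLinear layer does, here is a minimal pure-Python sketch. It assumes the ternary "absmean" weight quantization described for BitNet b1.58, which may not match what this repository's transformation script actually implements. The function names `absmean_ternary_quantize` and `bitlinear_forward` are hypothetical and exist only for this illustration; activation quantization is left out to keep the sketch short.

```python
def absmean_ternary_quantize(weights, eps=1e-8):
    """Quantize a 2-D list of floats to values in {-1, 0, 1} plus one scale.

    Assumption: absmean scaling as in BitNet b1.58; the scripts in this
    repository may use a different scheme.
    """
    magnitudes = [abs(w) for row in weights for w in row]
    scale = sum(magnitudes) / len(magnitudes) + eps
    quantized = [
        [max(-1, min(1, round(w / scale))) for w in row]
        for row in weights
    ]
    return quantized, scale


def bitlinear_forward(x, quantized, scale):
    """Compute y = scale * (W_q @ x) for a single input vector x."""
    return [scale * sum(q * xi for q, xi in zip(row, x)) for row in quantized]


# Toy 2x3 weight matrix, illustrative values only
weights = [[0.9, -0.05, -1.2], [0.1, 0.6, -0.4]]
w_q, scale = absmean_ternary_quantize(weights)
print(w_q)
print(bitlinear_forward([1.0, 2.0, 3.0], w_q, scale))
```

Because the quantized weights only take the values -1, 0, and 1, the matrix product reduces to additions and subtractions, with a single floating-point multiplication by the scale at the end.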
|
|
|
--- |
|
|
|
## Requirements
|
|
|
- Python 3.8+ |
|
- PyTorch 1.10+
|
- Transformers 4.0+ |
|
- Hugging Face Hub API |
|
- Weights & Biases |
|
|
|
--- |
|
|
|
## Installation
|
|
|
Make sure you have all required dependencies installed: |
|
|
|
```bash
pip install torch transformers datasets wandb huggingface_hub
```
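
A quick way to confirm that the installation worked is to check that each package can be found by the interpreter. The snippet below is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical and not part of this project.

```python
import importlib.util

# Import names of the packages installed by the pip command above
REQUIRED = ["torch", "transformers", "datasets", "wandb", "huggingface_hub"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```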
|
|
|
## How to Use
|
|
|
1. Using the trained model for inference |
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from utils.bitnet_transformation import replace_linears_in_hf  # helper module from this project

# Load the BitNet model and its tokenizer
model_id = "ejbejaranos/Bitnet-Llama3-from8BM-now2B"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    token="YOUR_HF_TOKEN",  # older transformers releases use use_auth_token instead
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Replace the linear layers with BitNet layers for inference
replace_linears_in_hf(model)

# Set up for inference
model.to(device="cuda:0")
prompt = "What is Machine Learning?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Passing **inputs forwards the attention mask along with the input IDs
generate_ids = model.generate(**inputs, max_length=50)
output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]

print(output)
```
|
|
|
|
|
--- |
|
## Metrics
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6419c2f6b4adb0e101b17b6c/nCE1-KLDWDqSCmPtDMmWa.png) |
|
|
|
The following final metrics were logged to Weights & Biases during training:

- `final_loss`: 1.4

- `final_perplexity`: 4.2
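
The two metrics are related: assuming the loss is the mean per-token cross-entropy in nats (the usual convention for causal language model training, though not confirmed here), perplexity is simply the exponential of the loss. The sketch below shows the conversion in both directions; the function names are hypothetical.

```python
import math


def perplexity_from_loss(loss):
    """Perplexity is exp(loss) when loss is mean cross-entropy in nats."""
    return math.exp(loss)


def loss_from_perplexity(perplexity):
    """Inverse conversion: the loss is the natural log of the perplexity."""
    return math.log(perplexity)


print(round(perplexity_from_loss(1.4), 2))   # perplexity implied by a loss of 1.4
print(round(loss_from_perplexity(4.2), 3))   # loss implied by a perplexity of 4.2
```

A perplexity of 4.2 corresponds to a loss of about 1.435, which is consistent with the reported loss of 1.4 after rounding.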
|
|
|
--- |
|
|
|
## Future Goals
|
|
|
- Implement additional quantization layers for inference. |
|
- Test the model on different datasets and contexts. |
|
|
|
--- |
|
|
|
## Contact
|
|
|
If you have questions, suggestions, or improvements, feel free to open an Issue or contact us through [Hugging Face](https://huggingface.co/ejbejaranos).
|
|
|
--- |
|
|
|
## Environmental Impact |
|
|
|
|
|
|
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). |
|
|
|
- **Hardware Type:** [More Information Needed] |
|
- **Hours used:** [More Information Needed] |
|
- **Cloud Provider:** [More Information Needed] |
|
- **Compute Region:** [More Information Needed] |
|
- **Carbon Emitted:** [More Information Needed] |
|
|
|
|
|
|
## Acknowledgments
|
|
|
Thanks to [Hugging Face](https://huggingface.co/) and [Weights & Biases](https://wandb.ai/) for providing support and tools.