---
library_name: transformers
license: apache-2.0
datasets:
- abideen/Cosmopedia-100k-pretrain
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---
# πŸš€ BitNet-Llama3 (from 8B to 2B) Transformation & Training
This project converts an 8B-parameter Llama3 model into a 2B-parameter BitNet model by replacing its linear layers with BitLinear layers. The resulting model is trained on a predefined dataset and uploaded to Hugging Face for future use.
---
### Model Description
<!-- Provide a longer summary of what this model is. -->
This is the model card of a πŸ€— transformers model that has been pushed to the Hub. This model card has been automatically generated.
- **Developed by:** [email protected]
- **Funded by [optional]:** ITCL
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Llama3 8B transformed to a BitNet architecture (2B parameters)
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model [optional]:** meta-llama/Llama-3.1-8B-Instruct
### Model Sources [optional]
<!-- Provide the basic links for the model. -->
- **Repository:** ejbejaranos/Bitnet-Llama3-from8BM-now2B
## πŸ“„ Description
This repository includes scripts to:
1. 🎯 Transform a Llama3 model to a BitNet architecture (see the sketch after this list).
2. πŸ’» Train the model using Hugging Face and Weights & Biases.
3. πŸš€ Upload the transformed and trained model to Hugging Face for inference and future use.
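A minimal sketch of steps 1 and 3, assuming access to the gated Llama 3.1 base checkpoint and the `replace_linears_in_hf` helper from this repo's `utils/` scripts; the reduced config values and the target repo name are placeholders, not the exact settings used to produce this 2B checkpoint:
```python
from transformers import LlamaConfig, LlamaForCausalLM
from utils.bitnet_transformation import replace_linears_in_hf

base_id = "meta-llama/Llama-3.1-8B-Instruct"

# Start from the 8B config and shrink it (placeholder values)
config = LlamaConfig.from_pretrained(base_id)
config.num_hidden_layers = 16
config.hidden_size = 2048
config.intermediate_size = 5632
config.num_attention_heads = 16
config.num_key_value_heads = 4

# Build the smaller model and swap every nn.Linear for a BitLinear layer
model = LlamaForCausalLM(config)
replace_linears_in_hf(model)

# Upload the transformed model to the Hub (step 3)
model.push_to_hub("your-username/Bitnet-Llama3-from8BM-now2B", token="YOUR_HF_TOKEN")
```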
---
## βš™οΈ Requirements
- Python 3.8+
- PyTorch 1.10+
- Transformers 4.0+
- Hugging Face Hub API
- Weights & Biases
---
## 🧰 Installation
Make sure you have all required dependencies installed:
```bash
pip install torch transformers datasets wandb huggingface_hub
```
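If you plan to download gated checkpoints or log training runs, you will also need to authenticate with the Hugging Face Hub and Weights & Biases. A minimal sketch (the token strings are placeholders):
```python
from huggingface_hub import login
import wandb

# Authenticate with the Hugging Face Hub and Weights & Biases
login(token="YOUR_HF_TOKEN")
wandb.login(key="YOUR_WANDB_API_KEY")
```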
## πŸ’₯ How to Use
### 1. Using the trained model for inference
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from utils.bitnet_transformation import replace_linears_in_hf

model_id = "ejbejaranos/Bitnet-Llama3-from8BM-now2B"

# Load the BitNet model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    use_auth_token="YOUR_HF_TOKEN"
)

# Swap the standard linear layers for BitLinear layers before running inference
replace_linears_in_hf(model)

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Set up for inference
model.to(device="cuda:0")

prompt = "What is Machine Learning?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
generate_ids = model.generate(inputs.input_ids, max_length=50)
output = tokenizer.batch_decode(
    generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]
print(output)
```
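As a quick sanity check, you can confirm that the loaded model's size matches the ~2B parameters implied by the repo name (reusing the `model` object from the snippet above):
```python
# Total parameter count should be on the order of 2 billion
n_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {n_params / 1e9:.2f}B")
```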
---
## πŸ§‘β€πŸ”¬ Metrics
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6419c2f6b4adb0e101b17b6c/nCE1-KLDWDqSCmPtDMmWa.png)
During training, loss and perplexity are logged to Weights & Biases. The final values for this run were:
- `final_loss`: 1.4
- `final_perplexity`: 4.2
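For reference, perplexity is typically reported as the exponential of the cross-entropy loss, so the two numbers above are roughly consistent (the small gap presumably comes from the exact loss value or evaluation split used):
```python
import math

final_loss = 1.4
print(math.exp(final_loss))  # β‰ˆ 4.06, close to the reported perplexity of ~4.2
```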
---
## 🎯 Future Goals
- Implement additional quantization layers for inference.
- Test the model on different datasets and contexts.
---
## πŸ“’ Contact
If you have questions, suggestions, or improvements, feel free to open an Issue or contact us through [Hugging Face](https://huggingface.co./ejbejaranos).
---
## Environmental Impact
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
## πŸ’‘ Acknowledgments
Thanks to [Hugging Face](https://huggingface.co./) and [Weights & Biases](https://wandb.ai/) for providing support and tools.