---
library_name: transformers
license: apache-2.0
datasets:
- abideen/Cosmopedia-100k-pretrain
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---
# πŸš€ BitNet-Llama3 (from 8B to 2B) Transformation & Training
This project converts an 8B-parameter Llama3 model into a 2B-parameter BitNet model by replacing its linear layers with BitLinear layers. The resulting model is trained on a predefined dataset and uploaded to Hugging Face for future use.
---
### Model Description
<!-- Provide a longer summary of what this model is. -->
This is the model card of a πŸ€— transformers model that has been pushed to the Hub. This model card has been automatically generated.
- **Developed by:** [email protected]
- **Funded by [optional]:** ITCL
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Llama3 8B transformed to a BitNet architecture (2B parameters)
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model [optional]:** meta-llama/Llama-3.1-8B-Instruct
### Model Sources [optional]
<!-- Provide the basic links for the model. -->
- **Repository:** ejbejaranos/Bitnet-Llama3-from8BM-now2B
## πŸ“„ Description
This repository includes scripts to:
1. 🎯 Transform a Llama3 model to a BitNet architecture (see the sketch after this list).
2. πŸ’» Train the model using Hugging Face and Weights & Biases.
3. πŸš€ Upload the transformed and trained model to Hugging Face for inference and future use.
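A minimal sketch of steps 1 and 3, assuming access to the gated Llama 3.1 base checkpoint and the `replace_linears_in_hf` helper from this repo's `utils/` scripts; the reduced config values and the target repo name are placeholders, not the exact settings used to produce this 2B checkpoint:
```python
from transformers import LlamaConfig, LlamaForCausalLM
from utils.bitnet_transformation import replace_linears_in_hf

base_id = "meta-llama/Llama-3.1-8B-Instruct"

# Start from the 8B config and shrink it (placeholder values)
config = LlamaConfig.from_pretrained(base_id)
config.num_hidden_layers = 16
config.hidden_size = 2048
config.intermediate_size = 5632
config.num_attention_heads = 16
config.num_key_value_heads = 4

# Build the smaller model and swap every nn.Linear for a BitLinear layer
model = LlamaForCausalLM(config)
replace_linears_in_hf(model)

# Upload the transformed model to the Hub (step 3)
model.push_to_hub("your-username/Bitnet-Llama3-from8BM-now2B", token="YOUR_HF_TOKEN")
```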
---
## βš™οΈ Requirements
- Python 3.8+
- PyTorch 1.10+
- Transformers 4.0+
- Hugging Face Hub API
- Weights & Biases
---
## 🧰 Installation
Make sure you have all required dependencies installed:
```bash
pip install torch transformers datasets wandb huggingface_hub
```
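If you plan to download gated checkpoints or log training runs, you will also need to authenticate with the Hugging Face Hub and Weights & Biases. A minimal sketch (the token strings are placeholders):
```python
from huggingface_hub import login
import wandb

# Authenticate with the Hugging Face Hub and Weights & Biases
login(token="YOUR_HF_TOKEN")
wandb.login(key="YOUR_WANDB_API_KEY")
```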
## πŸ’₯ How to Use
### 1. Using the trained model for inference
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from utils.bitnet_transformation import replace_linears_in_hf

model_id = "ejbejaranos/Bitnet-Llama3-from8BM-now2B"

# Load the BitNet model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    use_auth_token="YOUR_HF_TOKEN"
)

# Swap the standard linear layers for BitLinear layers before running inference
replace_linears_in_hf(model)

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Set up for inference
model.to(device="cuda:0")

prompt = "What is Machine Learning?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
generate_ids = model.generate(inputs.input_ids, max_length=50)
output = tokenizer.batch_decode(
    generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False
)[0]
print(output)
```
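As a quick sanity check, you can confirm that the loaded model's size matches the ~2B parameters implied by the repo name (reusing the `model` object from the snippet above):
```python
# Total parameter count should be on the order of 2 billion
n_params = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {n_params / 1e9:.2f}B")
```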
---
## πŸ§‘β€πŸ”¬ Metrics
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6419c2f6b4adb0e101b17b6c/nCE1-KLDWDqSCmPtDMmWa.png)
During training, loss and perplexity are logged to Weights & Biases. The final values for this run were:
- `final_loss`: 1.4
- `final_perplexity`: 4.2
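For reference, perplexity is typically reported as the exponential of the cross-entropy loss, so the two numbers above are roughly consistent (the small gap presumably comes from the exact loss value or evaluation split used):
```python
import math

final_loss = 1.4
print(math.exp(final_loss))  # β‰ˆ 4.06, close to the reported perplexity of ~4.2
```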
---
## 🎯 Future Goals
- Implement additional quantization layers for inference.
- Test the model on different datasets and contexts.
---
## πŸ“’ Contact
If you have questions, suggestions, or improvements, feel free to open an Issue or contact us through [Hugging Face](https://huggingface.co./ejbejaranos).
---
## Environmental Impact
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]
## πŸ’‘ Acknowledgments
Thanks to [Hugging Face](https://huggingface.co./) and [Weights & Biases](https://wandb.ai/) for providing support and tools.