	---
	license: cc-by-nc-sa-4.0
	datasets:
	- SebastianBodza/Ger_WizardLM_evol_instruct_70k_V0
	language:
	- de
	---
	# DElefant-MPT:
	<img src="https://huggingface.co./SebastianBodza/DElefant-MPT/resolve/main/badge_gerlefant.png" style="max-width:200px">
	DElefant is an LLM developed for instruction-tuned German interactions. This version is built on top of the MPT-30B model from <a href="https://huggingface.co./mosaicml/mpt-30b">MosaicML</a> and fine-tuned on a <a href="https://huggingface.co./datasets/SebastianBodza/Ger_WizardLM_evol_instruct_70k_V0">WizardLM</a> dataset that was translated with opus-mt and then filtered. The evolved dataset led to SOTA English LLMs, and we hope that by fine-tuning a base model on the translated dataset we can leverage those capabilities for various tasks in German, including code generation.
	Due to limitations in the translation, the comments inside code blocks remained in English; the code itself, however, was kept in working condition.

	## Model Description:
	QLoRA fine-tuning of the MPT-30B model on two RTX 3090 GPUs with the translated WizardLM dataset.
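
A rough back-of-the-envelope calculation suggests why a quantized approach fits this hardware. The sketch below assumes roughly 30 billion parameters, 24 GB of memory per RTX 3090, and 4-bit quantized base weights (the usual QLoRA setting; the exact configuration is not stated in this card). It counts weights only and ignores activations, adapter weights, and optimizer state.

```python
# Weights-only memory estimate; all figures are approximations.
PARAMS = 30e9            # assumed parameter count of MPT-30B
GPU_MEMORY_GB = 24       # memory of a single RTX 3090
NUM_GPUS = 2             # two GPUs, as described above


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


available = GPU_MEMORY_GB * NUM_GPUS
for bits in (16, 8, 4):
    needed = weight_memory_gb(PARAMS, bits)
    verdict = "fits" if needed < available else "does not fit"
    print(f"{bits:>2}-bit weights: {needed:5.1f} GB -> {verdict} in {available} GB")
```

Under these assumptions, 16-bit weights alone exceed the combined GPU memory, while 4-bit weights leave headroom for activations and the LoRA adapters.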

	## Roadmap:
	If there is sufficient demand, additional adjustments can be made:
	- Natively generated German dataset
	- Full fine-tuning of larger LLMs, e.g. Falcon, Starcoderplus, ...

	## How to use:
	Prompt-Template:
	```
	{instruction}\n\n### Response:
	```
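
The template can be filled in with a small helper. This is a minimal sketch; `build_prompt` and `PROMPT_TEMPLATE` are hypothetical names and not part of the model repository, and stripping surrounding whitespace from the instruction is an added assumption.

```python
PROMPT_TEMPLATE = "{instruction}\n\n### Response:"


def build_prompt(instruction: str) -> str:
    """Insert the user instruction into the prompt template."""
    # Stripping whitespace is an assumption, not required by the template.
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


print(build_prompt("Wie heißt der Bundeskanzler?"))
```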
	Code example for inference:
	```python
	import torch
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Load the base model and tokenizer.
	# Depending on your transformers version, mosaicml/mpt-30b may also
	# need trust_remote_code=True in both from_pretrained calls.
	tokenizer = AutoTokenizer.from_pretrained(
	    "mosaicml/mpt-30b", padding_side="right", use_fast=True
	)
	model = AutoModelForCausalLM.from_pretrained(
	    "mosaicml/mpt-30b", device_map="auto", load_in_8bit=True
	)

	# Attach the LoRA adapter weights to the base model.
	model = PeftModel.from_pretrained(model, "SebastianBodza/DElefant-MPT", device_map={"": 0})
	model.eval()

	frage = "Wie heißt der Bundeskanzler?"  # "What is the name of the Federal Chancellor?"
	prompt = f"{frage}\n\n### Response:"

	inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
	with torch.no_grad():  # inference only, no gradients needed
	    output_ids = model.generate(
	        **inputs, max_new_tokens=256, eos_token_id=tokenizer.eos_token_id
	    )
	print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
	```
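
Because `generate` returns the prompt tokens followed by the newly generated tokens, the decoded string still contains the instruction. A minimal sketch of a hypothetical helper, `extract_response`, that keeps only the text after the `### Response:` marker (the example string is made up for illustration and is not real model output):

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker.

    Falls back to the whole (stripped) text if the marker is missing.
    """
    _, marker, answer = decoded.partition(RESPONSE_MARKER)
    return answer.strip() if marker else decoded.strip()


# Made-up decoded string, used only to show the call.
decoded = "Beispielfrage?\n\n### Response: Beispielantwort"
print(extract_response(decoded))
```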
	## Limitations:
	Gradient accumulation led to divergence after a couple of steps. We therefore reduced the block size to 1024 and used two RTX 3090 GPUs to reach a batch size (BS) of 4, which is probably too small for the model to generalize well.
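
To make the batch-size constraint concrete: the effective batch size per optimizer step is the product of the per-device batch size, the number of GPUs, and the number of gradient-accumulation steps. The sketch below assumes a per-device batch size of 2 (inferred from a BS of 4 on two GPUs, not stated explicitly) and no accumulation; the function name is hypothetical.

```python
def effective_batch_size(per_device: int, num_gpus: int, grad_accum_steps: int = 1) -> int:
    """Number of sequences contributing to one optimizer step."""
    return per_device * num_gpus * grad_accum_steps


BLOCK_SIZE = 1024  # maximum sequence length in tokens, as stated above

# per_device=2 is an assumption; grad_accum_steps=1 means no accumulation.
batch = effective_batch_size(per_device=2, num_gpus=2)
max_tokens_per_step = batch * BLOCK_SIZE  # upper bound, reached only if every sequence is full
print(batch, max_tokens_per_step)
```

With accumulation unavailable, the only ways to raise this number are a larger per-device batch or more GPUs, both of which are limited by memory here.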
	## Training:
	Training was based on Llama-X with the adaptations from WizardLM's training script and additional adjustments for QLoRA tuning. The MPT code is from <a href="https://huggingface.co./SebastianBodza/mpt-30B-qlora-multi_GPU">SebastianBodza/mpt-30B-qlora-multi_GPU</a>.

	<img src="https://huggingface.co./SebastianBodza/DElefant-MPT/resolve/main/train_loss_DElefant.svg" style="max-width:350px">