---
license: cc-by-nc-sa-4.0
datasets:
- SebastianBodza/Ger_WizardLM_evol_instruct_70k_V0
language:
- de
---

# DElefant-MPT:

<img src="https://huggingface.co./SebastianBodza/DElefant-MPT/resolve/main/badge_gerlefant.png" style="max-width:200px">

DElefant is an LLM developed for instruction-tuned German interactions. This version is built on top of the MPT-30B model from <a href="https://huggingface.co./mosaicml/mpt-30b">MosaicML</a> and fine-tuned on a <a href="https://huggingface.co./datasets/SebastianBodza/Ger_WizardLM_evol_instruct_70k_V0">WizardLM</a> dataset that was translated with opus-mt and filtered afterwards. The evolved dataset led to SOTA English LLMs, and we hope that by incorporating the translated dataset into a base model we can leverage these capabilities for various tasks in German, including code generation.

Due to limitations in the translation, the comments inside the code blocks remained in English; the code itself, however, was kept in working condition.

## Model Description:

QLoRA fine-tuning of the MPT-30B model on two RTX 3090 GPUs with the translated WizardLM dataset.

## Roadmap:

If there is sufficient demand, additional adjustments can be made:

- A natively generated German dataset
- Full fine-tuning of larger LLMs, e.g. Falcon, Starcoderplus, ...

## How to use:

Prompt-Template:

```
{instruction}\n\n### Response:
```
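
The `\n\n` in the template stands for two literal newline characters between the instruction and the response marker. A minimal sketch of filling the template in Python; the helper name `build_prompt` and the constant `RESPONSE_MARKER` are hypothetical and not part of the model repository:

```python
# Hypothetical helper for illustration; not part of the DElefant repository.
RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str) -> str:
    """Fill the prompt template: instruction, one blank line, response marker."""
    return f"{instruction}\n\n{RESPONSE_MARKER}"


prompt = build_prompt("Wie heißt der Bundeskanzler?")
print(repr(prompt))
```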

Code example for inference:

```
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the PEFT config of the fine-tuned adapter checkpoint
config = PeftConfig.from_pretrained("SebastianBodza/DElefant-MPT")

# Load the base LLM and its tokenizer
tokenizer = AutoTokenizer.from_pretrained(
    "mosaicml/mpt-30b",
    padding_side="right",
    use_fast=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-30b",
    device_map="auto",
    load_in_8bit=True,
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "SebastianBodza/DElefant-MPT", device_map={"": 0})
model.eval()

# Build the prompt from the template above
frage = "Wie heißt der Bundeskanzler?"
prompt = f"{frage}\n\n### Response:"

# Tokenize, generate without tracking gradients, and decode
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        eos_token_id=tokenizer.eos_token_id,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
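
For decoder-only models such as MPT, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded string begins with the instruction. A minimal sketch of isolating the answer by splitting on the response marker; `extract_response` is a hypothetical helper, and the sample string is made up for illustration, not real model output:

```python
# Hypothetical helper for illustration; not part of the DElefant repository.
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker.

    Falls back to the whole (stripped) text if the marker is missing.
    """
    _, found, answer = decoded.partition(RESPONSE_MARKER)
    return answer.strip() if found else decoded.strip()


# Made-up decoded string standing in for tokenizer.decode(...) output.
decoded = "Wie heißt der Bundeskanzler?\n\n### Response: Beispielantwort"
print(extract_response(decoded))
```

Splitting on the marker rather than slicing by prompt length avoids relying on the decoded text reproducing the prompt character for character.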

## Limitations:

Gradient accumulation led to divergence after a couple of steps. We therefore reduced the block size to 1024 and used two RTX 3090 GPUs to reach a batch size of 4, which is probably too small for the model to generalize well.
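
To give a sense of scale, the number of tokens contributing to one optimizer step is at most batch size × block size × gradient accumulation steps. A sketch of that arithmetic, assuming that training ran without gradient accumulation (one step) and that every sequence fills the full block; the comparison with 8 accumulation steps is purely hypothetical:

```python
def tokens_per_step(batch_size: int, block_size: int, grad_accum_steps: int = 1) -> int:
    """Upper bound on the tokens that contribute to one optimizer step."""
    return batch_size * block_size * grad_accum_steps


# Values from the text: batch size 4, block size 1024.
# Assumption: no gradient accumulation, i.e. one accumulation step.
used = tokens_per_step(batch_size=4, block_size=1024)

# Hypothetical comparison: the same setup with 8 accumulation steps.
with_accum = tokens_per_step(batch_size=4, block_size=1024, grad_accum_steps=8)

print(used, with_accum)  # 4096 32768
```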

## Training:

Training was based on Llama-X, with the adaptations from WizardLM's training script and additional adjustments for QLoRA tuning. The MPT code is taken from <a href="https://huggingface.co./SebastianBodza/mpt-30B-qlora-multi_GPU">SebastianBodza/mpt-30B-qlora-multi_GPU</a>.

<img src="https://huggingface.co./SebastianBodza/DElefant-MPT/resolve/main/train_loss_DElefant.svg" style="max-width:350px">