---
library_name: peft
datasets:
- cis-lmu/bavarian_to_english
language:
- de
- bar
base_model:
- LSX-UniWue/LLaMmlein_1B
pipeline_tag: text-generation
license: other
---

# LLäMmlein 1B

This is a Bavarian adapter for the German Tinyllama 1B language model, tuned on a dump of the Bavarian Wikipedia without further optimization. Please don't take it too seriously ;)
Find more details on our [page](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/) and our [preprint](https://arxiv.org/abs/2411.11171)!

## Run it
```py
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# script config
base_model_name = "LSX-UniWue/LLaMmlein_1B"
adapter_name = "LSX-UniWue/Betzerl_1B_wiki_preview"
device = "cuda"  # or "mps" on Apple Silicon

# load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map=device,
)
# resize the embedding matrix to match the adapter's vocabulary size
base_model.resize_token_embeddings(32001)

# apply the Bavarian adapter and load its tokenizer
model = PeftModel.from_pretrained(base_model, adapter_name)
tokenizer = AutoTokenizer.from_pretrained(adapter_name)
```
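Once loaded, the adapted model can be used like any other causal LM via `generate`. A minimal sketch continuing the snippet above — the Bavarian prompt and the sampling settings are illustrative choices, not part of the original card:

```py
# hypothetical usage example: sample a short Bavarian continuation
prompt = "Minga is de Hauptstod vo"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=50,
        do_sample=True,
        top_p=0.9,
        temperature=0.8,
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
```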