---
license: apache-2.0
base_model: EleutherAI/pythia-2.8b-deduped
tags:
- generated_from_trainer
model-index:
- name: PythiaChat-2.8B_v0.1
  results: []
library_name: peft
inference: false
datasets:
- linkanjarad/baize-chat-data
---

# PythiaChat-2.8B_v0.1

This model is a fine-tuned version of [EleutherAI/pythia-2.8b-deduped](https://huggingface.co./EleutherAI/pythia-2.8b-deduped), trained with LoRA on the [Baize chat dataset](https://huggingface.co./datasets/linkanjarad/baize-chat-data/viewer/linkanjarad--baize-chat-data) for only a little over 200 steps as an initial test run. It is intended for chat-style instruction following.

# Sample Use

Remember to mark human messages with `[|Human|]` and AI messages with `[|AI|]` at the start of each turn.
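
For multi-turn conversations, the prompt can be assembled programmatically. A minimal sketch (the `build_prompt` helper is illustrative, not part of this repository):

```python
def build_prompt(turns, system="The conversation between human and AI assistant."):
    # turns: list of (speaker, text) pairs, where speaker is "Human" or "AI"
    lines = [system]
    for speaker, text in turns:
        lines.append(f"[|{speaker}|] {text}")
    lines.append("[|AI|]")  # trailing marker cues the model to respond
    return "\n".join(lines)

prompt = build_prompt([("Human", "How do I open a file with python?")])
```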


```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

peft_model_id = "linkanjarad/PythiaChat-2.8B_v0.1"
model_id = "EleutherAI/pythia-2.8b-deduped"

# Load the base model, then attach the LoRA adapter
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")  # add `load_in_4bit=True` to reduce memory use (see below)
model = PeftModel.from_pretrained(model, peft_model_id)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = model.to('cuda')
model.eval()

input_text = """The conversation between human and AI assistant.
[|Human|] How do I open a file with python?
[|AI|]"""

# Tokenize the input text
input_ids = tokenizer.encode(input_text, return_tensors='pt').to('cuda')
len_input = len(input_ids[0])

# Generate text using the model
with torch.no_grad():
    output = model.generate(input_ids=input_ids, max_length=len_input + 120, temperature=0.9, do_sample=True)

# Decode the generated output
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```


### Example Output

```
The conversation between human and AI assistant.
[|Human|] How do I open a file with python?
[|AI|] To open a file with python, you can use the open function as follows:

>>> with open('filename.txt', 'w') as f:
...     f.write(data)
```
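
If GPU memory is tight, the base model can instead be loaded in 4-bit via the standard Transformers quantization path. This is a generic sketch (requires the `bitsandbytes` package; the quantization settings shown are common defaults, not values tested with this checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# 4-bit NF4 quantization config; assumed defaults, not tuned for this model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-2.8b-deduped",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "linkanjarad/PythiaChat-2.8B_v0.1")
model.eval()  # skip .to('cuda'): device_map already places the quantized weights
```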

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 7e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 80
- num_epochs: 1
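
For reference, these values map onto a Transformers `TrainingArguments` object roughly as follows. This is a reconstruction from the list above; the original training script and the LoRA settings (rank, target modules, etc.) are not included in this card:

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameters listed above; illustrative only
training_args = TrainingArguments(
    output_dir="PythiaChat-2.8B_v0.1",
    learning_rate=7e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,  # 4 x 8 = total train batch size 32
    lr_scheduler_type="linear",
    warmup_steps=80,
    num_train_epochs=1,
)
```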

### Framework versions

- PEFT 0.4.0
- Transformers 4.31.0
- Pytorch 2.0.0
- Datasets 2.13.1
- Tokenizers 0.13.3