mav23 committed
Commit 4e0d59a · verified · 1 Parent(s): 25df10f

Upload folder using huggingface_hub

Files changed (3):
  1. .gitattributes +1 -0
  2. README.md +104 -0
  3. mistral-7b-sft-beta.Q4_0.gguf +3 -0
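The commit message says the folder was pushed with `huggingface_hub`. As a point of reference, below is a minimal, hypothetical sketch of how such an upload is typically done with that library's `upload_folder` API; the local path and target repo id are placeholders, not values taken from this commit.

```python
# Hypothetical sketch of an upload like this one, using huggingface_hub.
# Requires prior authentication, e.g. `huggingface-cli login`.
from huggingface_hub import HfApi

api = HfApi()
api.upload_folder(
    folder_path="./mistral-7b-sft-beta",               # placeholder: local folder with README.md and the GGUF file
    repo_id="your-username/mistral-7b-sft-beta-GGUF",  # placeholder repo id
    repo_type="model",
    commit_message="Upload folder using huggingface_hub",
)
# Note: this commit also adds an LFS rule for the GGUF file to .gitattributes,
# since files this large are stored via Git LFS on the Hub.
```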
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+mistral-7b-sft-beta.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,104 @@
---
license: mit
base_model: mistralai/Mistral-7B-v0.1
tags:
- generated_from_trainer
model-index:
- name: mistral-7b-sft-beta
  results: []
datasets:
- HuggingFaceH4/ultrachat_200k
language:
- en
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# Model Card for Mistral 7B SFT β

This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the HuggingFaceH4/ultrachat_200k dataset. It is the SFT model that was used to train Zephyr-7B-β with Direct Preference Optimization.

It achieves the following results on the evaluation set:
- Loss: 0.9399

## Model description

- **Model type:** A 7B parameter GPT-like model fine-tuned on a mix of publicly available, synthetic datasets.
- **Language(s) (NLP):** Primarily English
- **License:** MIT
- **Finetuned from model:** [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/huggingface/alignment-handbook

## Intended uses & limitations

The model was fine-tuned with [🤗 TRL's](https://github.com/huggingface/trl) `SFTTrainer` on a filtered and preprocessed version of the [`UltraChat`](https://huggingface.co/datasets/stingning/ultrachat) dataset, which contains a diverse range of synthetic dialogues generated by ChatGPT.

Here's how you can run the model using the `pipeline()` function from 🤗 Transformers:

```python
# Install transformers from source - only needed for versions <= v4.34
# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate

import torch
from transformers import pipeline

pipe = pipeline("text-generation", model="HuggingFaceH4/mistral-7b-sft-beta", torch_dtype=torch.bfloat16, device_map="auto")

# We use the tokenizer's chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating
messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
# <|system|>
# You are a friendly chatbot who always responds in the style of a pirate.</s>
# <|user|>
# How many helicopters can a human eat in one sitting?</s>
# <|assistant|>
# Ah, me hearty matey! But yer question be a puzzler! A human cannot eat a helicopter in one sitting, as helicopters are not edible. They be made of metal, plastic, and other materials, not food!
```

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a hedged `SFTTrainer` sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 16
- gradient_accumulation_steps: 4
- total_train_batch_size: 512
- total_eval_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1

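The exact training recipe lives in the [alignment-handbook](https://github.com/huggingface/alignment-handbook) repository; the snippet below is only a rough, hypothetical sketch of how the hyperparameters above could map onto 🤗 TRL's `SFTTrainer`. The split names, the chat-template formatting step, `max_seq_length`, and `packing` are assumptions, and newer TRL releases move some of these arguments onto `SFTConfig`.

```python
# Hypothetical SFT sketch (per-device view of a 16-GPU run).
from datasets import load_dataset
from transformers import AutoTokenizer, TrainingArguments
from trl import SFTTrainer

base_model = "mistralai/Mistral-7B-v0.1"
# Use the SFT model's tokenizer because it ships the <|user|>/<|assistant|> chat template.
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/mistral-7b-sft-beta")

ds = load_dataset("HuggingFaceH4/ultrachat_200k")

def to_text(example):
    # Render the list of chat messages into a single training string.
    return {"text": tokenizer.apply_chat_template(example["messages"], tokenize=False)}

train_ds = ds["train_sft"].map(to_text)   # split names assumed
eval_ds = ds["test_sft"].map(to_text)

args = TrainingArguments(
    output_dir="mistral-7b-sft-beta",      # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,         # 8 x 16 GPUs x 4 = 512 effective train batch
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    bf16=True,
)

trainer = SFTTrainer(
    model=base_model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    tokenizer=tokenizer,
    dataset_text_field="text",
    max_seq_length=2048,                   # assumed; not stated in this card
    packing=True,                          # assumed, as in the alignment-handbook recipe
)
trainer.train()
```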
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.9367        | 0.67  | 272  | 0.9397          |


### Framework versions

- Transformers 4.35.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.12.0
- Tokenizers 0.14.0
mistral-7b-sft-beta.Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2c065856020e1b614cfa77a82d507bd3fe1f2e03522c6bfb4137259b2d8146ec
size 4108917728
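This file is a 4-bit (Q4_0) GGUF quantization of the SFT model, intended for llama.cpp-compatible runtimes, and is what this commit stores via Git LFS. Below is a minimal usage sketch with the `llama-cpp-python` package; the package choice, context size, and `chat_format="zephyr"` setting are assumptions rather than something stated in this repo.

```python
# Hypothetical local inference with the Q4_0 GGUF file via llama-cpp-python.
# pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-sft-beta.Q4_0.gguf",  # path to the downloaded GGUF file
    n_ctx=4096,             # context window (assumed value)
    chat_format="zephyr",   # <|system|>/<|user|>/<|assistant|> prompt style used by this SFT model (assumption)
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a friendly chatbot who always responds in the style of a pirate"},
        {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```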