Seongyun committed on
Commit
3fa323e
·
verified ·
1 Parent(s): 8464256

Update README.md

  1. README.md +79 -96
README.md CHANGED
@@ -3,110 +3,88 @@ license: apache-2.0
3
  base_model: kaist-ai/mpa-Mistral-7b-v0.2-hf-sft-66k
4
  tags:
5
  - axolotl
6
- - dpo
7
  - trl
8
- - dpo
9
  - generated_from_trainer
 
10
  model-index:
11
- - name: mpa-Mistral-7b-v0.2-hf-66k-dpo-5e-7
12
  results: []
13
  ---
14
 
15
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
16
- should probably proofread and complete it, then remove this comment. -->
17
-
18
- [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
19
- <details><summary>See axolotl config</summary>
20
-
21
- axolotl version: `0.4.0`
22
- ```yaml
23
- base_model: kaist-ai/mpa-Mistral-7b-v0.2-hf-sft-66k
24
- model_type: MistralForCausalLM
25
- tokenizer_type: LlamaTokenizer
26
-
27
- load_in_8bit: false
28
- load_in_4bit: false
29
- strict: false
30
-
31
- rl: dpo
32
- datasets:
33
- - path: kaist-ai/mpa-train-dpo-66k
34
- type: chatml.argilla
35
- # conversation: mistral
36
-
37
- dataset_prepared_path:
38
- hub_model_id: kaist-ai/mpa-Mistral-7b-v0.2-hf-66k-dpo-5e-7
39
- hub_strategy: checkpoint
40
- # val_set_size: 0
41
- output_dir: /mnt/nas/seongyun/axolotl/outputs/mpa_66k_dpo-5e-7
42
-
43
- sequence_len: 2048
44
- sample_packing: false
45
- pad_to_sequence_len: true
46
- eval_sample_packing: false
47
-
48
- wandb_project: mpa
49
- wandb_entity: seongyun
50
- wandb_watch:
51
- wandb_name: mpa_mistral-7b-v0.2-hf-66k-dpo-5e-7
52
- wandb_log_model:
53
-
54
- gradient_accumulation_steps: 4
55
- micro_batch_size: 1
56
- num_epochs: 2
57
- optimizer: adamw_bnb_8bit
58
- lr_scheduler: cosine
59
- learning_rate: 0.0000005
60
-
61
- train_on_inputs: false
62
- group_by_length: false
63
- bf16: auto
64
- fp16:
65
- tf32: false
66
-
67
- gradient_checkpointing: true
68
- early_stopping_patience:
69
- resume_from_checkpoint:
70
- local_rank:
71
- logging_steps: 1
72
- xformers_attention:
73
- flash_attention: true
74
-
75
- warmup_steps: 10
76
- # evals_per_epoch: 4
77
- eval_table_size:
78
- # eval_max_new_tokens: 128
79
- saves_per_epoch: 1
80
- debug:
81
- deepspeed:
82
- weight_decay: 0.0
83
- fsdp:
84
- fsdp_config:
85
- special_tokens:
86
-
87
- ```
88
-
89
- </details><br>
90
-
91
- # mpa-Mistral-7b-v0.2-hf-66k-dpo-5e-7
92
-
93
- This model is a fine-tuned version of [kaist-ai/mpa-Mistral-7b-v0.2-hf-sft-66k](https://huggingface.co/kaist-ai/mpa-Mistral-7b-v0.2-hf-sft-66k) on the None dataset.
94
 
95
- ## Model description
 
 
 
96
 
97
- More information needed
 
98
 
99
- ## Intended uses & limitations
100
 
101
- More information needed
 
102
 
103
- ## Training and evaluation data
104
 
105
- More information needed
 
106
 
107
- ## Training procedure
108
-
109
- ### Training hyperparameters
 
110
 
111
  The following hyperparameters were used during training:
112
  - learning_rate: 5e-07
@@ -123,13 +101,18 @@ The following hyperparameters were used during training:
123
  - lr_scheduler_warmup_steps: 10
124
  - training_steps: 8143
125
 
126
- ### Training results
127
-
128
-
129
-
130
- ### Framework versions
131
 
132
  - Transformers 4.40.0.dev0
133
  - Pytorch 2.1.1
134
  - Datasets 2.15.0
135
  - Tokenizers 0.15.0
 
3
  base_model: kaist-ai/mpa-Mistral-7b-v0.2-hf-sft-66k
4
  tags:
5
  - axolotl
 
6
  - trl
 
7
  - generated_from_trainer
8
+ - dpo
9
  model-index:
10
+ - name: janus-dpo-7b
11
  results: []
12
  ---
13
 
14
+ ## Links for Reference
 
15
 
16
+ - **Homepage: In Progress**
17
+ - **Repository: https://github.com/kaistAI/Janus**
18
+ - **Paper:**
19
+ - **Point of Contact:[email protected]**
20
 
21
+ # TL;DR
22
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6550c4f27bbfce1878f5f280/vrQl8D8FV3vqUJYbPgsiG.png)
23
 
24
+ Janus is a model trained using [Mistral-7B-v0.2](https://huggingface.co/mistral-community/Mistral-7B-v0.2) as its base model. Janus has been trained on [Multifaceted Collection](https://huggingface.co/datasets/kaist-ai/Multifaceted-Collection-SFT), a preference dataset containing 192k unique system messages for aligning LLMs to diverse human preferences. Janus not only excels at generating personalized responses that cater to various human preferences but is also adept at producing responses that are generally preferred for being helpful and harmless.
25
 
26
+ # Model Details
27
+ Janus-DPO-7B is a model created by applying DPO to Janus-66k-7B using the Multifaceted-Collection-DPO.
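For readers unfamiliar with DPO, the per-pair objective can be sketched as below. This is a minimal illustration of the standard DPO loss on precomputed sequence log-probabilities, not the training code behind this model. The function names and `beta=0.1` are assumptions made for illustration; the commit does not state the beta that was used.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value, not taken from the model card
) -> float:
    """DPO loss for one (chosen, rejected) pair of summed sequence log-probs."""
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) == softplus(-margin)
    return softplus(-margin)


if __name__ == "__main__":
    # The policy prefers the chosen response more than the reference does,
    # so the loss falls below log(2), its value at zero margin.
    print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

The loss shrinks as the policy raises the chosen response's likelihood relative to the reference and lowers the rejected one's. Here the reference model corresponds to the SFT checkpoint (Janus-66k-7B).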
28
 
29
+ ## Model Description
30
 
31
+ - **Model type:** Language model
32
+ - **Language(s) (NLP):** English
33
+ - **License:** Apache 2.0
34
+ - **Related Models:** [Janus-66k-7B](), [Janus-7B](), [Janus-ORPO-7B](), [Janus-RM-7B]()
35
+ - **Training Datasets**: [Multifaceted-Collection-SFT](https://huggingface.co/datasets/kaist-ai/Multifaceted-Collection-SFT)
36
+ - **Resources for more information:**
37
+ - [Research paper]()
38
+ - [GitHub Repo](https://github.com/kaistAI/Janus)
39
 
40
+ # Usage
41
+ Janus generalizes across diverse system messages, so users can steer the model's responses by supplying the desired system message. The input prompt format is as follows:
42
+ ```
43
+ [INST]{system_message}\n{instruction}[/INST]
44
+ ```
45
+ The following inference code applies this format:
46
+ ```python
47
+ from transformers import AutoTokenizer, AutoModelForCausalLM
48
+ import torch
49
+
50
+ model_name = "kaist-ai/janus-dpo-7b"
51
+ device = "cuda:0"
52
+
53
+ # Load the model and tokenizer
54
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
55
+
56
+ dtype = "float16"
57
+ if torch.cuda.is_bf16_supported():
58
+     dtype = "bfloat16"
59
+
60
+ model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=getattr(torch, dtype))
61
+ model.eval()
62
+ model.to(device)
63
+
64
+ # Prepare inputs
65
+ system = "As a financial news headline writer with a flair for the dramatic, you have taken on the role of crafting compelling headlines about the integration of AI into the financial sector. Your expertise allows you to weave industry-specific terminology seamlessly into each headline, striking a balance between capturing attention and providing meaningful insights into the transformative benefits of AI in finance. With each headline, you focus on elucidating the key advantages AI brings to financial operations, making complex information accessible and immediately impactful. While your headlines are designed to engage and inform an audience of finance and technology professionals, you navigate the fine line of excitement and accuracy with care, ensuring that the promises made are grounded in reality, thus avoiding any form of sensationalism. Your mission is to distill the essence of AI's impact on finance into a single, powerful line that speaks volumes to the informed reader."
66
+ prompt = "Write a headline for an article about the benefits of using AI in the finance sector."
67
+
68
+ def apply_template_mistral_instruct(system_message, content):
69
+     prompt = f"{system_message}\n{content}".strip()
70
+     return f"[INST] {prompt} [/INST] "
71
+
72
+ input_str = apply_template_mistral_instruct(system, prompt)
73
+ input_ids = tokenizer.encode(input_str, return_tensors="pt")
74
+ print(input_str)
75
+
76
+ model_inputs = input_ids.to(device)
77
+
78
+ # Generate text
79
+ output_ids = model.generate(model_inputs, max_new_tokens=1024)
80
+ decoded = tokenizer.batch_decode(output_ids, skip_special_tokens=True)
81
+ print(decoded[0][len(input_str):])
82
+ # Revolutionary Trends: How AI Is Redefining Efficiency and Accuracy in the Financial Realm
83
+ ```
84
+ To train Janus and evaluate the responses it generates, please refer to the [GitHub Repo](https://github.com/kaistAI/Janus).
85
+ Additionally, refer to the [Multifaceted Bench](https://huggingface.co/datasets/kaist-ai/Multifaceted-Bench), which evaluates how well an LLM generates personalized responses.
86
+ # Training Details
87
+ ## Training hyperparameters
88
 
89
  The following hyperparameters were used during training:
90
  - learning_rate: 5e-07
 
101
  - lr_scheduler_warmup_steps: 10
102
  - training_steps: 8143
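To make the schedule concrete, the sketch below computes the learning rate at a given step from the values listed in this diff (peak `5e-07`, 10 warmup steps, 8143 training steps, cosine scheduler). It assumes linear warmup followed by a half-cosine decay to zero, which is the usual shape of a cosine-with-warmup schedule. The function name `lr_at_step` is hypothetical and this is not the trainer's own code.

```python
import math


def lr_at_step(
    step: int,
    base_lr: float = 5e-7,    # learning_rate from the model card
    warmup_steps: int = 10,   # lr_scheduler_warmup_steps
    total_steps: int = 8143,  # training_steps
) -> float:
    """Learning rate under linear warmup then cosine decay (assumed shape)."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for s in (0, 5, 10, 4000, 8143):
        print(s, lr_at_step(s))
```

Under these assumptions the rate peaks at step 10 and decays smoothly toward zero by the final step.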
103
 
104
+ ## Framework versions
 
 
 
 
105
 
106
  - Transformers 4.40.0.dev0
107
  - Pytorch 2.1.1
108
  - Datasets 2.15.0
109
  - Tokenizers 0.15.0
110
+
111
+ # Citation
112
+
113
+ If you find this model helpful, please consider citing our paper!
114
+
115
+ **BibTeX:**
116
+
117
+ ```bibtex
118
+ ```