Seongyun committed
Commit 322c112
1 parent: 85351d1

Update README.md

Files changed (1):
  1. README.md +0 -80
README.md CHANGED
@@ -8,86 +8,6 @@ model-index:
   results: []
 ---
 
-[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
-<details>
-<summary>See axolotl config</summary>
-
-axolotl version: `0.4.0`
-```yaml
-base_model: alpindale/Mistral-7B-v0.2-hf
-model_type: MistralForCausalLM
-tokenizer_type: LlamaTokenizer
-
-load_in_8bit: false
-load_in_4bit: false
-strict: false
-
-datasets:
-  - path: /data/seongyun/open-instruct-2/augmented_diverse_response/preferences_v1_responses_for_training.jsonl
-    type:
-      system_prompt: ""
-      system_format: "[INST] {system}\n"
-      field_system: system
-      field_instruction: instruction
-      field_output: output
-      format: "{instruction} [/INST]"
-      no_input_format: "{instruction} [/INST]"
-# conversation: mistral
-
-dataset_prepared_path:
-hub_model_id: kaist-ai/mpa-Mistral-7b-v0.2-hf-sft
-hub_strategy: checkpoint
-# val_set_size: 0
-output_dir: /data/suehyun/axolotl/outputs/mpa/mistral-7b-v0.2-hf
-
-sequence_len: 8192
-sample_packing: true
-pad_to_sequence_len: true
-eval_sample_packing: false
-
-wandb_project: mpa
-wandb_entity: suehyun
-wandb_watch:
-wandb_name: mpa_mistral-7b-v0.2-hf
-wandb_log_model:
-
-gradient_accumulation_steps: 4
-micro_batch_size: 2
-num_epochs: 4
-optimizer: adamw_bnb_8bit
-lr_scheduler: cosine
-learning_rate: 0.000005
-
-train_on_inputs: false
-group_by_length: false
-bf16: auto
-fp16:
-tf32: false
-
-gradient_checkpointing: true
-early_stopping_patience:
-resume_from_checkpoint:
-local_rank:
-logging_steps: 1
-xformers_attention:
-flash_attention: true
-
-warmup_steps: 10
-# evals_per_epoch: 4
-eval_table_size:
-# eval_max_new_tokens: 128
-saves_per_epoch: 1
-debug:
-deepspeed:
-weight_decay: 0.0
-fsdp:
-fsdp_config:
-special_tokens:
-
-```
-
-</details><br>
-
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
 
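For reference, the deleted `datasets.type` block encodes a Mistral-style instruction template. Below is a minimal sketch of how one training prompt would render under that template, assuming plain `str.format` placeholder substitution; `render_prompt` and the example instruction are hypothetical, not part of axolotl:

```python
# Sketch only: mirrors the system_format / format fields from the removed
# config, assuming simple placeholder substitution.
SYSTEM_FORMAT = "[INST] {system}\n"
PROMPT_FORMAT = "{instruction} [/INST]"

def render_prompt(system: str, instruction: str) -> str:
    # system_prompt is "" in the removed config, so the system slot is
    # normally empty and only the "[INST] " prefix and newline remain.
    return SYSTEM_FORMAT.format(system=system) + PROMPT_FORMAT.format(instruction=instruction)

print(render_prompt(system="", instruction="Explain sample packing in one sentence."))
# -> "[INST] \nExplain sample packing in one sentence. [/INST]"
```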
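The deleted optimizer settings also fix the effective batch size as `micro_batch_size * gradient_accumulation_steps * world_size`. A quick sketch of that arithmetic, assuming a single GPU (the config does not record the GPU count used):

```python
# Effective batch size implied by the removed config values.
micro_batch_size = 2             # from the config
gradient_accumulation_steps = 4  # from the config
world_size = 1                   # assumption: single GPU, not recorded in the config

effective_batch_size = micro_batch_size * gradient_accumulation_steps * world_size
print(effective_batch_size)  # 8 packed sequences per optimizer step on one GPU
```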