anhvu2501 committed
Commit 46d86ce
Parent: f9278f4

Model save
README.md CHANGED
@@ -6,8 +6,6 @@ tags:
 - sft
 - generated_from_trainer
 base_model: Viet-Mistral/Vistral-7B-Chat
-datasets:
-- generator
 model-index:
 - name: vietnamese-news-summarization-vistral-7b
   results: []
@@ -18,9 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 # vietnamese-news-summarization-vistral-7b
 
-This model is a fine-tuned version of [Viet-Mistral/Vistral-7B-Chat](https://huggingface.co/Viet-Mistral/Vistral-7B-Chat) on the generator dataset.
-It achieves the following results on the evaluation set:
-- Loss: 1.2539
+This model is a fine-tuned version of [Viet-Mistral/Vistral-7B-Chat](https://huggingface.co/Viet-Mistral/Vistral-7B-Chat) on an unknown dataset.
 
 ## Model description
 
@@ -48,17 +44,6 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 0.03
 - training_steps: 100
 
-### Training results
-
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| 1.2306        | 0.0060 | 20   | 1.2918          |
-| 1.2802        | 0.0119 | 40   | 1.2651          |
-| 1.4084        | 0.0179 | 60   | 1.2528          |
-| 1.2944        | 0.0238 | 80   | 1.2497          |
-| 1.288         | 0.0298 | 100  | 1.2539          |
-
-
 ### Framework versions
 
 - PEFT 0.10.0
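The "Training results" table removed by this commit recorded the validation loss at each evaluation step. A quick sanity check (values copied from the removed table) shows the best checkpoint was at step 80, not the final step 100:

```python
# Validation losses from the removed "Training results" table (step -> loss).
val_loss = {20: 1.2918, 40: 1.2651, 60: 1.2528, 80: 1.2497, 100: 1.2539}

# Find the evaluation step with the lowest validation loss.
best_step = min(val_loss, key=val_loss.get)
print(best_step, val_loss[best_step])  # best checkpoint precedes the final step
```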
adapter_config.json CHANGED
@@ -20,14 +20,14 @@
 "rank_pattern": {},
 "revision": null,
 "target_modules": [
-  "gate_proj",
+  "up_proj",
+  "down_proj",
   "lm_head",
   "v_proj",
-  "k_proj",
-  "up_proj",
+  "gate_proj",
   "o_proj",
   "q_proj",
-  "down_proj"
+  "k_proj"
 ],
 "task_type": "CAUSAL_LM",
 "use_dora": false,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0ee800e98cdedc1291dd6d3480cc3da5dc5a37a0bcebcbaa7e64eedecda81b66
+oid sha256:5c0a04c8e78c679d5d984a2fb1d55635400cbf91ff676da1c63d9e8f9ae6ec06
 size 1310658800
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3fe97dd103430815b891f2662819b75a267bec5baf6a117674bad779c1c37829
+oid sha256:3abe19c5218c6edaa71d54dfb2b20648c33de801feaeb5c3d863a3898476c135
 size 5240
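Both binary files are stored as Git LFS pointers, which is why the hunks show only `version`, `oid`, and `size` lines rather than binary content. A minimal sketch of parsing that three-line pointer format (pointer text copied from the adapter_model.safetensors hunk):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file ("key value" per line) into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:5c0a04c8e78c679d5d984a2fb1d55635400cbf91ff676da1c63d9e8f9ae6ec06
size 1310658800"""

info = parse_lfs_pointer(pointer)
# The size is unchanged across the commit; only the oid (content hash) differs.
print(info["size"])
```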