sumo43 committed
Commit 3375953
1 Parent(s): 9feef32

Model save
README.md CHANGED
@@ -19,7 +19,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) on the generator dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1635
+- Loss: 4.4883
 
 ## Model description
 
@@ -38,7 +38,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
+- learning_rate: 0.0002
 - train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
@@ -52,7 +52,7 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.1523        | 1.0   | 22   | 1.1635          |
+| 4.5336        | 1.0   | 8969 | 4.4883          |
 
 
 ### Framework versions
all_results.json CHANGED
@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 1.118579317222942,
-    "train_runtime": 83.4654,
-    "train_samples": 500,
-    "train_samples_per_second": 4.038,
-    "train_steps_per_second": 0.264
+    "train_loss": 4.746947802406368,
+    "train_runtime": 46954.2707,
+    "train_samples": 207864,
+    "train_samples_per_second": 3.056,
+    "train_steps_per_second": 0.191
 }
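The updated throughput figures are internally consistent with the step count in the README's training table; a quick sanity check (all values taken from this commit's diffs):

```python
# Cross-check the training metrics reported after this commit.
runtime = 46954.2707   # train_runtime in seconds (all_results.json)
steps = 8969           # final step from the README training table
batch_size = 16        # train_batch_size from the README hyperparameters

steps_per_second = steps / runtime
samples_per_second = steps_per_second * batch_size

print(round(steps_per_second, 3))    # reported as 0.191
print(round(samples_per_second, 3))  # reported as 3.056
```

Note that steps × batch size (143,504) is smaller than train_samples (207,864); with a "generator" dataset that packs examples, such a gap is plausible rather than an error.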
generation_config.json CHANGED
@@ -3,5 +3,6 @@
     "eos_token_id": 2,
     "max_length": 2048,
     "pad_token_id": 0,
-    "transformers_version": "4.39.0.dev0"
+    "transformers_version": "4.39.0.dev0",
+    "use_cache": false
 }
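After this change the file still parses as valid JSON, now with caching disabled. A minimal check of the merged fragment (only the fields visible in the hunk are included; the real file likely has more keys above line 3):

```python
import json

# generation_config.json as it reads after this commit (hunk fields only).
merged = """{
    "eos_token_id": 2,
    "max_length": 2048,
    "pad_token_id": 0,
    "transformers_version": "4.39.0.dev0",
    "use_cache": false
}"""
config = json.loads(merged)
print(config["use_cache"])  # the key added in this commit -> False
```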
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f52223558e195137fc244d826c9719c7f66ee17072ef135bd7a3a571e52b8498
+oid sha256:d241cde6a1d3a813c172bfbff227b2f04c0cc1047d37c59738eb252622984c8a
 size 2201749928
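The pointer swap above changes only the recorded sha256 oid; after downloading the weights, the file's actual digest can be checked against it. A sketch (the helper name and chunk size are choices made here, not part of Git LFS):

```python
import hashlib

def lfs_oid(path, chunk_size=1 << 20):
    """Return the sha256 hex digest Git LFS records as a pointer's oid."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk_size):
            h.update(block)
    return h.hexdigest()

# Usage after downloading model.safetensors:
# assert lfs_oid("model.safetensors") == (
#     "d241cde6a1d3a813c172bfbff227b2f04c0cc1047d37c59738eb252622984c8a")
```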
train_results.json CHANGED
@@ -1,8 +1,8 @@
 {
     "epoch": 1.0,
-    "train_loss": 1.118579317222942,
-    "train_runtime": 83.4654,
-    "train_samples": 500,
-    "train_samples_per_second": 4.038,
-    "train_steps_per_second": 0.264
+    "train_loss": 4.746947802406368,
+    "train_runtime": 46954.2707,
+    "train_samples": 207864,
+    "train_samples_per_second": 3.056,
+    "train_steps_per_second": 0.191
 }
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff