Commit 9f7057c (verified), committed by shorecode
1 parent: 5ffad34

Training complete!

Files changed (4)
  1. README.md +21 -26
  2. config.json +1 -1
  3. model.safetensors +1 -1
  4. training_args.bin +1 -1
README.md CHANGED
@@ -1,14 +1,12 @@
 ---
 library_name: transformers
 license: apache-2.0
-base_model: google/t5-efficient-tiny-nh8
+base_model: shorecode/t5-efficient-tiny-nh8-summarizer
 tags:
 - generated_from_trainer
 model-index:
 - name: t5-efficient-tiny-nh8-summarizer
   results: []
-datasets:
-- shorecode/summary-collection-60k-rows
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -16,32 +14,30 @@ should probably proofread and complete it, then remove this comment. -->
 
 # t5-efficient-tiny-nh8-summarizer
 
-This model is a fine-tuned version of [google/t5-efficient-tiny-nh8](https://huggingface.co/google/t5-efficient-tiny-nh8) on shorecode/summary-collection-60k-rows.
+This model is a fine-tuned version of [shorecode/t5-efficient-tiny-nh8-summarizer](https://huggingface.co/shorecode/t5-efficient-tiny-nh8-summarizer) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.7583
+- Loss: 0.6597
 
 ## Model description
 
-A general purpose text summarizer
+More information needed
 
 ## Intended uses & limitations
 
-A general purpose text summarizer
+More information needed
 
 ## Training and evaluation data
 
-Trained and evaluated on shorecode/summary-collection-60k-rows
+More information needed
 
 ## Training procedure
 
-Trained using the Gradio SDK on Hugging Face Spaces using shared Zero GPU(s)
-
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 7.000000000000001e-05
-- train_batch_size: 70
-- eval_batch_size: 70
+- learning_rate: 0.00015000000000000001
+- train_batch_size: 63
+- eval_batch_size: 63
 - seed: 42
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
@@ -52,18 +48,17 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.1522        | 0.2328 | 200  | 0.9863          |
-| 0.9677        | 0.4657 | 400  | 0.9158          |
-| 0.9143        | 0.6985 | 600  | 0.8762          |
-| 0.8894        | 0.9313 | 800  | 0.8478          |
-| 0.8586        | 1.1641 | 1000 | 0.8262          |
-| 0.8382        | 1.3970 | 1200 | 0.8079          |
-| 0.8198        | 1.6298 | 1400 | 0.7938          |
-| 0.805         | 1.8626 | 1600 | 0.7823          |
-| 0.8035        | 2.0955 | 1800 | 0.7727          |
-| 0.7897        | 2.3283 | 2000 | 0.7661          |
-| 0.7849        | 2.5611 | 2200 | 0.7607          |
-| 0.7781        | 2.7939 | 2400 | 0.7583          |
+| 1.0837        | 0.2663 | 200  | 0.9227          |
+| 0.9027        | 0.5326 | 400  | 0.8449          |
+| 0.842         | 0.7989 | 600  | 0.7949          |
+| 0.7971        | 1.0652 | 800  | 0.7585          |
+| 0.768         | 1.3316 | 1000 | 0.7288          |
+| 0.7359        | 1.5979 | 1200 | 0.7069          |
+| 0.7145        | 1.8642 | 1400 | 0.6898          |
+| 0.7047        | 2.1305 | 1600 | 0.6773          |
+| 0.6926        | 2.3968 | 1800 | 0.6678          |
+| 0.6855        | 2.6631 | 2000 | 0.6620          |
+| 0.68          | 2.9294 | 2200 | 0.6597          |
 
 
 ### Framework versions
@@ -71,4 +66,4 @@ The following hyperparameters were used during training:
 - Transformers 4.47.0
 - Pytorch 2.4.0+cu121
 - Datasets 3.0.0
-- Tokenizers 0.21.0
+- Tokenizers 0.21.0
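For a quick sanity check of the updated checkpoint, a minimal usage sketch follows. It assumes the repo is publicly readable; the `summarize:` task prefix is the usual T5 convention, which this card does not document.

```python
# Minimal sketch: run the updated summarizer checkpoint.
# The "summarize: " prefix follows common T5 practice; the card does not
# document the expected input format, so treat it as an assumption.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="shorecode/t5-efficient-tiny-nh8-summarizer",
)

article = (
    "Hugging Face hosts model repositories with versioned weights, "
    "configuration files, and auto-generated model cards."
)
print(summarizer("summarize: " + article, max_length=48, min_length=8)[0]["summary_text"])
```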
config.json CHANGED
@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "google/t5-efficient-tiny-nh8",
+  "_name_or_path": "shorecode/t5-efficient-tiny-nh8-summarizer",
   "architectures": [
     "T5ForConditionalGeneration"
   ],
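The `_name_or_path` field only records where the config was loaded from; the architecture is unchanged. A minimal sketch to confirm the updated config without downloading weights (again assuming the repo is public):

```python
# Minimal sketch: fetch and inspect the updated config only.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("shorecode/t5-efficient-tiny-nh8-summarizer")
print(config.architectures)  # ["T5ForConditionalGeneration"]
```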
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5519657de515588fd59da723420c7ffe9fd55f1cd3e7383d9b08c485488ce85b
+oid sha256:0a59a8c9c0ed288ff84d9af8d349bc7f8a93fef22d16d02f70e19f317c75f18e
 size 62293080
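Only the Git LFS pointer changes here: the new weights have the same byte size, and the `oid` is the SHA-256 of the blob the pointer references. A minimal sketch for checking a downloaded copy against the new oid (the local path is an assumption):

```python
# Minimal sketch: verify a downloaded model.safetensors against the
# LFS pointer's SHA-256 oid. The local file path is an assumption.
import hashlib

def sha256_of(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

expected = "0a59a8c9c0ed288ff84d9af8d349bc7f8a93fef22d16d02f70e19f317c75f18e"
assert sha256_of("model.safetensors") == expected
```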
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:639965cd1c0265d0ae1ef8aafaa8aec94659deccc7dbe06d4dc3452d468701c7
+oid sha256:639711c33405bbf09f602e52ebfd3058526167da57c50ed1314590513f1c12fe
 size 5304
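`training_args.bin` is the pickled `TrainingArguments` object the Trainer ran with, so this file changes whenever hyperparameters do. A minimal sketch for inspecting it and confirming the values listed in the card (the local path is an assumption; unpickling runs arbitrary code, so only do this for files you trust):

```python
# Minimal sketch: inspect the serialized training arguments.
# weights_only=False is needed on newer PyTorch releases because this is
# a pickled TrainingArguments object, not a tensor checkpoint.
import torch

args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate)                # expect ~0.00015
print(args.per_device_train_batch_size)  # expect 63
```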