Svenni551 committed
Commit 54de336 · verified · 1 Parent(s): 1f8f4cc

Update README.md

Files changed (1): README.md +9 -8
README.md CHANGED
@@ -251,15 +251,16 @@ outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=150)
  #### Training Hyperparameters

  The following hyperparameters were used during training:

- - **learning_rate:** `3e-4`
- - **train_batch_size:** Effectively adjusted by `per_device_train_batch_size=1` and `gradient_accumulation_steps=4`
- - **eval_batch_size:** Implicitly determined by the evaluation setup (not explicitly defined)
- - **seed:** Not explicitly stated, crucial for ensuring reproducibility
- - **optimizer:** `paged_adamw_8bit`, designed for efficient memory utilization
- - **lr_scheduler_type:** Learning rate adjustments indicate adaptive scheduling, though specific type is not mentioned
- - **training_steps:** `500`
- - **mixed_precision_training:** Not explicitly mentioned; any applied strategy would aim at computational efficiency
+ - learning_rate: 3e-4
+ - per_device_train_batch_size: 1
+ - gradient_accumulation_steps: 4
+ - eval_batch_size: Implicitly determined by the evaluation setup
+ - seed: Not explicitly stated
+ - optimizer: paged_adamw_8bit
+ - lr_scheduler_type: Not specified, adaptive adjustments indicated
+ - training_steps: 500
+ - mixed_precision_training: Not explicitly mentioned

  #### Training Results
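For readers who want to reproduce this setup, the sketch below shows how the values listed in the updated README could map onto a `transformers.TrainingArguments` configuration. This is an illustration only, not code from the commit: the `output_dir` path is a placeholder, and settings the README leaves unspecified (seed, scheduler, mixed precision) fall back to library defaults.

```python
# Hypothetical mapping of the README's hyperparameter list onto
# Hugging Face transformers' TrainingArguments; not taken from the commit.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",            # placeholder path, not in the README
    learning_rate=3e-4,
    per_device_train_batch_size=1,     # effective batch size of 4 once
    gradient_accumulation_steps=4,     #   gradient accumulation is applied
    max_steps=500,                     # "training_steps: 500"
    optim="paged_adamw_8bit",          # paged 8-bit AdamW via bitsandbytes
    # seed, lr_scheduler_type, and mixed precision are not specified in the
    # README, so TrainingArguments defaults (seed=42, linear schedule) apply.
)
```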