---
license: mit
tags:
  - generated_from_trainer
base_model: microsoft/deberta-v3-large
model-index:
  - name: grammar_checkpoints
    results: []
---

# Language Beyond the Source

## Model description

This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large), trained on a dataset of 4,620 summaries scored on an analytic rubric by expert raters. It predicts the raw score for Language Beyond the Source. The rubric is as follows:

**LANGUAGE BEYOND THE SOURCE**

- 1 Point: Summary shows a very basic understanding of lexical and syntactic structures.
- 2 Points: Summary shows an understanding of lexical and syntactic structures.
- 3 Points: Summary shows an appropriate range of lexical and syntactic structures.
- 4 Points: Summary shows an excellent range of lexical and syntactic structures.
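The model regresses directly onto this scale, so its predictions are continuous. A minimal sketch of one way to map a raw prediction back onto the 1-4 rubric bands (an illustration under our own assumptions, not part of the released pipeline):

```python
def to_rubric_band(raw_score: float) -> int:
    """Clamp a raw prediction to the rubric's range, then round to the nearest band."""
    return int(round(min(max(raw_score, 1.0), 4.0)))

print(to_rubric_band(3.42))  # -> 3
```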

The model achieves the following results on the evaluation set:

- Loss: 0.1817
- MSE: 0.1817
- RMSE: 0.4263

On a set of summaries from sources withheld from the training set, the model achieved the following results:

- RMSE: 0.4220
- R²: 0.6236
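For reference, a minimal sketch of how these metrics can be computed from model predictions with scikit-learn (the arrays shown are hypothetical):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical gold scores and raw model predictions for held-out summaries
y_true = np.array([2.0, 3.0, 4.0, 1.0, 3.0])
y_pred = np.array([2.3, 2.8, 3.9, 1.4, 3.2])

rmse = np.sqrt(mean_squared_error(y_true, y_pred))
r2 = r2_score(y_true, y_pred)
print(f"RMSE: {rmse:.4f}, R2: {r2:.4f}")
```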

## Intended uses & limitations

This model is intended to be used to provide feedback to users of iTELL, a framework for generating intelligent educational texts. For more information about iTELL, watch our introductory video.
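A minimal sketch of scoring a single summary with this checkpoint, assuming it was fine-tuned as a single-output regression model (the repository id below is an assumption; substitute this repository's actual id):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "tiedaar/grammar_checkpoints"  # assumed repo id; replace as needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

summary = "The text explains how photosynthesis converts sunlight into chemical energy."
inputs = tokenizer(summary, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 1) for a single-output regression head

score = logits.squeeze().item()  # predicted Language Beyond the Source score
print(f"Predicted score: {score:.2f}")
```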

## Training and evaluation data

Seventy summaries in the dataset had Language Beyond the Source scores below 1, which falls outside the rubric's range. These summaries were removed from both the training and test sets.
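A sketch of that filtering step, assuming the scores live in a pandas DataFrame under a hypothetical file and column name:

```python
import pandas as pd

df = pd.read_csv("summaries.csv")  # hypothetical file of scored summaries
df = df[df["lbs_score"] >= 1]      # drop summaries scored below the rubric's range
```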

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):

- learning_rate: 8.5e-06
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 7
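A minimal sketch of the equivalent Hugging Face `TrainingArguments`, assuming the standard `Trainer` was used (the output path is hypothetical; the Adam betas and epsilon shown are the Trainer defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="grammar_checkpoints",  # hypothetical output path
    learning_rate=8.5e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=7,
    evaluation_strategy="epoch",  # assumption, consistent with the per-epoch results below
)
```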

### Training results

| Training Loss | Epoch | Step | Validation Loss | MSE    | RMSE   |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|
| No log        | 1.0   | 405  | 0.1901          | 0.1901 | 0.4360 |
| 0.5772        | 2.0   | 810  | 0.2181          | 0.2181 | 0.4670 |
| 0.1498        | 3.0   | 1215 | 0.2259          | 0.2259 | 0.4752 |
| 0.0969        | 4.0   | 1620 | 0.1845          | 0.1845 | 0.4296 |
| 0.0587        | 5.0   | 2025 | 0.1657          | 0.1657 | 0.4071 |
| 0.0587        | 6.0   | 2430 | 0.1731          | 0.1731 | 0.4161 |
| 0.0397        | 7.0   | 2835 | 0.1817          | 0.1817 | 0.4263 |

### Framework versions

- Transformers 4.40.1
- Pytorch 2.3.0+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1

## Contact

This model was developed by the LEAR Lab at Vanderbilt University. For questions or comments about this model, please contact [email protected].