---
license: mit
tags:
- generated_from_trainer
base_model: microsoft/deberta-v3-large
model-index:
- name: grammar_checkpoints
  results: []
---

# Language Beyond the Source

## Model description

This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co./microsoft/deberta-v3-large) trained on a dataset of 4,620 summaries scored on an analytic rubric by expert raters. The model predicts the raw score for Language Beyond the Source. The rubric is as follows:

LANGUAGE BEYOND THE SOURCE
- 1 point: Summary shows a very basic understanding of lexical and syntactic structures.
- 2 points: Summary shows an understanding of lexical and syntactic structures.
- 3 points: Summary shows an appropriate range of lexical and syntactic structures.
- 4 points: Summary shows an excellent range of lexical and syntactic structures.

The model achieves the following results on the evaluation set:
- Loss: 0.1817
- Mse: 0.1817
- Rmse: 0.4263

On a set of summaries of sources that were withheld from the training set, the model achieved the following results:
- Rmse: 0.4220
- R2: 0.6236

## Intended uses & limitations

This model is intended to provide feedback to users of iTELL, a framework for generating intelligent educational texts. For more information about iTELL, watch our video here:

[![iTELL overview video](https://img.youtube.com/vi/YZXVQjSDZtI/0.jpg)](https://www.youtube.com/watch?v=YZXVQjSDZtI)

A minimal usage sketch is provided at the end of this card.

## Training and evaluation data

Seventy summaries had Language Beyond the Source scores below 1, which is outside the range of the rubric. These summaries were removed from the training and test sets.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 8.5e-06
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 7

### Training results

| Training Loss | Epoch | Step | Validation Loss | Mse    | Rmse   |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|
| No log        | 1.0   | 405  | 0.1901          | 0.1901 | 0.4360 |
| 0.5772        | 2.0   | 810  | 0.2181          | 0.2181 | 0.4670 |
| 0.1498        | 3.0   | 1215 | 0.2259          | 0.2259 | 0.4752 |
| 0.0969        | 4.0   | 1620 | 0.1845          | 0.1845 | 0.4296 |
| 0.0587        | 5.0   | 2025 | 0.1657          | 0.1657 | 0.4071 |
| 0.0587        | 6.0   | 2430 | 0.1731          | 0.1731 | 0.4161 |
| 0.0397        | 7.0   | 2835 | 0.1817          | 0.1817 | 0.4263 |

### Framework versions

- Transformers 4.40.1
- Pytorch 2.3.0+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1

## Contact

This model was developed by the LEAR Lab at Vanderbilt University. For questions or comments about this model, please contact [wesley.g.morris@vanderbilt.edu](mailto:wesley.g.morris@vanderbilt.edu).
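
## Example usage

Below is a minimal inference sketch, assuming the checkpoint is published with a standard single-output (regression) sequence-classification head. The repository id is a placeholder; substitute the actual Hub id of this model.

```python
# Minimal inference sketch (assumption: regression head with one output logit).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "your-org/language-beyond-the-source"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

summary = "The passage explains how plate tectonics reshape the Earth's surface over time."

# Tokenize the summary and run a forward pass without gradient tracking.
inputs = tokenizer(summary, truncation=True, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 1) for a single-output regression head

predicted_score = logits.squeeze().item()
print(f"Predicted Language Beyond the Source score: {predicted_score:.2f}")
```

The predicted value is a raw rubric score on the 1–4 scale described above; depending on the application, it may be rounded or clipped to that range before being shown to users.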