metadata
model-index:
- name: jiazhengli/deberta-v3-large-Rationale-to-Score
results: []
language:
- en
base_model: microsoft/deberta-v3-large
license: apache-2.0
widget:
- text: >-
The student's answer received a score of 0 according to the marking
rubric, as it failed to describe any additional pieces of information
necessary to accurately replicate the experiment. The key answer elements
required specific details such as the amount and type of vinegar used, the
materials tested, the size/surface area of materials, the rinsing and
drying durations, the drying method, and the size/type of container. The
student's response did not address any of these points; instead, it
suggested changes to the experimental conditions like 'there can be a
different amount of time','more containers', 'different placements', and
'temperature differences', which do not provide the specific information
needed for replication.
example_title: Example 1
- text: >-
The student's answer scored 3 points as per the marking rubric, which
requires drawing a valid conclusion supported by data and describing two
ways to improve the experimental design. The student correctly concluded
that 'plastic type B, is the most effective type of plastic' matching the
key answer element for a valid experimental conclusion. The student also
suggested additional trials ('In the experiment, the group could have done
1 more trial to really reinforce the results') and maintaining weight
constant ('the group also should have measured the weight of the weights,
and kept the weight constant for more valid results'), aligning with the
key answer elements for improving experimental design. However, the answer
did not suggest ensuring uniform initial measurements or sample thickness,
which were other key answer elements.
example_title: Example 2
- text: >-
The student's answer scored 0 points according to the marking rubric,
which awards 3 points for addressing three key elements, 2 points for two,
1 point for one, and 0 points for none. The student's response failed to
address any of the key answer elements accurately. Specifically, the
student mentioned 'Equillibrium' and 'Diffusion' but did not explain these
processes in the context of cell membrane transport mechanisms like
selective permeability, passive transport, osmosis, facilitated diffusion,
active transport, or the use of pumps and protein channels. Additionally,
there was no mention of membrane-assisted processes such as exocytosis,
endocytosis, phagocytosis, or pinocytosis.
example_title: Example 3
- text: >-
The student's answer scored 1 point based on the marking rubric that
awards one point for addressing 'One or two key elements'. The student's
response correctly addressed the process of mRNA traveling to the
ribosomes as evidenced by 'First it travels along to the ribosomes',
fulfilling one key answer element. However, the response failed to
correctly describe the interactions and processes involving codons,
anticodons, and the specific mechanism of tRNA in protein synthesis, as
well as the initiation and elongation phases of protein synthesis, which
are vital for higher scoring. The phrases 'the ribosomes give the mRNA its
proper tRNA' and 'Then, it is sent off' show confusion about tRNA's role
and incorrect understanding of mRNA's interaction with ribosomes.
example_title: Example 4
Model Card for deberta-v3-large-Rationale-to-Score
This repository hosts a version of microsoft/deberta-v3-large fine-tuned to assess free-text rationales and predict the scores they support. As the widget examples show, the model takes a rationale as input and outputs a numerical score.
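A minimal usage sketch with the Hugging Face `transformers` library follows. It assumes the checkpoint exposes a sequence-classification head that the `text-classification` pipeline can load, and that each predicted label corresponds to a score; neither is stated in this card, so check the checkpoint's `config.json` (its `id2label` mapping) before relying on the output. The helper names `load_classifier` and `predict_score` are illustrative only.

```python
from typing import Any, Callable, Dict

MODEL_ID = "jiazhengli/deberta-v3-large-Rationale-to-Score"


def load_classifier(model_id: str = MODEL_ID) -> Callable[..., Any]:
    """Build a text-classification pipeline for the checkpoint.

    Assumption: the checkpoint has a sequence-classification head.
    The import is local so the rest of this sketch can be used without
    transformers installed (e.g. with a stub classifier in tests).
    """
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def predict_score(rationale: str, classifier: Callable[..., Any]) -> Dict[str, Any]:
    """Return the top predicted label and its probability for one rationale."""
    # truncation=True guards against rationales longer than the model's input limit.
    top = classifier(rationale, truncation=True)[0]
    # Assumption: the label name (e.g. "LABEL_1") maps to a score via id2label.
    return {"label": top["label"], "confidence": float(top["score"])}


# Usage (downloads the checkpoint on first call):
# clf = load_classifier()
# print(predict_score("The student's answer scored 1 point based on the marking rubric.", clf))
```

The pipeline returns label names as stored in the checkpoint configuration, so converting a label to an integer score is left to the caller once the mapping has been confirmed.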
For details of the training process and methodology, see our paper: Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring (arXiv:2406.19949).
If you use this model in your research, please cite our work:
Citation Information
@misc{li2024calibratingllmspreferenceoptimization,
title={Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring},
author={Jiazheng Li and Hainiu Xu and Zhaoyue Sun and Yuxiang Zhou and David West and Cesare Aloisi and Yulan He},
year={2024},
eprint={2406.19949},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2406.19949},
}