library_name: transformers
license: mit
datasets:
- allenai/peer_read
language:
- en
metrics:
- accuracy
- f1
Model Card for PaperPub
Paper publication prediction based on English computer science abstracts.
Model Details
Model Description
PaperPub is a SciBERT (Beltagy et al. 2019) model fine-tuned to predict paper acceptance from computer science abstracts. Acceptance is modeled as a binary decision: accept or reject. The training and evaluation data are based on the arXiv subsection of PeerRead (Kang et al. 2018). Our main use case for PaperPub is to research how attribution scores derived from acceptance predictions can inform reflection on the content and writing quality of abstracts.
- Developed by: Semantic Computing Group, Bielefeld University, in particular Jan-Philipp Töberg, Christoph Düsing, Jonas Belouadi and Matthias Orlikowski
- Model type: BERT for binary classification
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: SciBERT
Model Sources
We will add a public demo of PaperPub as part of an application that uses attribution scores to highlight the words in an abstract that contribute to acceptance or rejection predictions.
- Repository: tba
- Demo: tba
Uses
PaperPub can only be meaningfully used in a research setting. The model should not be used for any consequential paper quality judgements.
Direct Use
The intended use case is research into how attribution scores computed from paper acceptance predictions reflect an abstract's content quality.
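To make the idea of per-word attribution scores concrete, here is a minimal sketch using leave-one-out occlusion. Occlusion is chosen purely for illustration; this card does not state which attribution method is used with PaperPub. The function `toy_accept_prob` is a hypothetical stand-in for a classifier and does not reflect PaperPub's behaviour.

```python
from typing import Callable, List, Tuple


def occlusion_attributions(
    tokens: List[str],
    predict_accept_prob: Callable[[str], float],
    mask_token: str = "[MASK]",
) -> List[Tuple[str, float]]:
    """Score each token by how much masking it changes the accept probability."""
    baseline = predict_accept_prob(" ".join(tokens))
    scores = []
    for i, token in enumerate(tokens):
        occluded = tokens[:i] + [mask_token] + tokens[i + 1:]
        # Positive score: the token pushed the prediction toward "accept".
        # Negative score: the token pushed it toward "reject".
        scores.append((token, baseline - predict_accept_prob(" ".join(occluded))))
    return scores


def toy_accept_prob(text: str) -> float:
    """Hypothetical scorer for demonstration only; NOT the PaperPub model."""
    words = text.lower().split()
    prob = 0.5 + 0.2 * words.count("novel") - 0.2 * words.count("simple")
    return min(max(prob, 0.0), 1.0)


attributions = occlusion_attributions(
    "We propose a novel but simple method".split(), toy_accept_prob
)
for token, score in attributions:
    print(f"{token:>8s} {score:+.2f}")
```

In a highlighting application, the sign of each score could decide the highlight colour and its magnitude the intensity. With a real model, `predict_accept_prob` would wrap the classifier's forward pass.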
Out-of-Scope Use
This model must not be used as part of any type of paper quality judgement, and in particular not in a peer review process. PaperPub is explicitly not meant to automate paper acceptance decisions.
Bias, Risks, and Limitations
Bias, risks, and limitations mainly stem from the dataset used. In addition to the limitations that apply to the SciBERT pre-training corpus, our training data represents only a very specific subset of papers. PaperPub was trained in a hackathon-like setting, so performance is not optimized and was not our main goal.
Recommendations
Users should be aware that the dataset used for fine-tuning (computer science arXiv preprints from a specific period) represents a very specific idea of what papers, and in particular papers fit for publication, look like.
How to Get Started with the Model
tba
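Pending official instructions, the following is a minimal sketch of how a fine-tuned sequence-classification checkpoint can be loaded with the `transformers` library and applied to one abstract. It rests on several assumptions: `model_id` is a placeholder argument because the repository is not yet published, the 512-token truncation is the usual BERT limit rather than a documented PaperPub setting, and the label names come from whatever `id2label` mapping the checkpoint config provides.

```python
import math
from typing import Dict, List, Tuple


def decode_logits(logits: List[float], id2label: Dict[int, str]) -> Tuple[str, float]:
    """Turn raw class logits into (label, probability) via a softmax."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def predict_acceptance(abstract: str, model_id: str) -> Tuple[str, float]:
    """Classify one abstract; `model_id` is a placeholder for the checkpoint location."""
    # Imported lazily so decode_logits stays usable without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Assumption: truncate to 512 tokens, the standard BERT input limit.
    inputs = tokenizer(abstract, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # The label mapping is read from the checkpoint config, not assumed here.
    return decode_logits(logits, model.config.id2label)
```

Calling `predict_acceptance(abstract_text, model_id)` returns the predicted label together with its softmax probability. In line with the usage restrictions stated in this card, the output is meant for research on attribution scores, not for judging papers.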
Training Details
Training Data
Custom stratified split of the arXiv subsection of PeerRead (Kang et al. 2018). We use the data from the PeerRead GitHub repository, not the Hugging Face Hub version.
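For readers unfamiliar with stratified splitting, the sketch below shows the general technique: each class is shuffled and divided separately so that both partitions keep the same class ratio. The test fraction, the seed, and the toy labels are illustrative assumptions; the actual split sizes and procedure used for PaperPub are not documented here.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[int], test_fraction: float, seed: int
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) that preserve the class ratio."""
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train: List[int] = []
    test: List[int] = []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy data: 75 examples of class 0 and 25 of class 1 (illustrative only).
labels = [0] * 75 + [1] * 25
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=0)
print(len(train_idx), len(test_idx))
```

Because each class contributes the same fraction, the minority class stays equally represented in both partitions, which matters when one class clearly dominates.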
Training Procedure
Training Hyperparameters
- Training regime: bf16 mixed precision
- Epochs: 2
- Initial Learning Rate: 2^-5
Evaluation
Testing Data, Factors & Metrics
Testing Data
Custom stratified split of the arXiv subsection of PeerRead (Kang et al. 2018). We use the data from the PeerRead GitHub repository, not the Hugging Face Hub version.
Factors
Models: we compare PaperPub to a naive most-frequent-class baseline.
Metrics
Accuracy, Macro F1
Results
- Majority Baseline
  - Accuracy: 0.75
  - Macro F1: 0.43
- PaperPub
  - Accuracy: 0.82
  - Macro F1: 0.76
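The gap between the baseline's accuracy (0.75) and its macro F1 (0.43) follows directly from how macro F1 is defined: the never-predicted minority class gets an F1 of 0, which halves the average. The sketch below reproduces both figures on a toy label set. The 75/25 class ratio is an assumption chosen to match the reported baseline accuracy; it is not the actual test set.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, positive) -> float:
    """F1 score treating `positive` as the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: F1 is zero by convention
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred) -> float:
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy labels: 75% majority class (assumed ratio), baseline always predicts it.
y_true = [0] * 75 + [1] * 25
y_pred = [0] * 100

print(round(accuracy(y_true, y_pred), 2))  # 0.75
print(round(macro_f1(y_true, y_pred), 2))  # 0.43
```

The majority class reaches an F1 of about 0.86 (precision 0.75, recall 1.0), and averaging with 0 for the minority class gives about 0.43. This is why macro F1 is reported alongside accuracy: it exposes models that ignore the minority class.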
Environmental Impact
- Hardware Type: 1xA40
- Hours used: 0.3
- Cloud Provider: Private Infrastructure
- Compute Region: Europe
Technical Specifications
Compute Infrastructure
We use an internal SLURM cluster with A40 GPUs.
Citation
tba