|
# BioBERT-NLI
|
|
|
This is the model [BioBERT](https://github.com/dmis-lab/biobert) [1] fine-tuned on the [SNLI](https://nlp.stanford.edu/projects/snli/) and [MultiNLI](https://www.nyu.edu/projects/bowman/multinli/) datasets using the [`sentence-transformers` library](https://github.com/UKPLab/sentence-transformers/) to produce universal sentence embeddings [2].
|
|
|
The model uses the original BERT wordpiece vocabulary and was trained using the **average pooling strategy** and a **softmax loss**.
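For reference, here is a minimal usage sketch with the `sentence-transformers` library; the Hub model id below is an assumption, so substitute the path under which this model is actually hosted:

```python
from sentence_transformers import SentenceTransformer

# Hypothetical Hub id for this model; substitute the actual path.
model = SentenceTransformer("gsarti/biobert-nli")

sentences = [
    "Aspirin irreversibly inhibits platelet cyclooxygenase.",
    "ACE2 is the entry receptor used by SARS-CoV-2.",
]

# encode() runs the BERT encoder and averages the wordpiece token
# embeddings (the pooling strategy described above), returning one
# fixed-size vector per sentence.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768) for a BERT-base encoder
```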
|
|
|
**Base model**: `monologg/biobert_v1.1_pubmed`, loaded with HuggingFace's `AutoModel`.
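The base encoder can be loaded directly with the generic `Auto*` classes, e.g.:

```python
from transformers import AutoModel, AutoTokenizer

# Load the base BioBERT checkpoint used as the starting point for fine-tuning.
tokenizer = AutoTokenizer.from_pretrained("monologg/biobert_v1.1_pubmed")
encoder = AutoModel.from_pretrained("monologg/biobert_v1.1_pubmed")
```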
|
|
|
**Training time**: ~6 hours on an NVIDIA Tesla P100 GPU, as provided by Kaggle Notebooks.
|
|
|
**Parameters**:
|
|
|
| Parameter        | Value |
|------------------|-------|
| Batch size       | 64    |
| Training steps   | 30000 |
| Warmup steps     | 1450  |
| Lowercasing      | False |
| Max. seq. length | 128   |
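As an illustration only, not the exact training script, a run with the parameters above could be set up along these lines with the `sentence-transformers` fit API (the single `InputExample` is a placeholder for the full SNLI + MultiNLI data):

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, losses, InputExample

# Wrap the base BioBERT checkpoint with a mean-pooling head.
word_embedding_model = models.Transformer(
    "monologg/biobert_v1.1_pubmed", max_seq_length=128, do_lower_case=False
)
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,
)
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

# Placeholder example; the real run uses SNLI + MultiNLI sentence pairs
# with labels in {contradiction: 0, entailment: 1, neutral: 2}.
train_examples = [
    InputExample(texts=["A man inspects a uniform.", "The man is sleeping."], label=0),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=64)

# Softmax (classification) loss over the three NLI labels.
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    steps_per_epoch=30000,
    warmup_steps=1450,
)
```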
|
|
|
**Performance**: The model was evaluated on the test portion of the [STS benchmark](http://ixa2.si.ehu.es/stswiki/index.php/STSbenchmark) using Spearman rank correlation, and compared with a general-domain BERT base model fine-tuned with the same procedure to verify that the two perform comparably.
|
|
|
| Model                           | Score (Spearman) |
|---------------------------------|------------------|
| `biobert-nli` (this)            | 73.40            |
| `gsarti/scibert-nli`            | 74.50            |
| `bert-base-nli-mean-tokens` [3] | 77.12            |
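Here is a sketch of the evaluation described above, using `sentence-transformers`' `EmbeddingSimilarityEvaluator`, which reports the Spearman rank correlation between embedding cosine similarities and gold scores; the three pairs below stand in for the full STS benchmark test split, and the model id is again an assumption:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("gsarti/biobert-nli")  # hypothetical Hub id

# Placeholder sentence pairs with gold similarity scores in [0, 1];
# the real evaluation uses the full STS benchmark test set.
sentences1 = [
    "A man is playing a guitar.",
    "A woman is slicing an onion.",
    "The cat sits on the mat.",
]
sentences2 = [
    "Someone plays an instrument.",
    "A woman is cutting a vegetable.",
    "A dog runs across a field.",
]
gold_scores = [0.8, 0.9, 0.1]

evaluator = EmbeddingSimilarityEvaluator(sentences1, sentences2, gold_scores)
score = evaluator(model)  # Spearman correlation of cosine similarities
print("Spearman:", score)
```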
|
|
|
An example usage for similarity-based scientific paper retrieval is provided in the [Covid Papers Browser](https://github.com/gsarti/covid-papers-browser) repository.
|
|
|
**References:**
|
|
|
[1] J. Lee et al., [BioBERT: a pre-trained biomedical language representation model for biomedical text mining](https://academic.oup.com/bioinformatics/article/36/4/1234/5566506)
|
|
|
[2] A. Conneau et al., [Supervised Learning of Universal Sentence Representations from Natural Language Inference Data](https://www.aclweb.org/anthology/D17-1070/)
|
|
|
[3] N. Reimers and I. Gurevych, [Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks](https://www.aclweb.org/anthology/D19-1410/)
|
|