	---
	tags:
	- generated_from_keras_callback
	model-index:
	- name: GeoBERT
	results: []
	---

	<!-- This model card has been generated automatically according to the information Keras had access to. You should
	probably proofread and complete it, then remove this comment. -->

	# GeoBERT

	GeoBERT is a named-entity recognition (NER) model fine-tuned from SciBERT on the Labeled Geoscientific Corpus dataset (~1 million sentences).


	## Intended uses

	The model is intended to identify four main semantic types (categories) of entities in geoscience text:

	1. GeoPetro: entities that are general geoscience terms
	2. GeoMeth: tools or methods associated with the geosciences
	3. GeoLoc: geological locations
	4. GeoTime: entities on the geological time scale
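
To make the four categories concrete, here is a minimal sketch of how per-token tags could be merged into entity spans. It assumes the labels follow a BIO scheme such as `B-GeoLoc` / `I-GeoLoc` / `O`, which this card does not state; the actual label names can be checked in `model.config.id2label`. The helper `group_entities` and the hand-tagged sentence are hypothetical and are not model output.

```python
from typing import List, Tuple

# The four categories listed in this model card.
CATEGORIES = {"GeoPetro", "GeoMeth", "GeoLoc", "GeoTime"}


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge BIO-tagged tokens into (category, text) spans.

    Assumes tags look like "B-GeoLoc", "I-GeoLoc" or "O". This scheme is an
    assumption, not something the model card confirms.
    """
    spans: List[Tuple[str, List[str]]] = []
    current_type = None
    for token, tag in zip(tokens, tags):
        prefix, _, category = tag.partition("-")
        if prefix == "B" and category in CATEGORIES:
            # A "B-" tag always opens a new span.
            spans.append((category, [token]))
            current_type = category
        elif prefix == "I" and category == current_type:
            # An "I-" tag extends the open span of the same category.
            spans[-1][1].append(token)
        else:
            # "O" or an inconsistent tag closes any open span.
            current_type = None
    return [(category, " ".join(words)) for category, words in spans]


# Hypothetical, hand-tagged example sentence (illustrative only).
tokens = ["Jurassic", "sandstone", "in", "the", "North", "Sea"]
tags = ["B-GeoTime", "B-GeoPetro", "O", "O", "B-GeoLoc", "I-GeoLoc"]
print(group_entities(tokens, tags))
```

Multi-token entities such as a two-word location come out as a single span, which is usually the form downstream code wants.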


	### Training hyperparameters

	The following hyperparameters were used during training:
	- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 14000, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
	- training_precision: mixed_float16
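
The `PolynomialDecay` settings above (`power` 1.0, `cycle` False) describe a learning rate that falls linearly from 2e-05 to 0.0 over 14000 steps and then stays at the end value. The sketch below reproduces that schedule in plain Python for illustration; the function name `polynomial_decay_lr` is hypothetical, and it is not the training code used for this model.

```python
def polynomial_decay_lr(step: int,
                        initial_lr: float = 2e-05,
                        end_lr: float = 0.0,
                        decay_steps: int = 14000,
                        power: float = 1.0) -> float:
    """Learning rate at `step` for a non-cycling polynomial decay schedule."""
    # cycle=False: once decay_steps is reached, the rate stays at end_lr.
    step = min(step, decay_steps)
    fraction_left = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction_left ** power + end_lr


# Print the rate at the start, a quarter, the midpoint and the end of decay.
for step in (0, 3500, 7000, 14000):
    print(step, polynomial_decay_lr(step))
```

With these values the rate at the midpoint (step 7000) is half the initial rate, 1e-05.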


	### Framework versions

	- Transformers 4.22.1
	- TensorFlow 2.10.0
	- Datasets 2.4.0
	- Tokenizers 0.12.1

	## Model performance (metric: seqeval)

	| Entity | Precision | Recall | F1 |
	|-|-|-|-|
	| GeoLoc | 0.9727 | 0.9591 | 0.9658 |
	| GeoMeth | 0.9433 | 0.9447 | 0.9445 |
	| GeoPetro | 0.9767 | 0.9745 | 0.9756 |
	| GeoTime | 0.9695 | 0.9666 | 0.9680 |
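
seqeval scores predictions at the entity level: a predicted entity counts as correct only when both its type and its boundaries match the gold annotation. The following is a simplified illustration of that idea, not seqeval's implementation. The `Span` representation, the function `span_scores`, and the toy data are all hypothetical.

```python
from typing import Set, Tuple

# (entity type, start token index, end token index)
Span = Tuple[str, int, int]


def span_scores(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1 over exact span matches."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy data, not taken from the GeoBERT evaluation set.
gold = {("GeoLoc", 4, 5), ("GeoTime", 0, 0), ("GeoPetro", 1, 1)}
predicted = {("GeoLoc", 4, 5), ("GeoTime", 0, 0), ("GeoMeth", 1, 1)}
precision, recall, f1 = span_scores(gold, predicted)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

In the toy data the third prediction has the right boundaries but the wrong type, so it counts as both a false positive and a missed entity.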

	## How to use GeoBERT with Hugging Face

	##### Load GeoBERT and its subword tokenizer:

	```python
	from transformers import AutoTokenizer, AutoModelForTokenClassification

	tokenizer = AutoTokenizer.from_pretrained("botryan96/GeoBERT")
	model = AutoModelForTokenClassification.from_pretrained("botryan96/GeoBERT")
	```
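
Once the model and tokenizer are loaded, one way to run inference is the Transformers `token-classification` pipeline. The sketch below is one possible setup, not a documented recipe for this model. The helpers `extract_entities` and `format_entities` are hypothetical names, and the entity labels returned depend on the model's own configuration.

```python
from typing import Dict, List


def format_entities(results: List[Dict]) -> List[str]:
    """Turn aggregated pipeline output into "TYPE: text (score)" strings."""
    return [
        f"{r['entity_group']}: {r['word']} ({float(r['score']):.2f})"
        for r in results
    ]


def extract_entities(text: str, model_id: str = "botryan96/GeoBERT") -> List[Dict]:
    """Run token classification on `text` and return merged entity spans."""
    # Imported here so the formatting helper above works without transformers.
    from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                              pipeline)

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    # aggregation_strategy="simple" merges sub-word pieces into whole entities
    # and reports them under the "entity_group" key.
    ner = pipeline("token-classification", model=model, tokenizer=tokenizer,
                   aggregation_strategy="simple")
    return ner(text)
```

Calling `format_entities(extract_entities("Jurassic sandstone in the North Sea"))` downloads the model on first use and returns one line per detected entity; the sentence is only an illustrative input.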