
	---
	license: mit
	language:
	- en
	metrics:
	- accuracy
	- f1
	- recall
	- precision
	library_name: transformers
	pipeline_tag: text-generation
	---

	# FLAN-T5 small-GeoNames

	This model is a fine-tuned version of [flan-t5-small](https://huggingface.co/google/flan-t5-small) on the GeoNames dataset.

	## Model description

	The model is trained to classify a term into one of 660 classes (types) of geographical locations.

	The model also works well as part of a Retrieval-Augmented Generation (RAG) pipeline that draws on an external knowledge source, specifically [GeoNames Semantic Primes](https://huggingface.co/datasets/HannaAbiAkl/geonames-semantic-primes).

	## Intended uses and limitations

	This model is intended to be used to generate a type (class) for an input term.

	## Training and evaluation data

	The training and evaluation data can be found [here](https://github.com/HamedBabaei/LLMs4OL-Challenge-ISWC2024/tree/main/TaskA-Term%20Typing/SubTask%20A.2%20(FS)%20-%20GeoNames).

	The train size is 8078865.

	The test size is 702510.
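The card's metadata lists accuracy, F1, recall, and precision as metrics, but does not say how they are computed. The following is a minimal sketch of one way to score generated type strings against gold labels. It assumes exact string match after lowercasing and whitespace normalization, and macro averaging over the classes present in the references; both choices, and the helper names `normalize` and `score`, are assumptions and not taken from the card.

```python
from collections import Counter


def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so 'Mountain ' matches 'mountain'."""
    return " ".join(label.lower().split())


def score(predictions, references):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: a prediction is correct only on an exact normalized match.
    """
    preds = [normalize(p) for p in predictions]
    golds = [normalize(r) for r in references]
    if not golds or len(preds) != len(golds):
        raise ValueError("predictions and references must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for p, g in zip(preds, golds):
        if p == g:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[g] += 1  # gold class g was missed

    labels = set(golds)  # macro average over classes seen in the references
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(golds),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

With 660 classes, macro averaging weights rare types as heavily as frequent ones; micro averaging would be a reasonable alternative if overall hit rate matters more.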

	## Example

	Here are a few examples of the model's input and output:

	- input:
	  - Lexical Term L: Pic de Font Blanca
	- output:
	  - Type: peak

	- input:
	  - Lexical Term L: Roc Mele
	- output:
	  - Type: mountain

	- input:
	  - Lexical Term L: Estany de les Abelletes
	- output:
	  - Type: lake
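The examples above could be reproduced with the `transformers` library roughly as follows. This is a sketch under stated assumptions, not the author's inference code: the repository ID `HannaAbiAkl/flan-t5-small-geonames` is assumed from this card's location on the Hub, the prompt format `Lexical Term L: <term>` is assumed from the examples, and the helper names and the `max_new_tokens` value are illustrative.

```python
def build_prompt(term: str) -> str:
    """Format a term as shown in the card's examples (assumed prompt format)."""
    return f"Lexical Term L: {term.strip()}"


def parse_type(generated: str) -> str:
    """Drop an optional 'Type:' prefix from the generated text."""
    text = generated.strip()
    if text.lower().startswith("type:"):
        text = text[len("type:"):].strip()
    return text


def predict_type(term: str, model_id: str = "HannaAbiAkl/flan-t5-small-geonames") -> str:
    """Generate a type for one term. Downloads the checkpoint, so it needs network access."""
    # Imported lazily so the pure helpers above work without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)  # model_id is an assumption
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(term), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=16)  # illustrative limit
    return parse_type(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Calling `predict_type("Pic de Font Blanca")` would download the checkpoint on first use and return the generated type string.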

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 1e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 42
	- num_epochs: 5
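For readers who want to set up a comparable run, the values above map onto `Seq2SeqTrainingArguments` roughly as sketched below. The card does not say which trainer class was used, so this mapping is an assumption; in particular, it treats the listed batch sizes as per-device sizes on a single device, and the `output_dir` value is hypothetical.

```python
HYPERPARAMETERS = {
    "learning_rate": 1e-5,
    "train_batch_size": 4,
    "eval_batch_size": 4,
    "seed": 42,
    "num_epochs": 5,
}


def to_training_kwargs(hp: dict, output_dir: str = "flan-t5-small-geonames") -> dict:
    """Map the card's hyperparameter names onto Seq2SeqTrainingArguments keywords."""
    return {
        "output_dir": output_dir,  # hypothetical output path
        "learning_rate": hp["learning_rate"],
        # Assumes one device, so per-device size equals the listed batch size.
        "per_device_train_batch_size": hp["train_batch_size"],
        "per_device_eval_batch_size": hp["eval_batch_size"],
        "seed": hp["seed"],
        "num_train_epochs": hp["num_epochs"],
    }


def make_training_args(hp: dict = HYPERPARAMETERS):
    """Build the arguments object; requires transformers and a PyTorch backend."""
    from transformers import Seq2SeqTrainingArguments  # lazy import

    return Seq2SeqTrainingArguments(**to_training_kwargs(hp))
```

Any setting not listed in the card (optimizer, scheduler, warmup, and so on) is left at the library default here, which may differ from the original run.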

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 2.6223 | 1.0 | 1000 | 1.5223 |
	| 2.1430 | 2.0 | 2000 | 1.3764 |
	| 1.9100 | 3.0 | 3000 | 1.2825 |
	| 1.7642 | 4.0 | 4000 | 1.2102 |
	| 1.6607 | 5.0 | 5000 | 1.1488 |

	## Citation

	```
	@misc{akl2024dstillms4ol2024task,
	title={DSTI at LLMs4OL 2024 Task A: Intrinsic versus extrinsic knowledge for type classification},
	author={Hanna Abi Akl},
	year={2024},
	eprint={2408.14236},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2408.14236},
	}
	```