---
license: mit
language:
- en
metrics:
- accuracy
- f1
- recall
- precision
library_name: transformers
pipeline_tag: text-generation
---

# FLAN-T5 small-WordNet

This model is a fine-tuned version of [flan-t5-small](https://huggingface.co/google/flan-t5-small) on the WordNet dataset.

## Model description

The model is trained to classify terms into one of four term types: noun, verb, adjective or adverb.
The model learns the types during fine-tuning and generates them as text, assigning at most one type to each term.
The model also works well as part of a Retrieval-Augmented Generation (RAG) pipeline by leveraging an external knowledge source, specifically [WordNet Semantic Primes](https://huggingface.co/datasets/HannaAbiAkl/wordnet-semantic-primes).

## Intended uses and limitations

This model is intended to be used to generate a type (class) for an input term.

## Training and evaluation data

The training and evaluation data can be found [here](https://github.com/HamedBabaei/LLMs4OL-Challenge-ISWC2024/tree/main/TaskA-Term%20Typing/SubTask%20A.1(FS)%20-%20WordNet).
The training set size is 40559 and the test set size is 9470.

## Example

The following examples illustrate the model's capabilities:
- **input:**
  - *Lexical Term L:* question
  - *Sentence Containing L (Optional):* there was a question about my training
- **output:**
  - *Type:* noun
- **input:**
  - *Lexical Term L:* lodge
  - *Sentence Containing L (Optional):* Where are you lodging in Paris?
- **output:**
  - *Type:* verb
- **input:**
  - *Lexical Term L:* genus equisetum
  - *Sentence Containing L (Optional):*
- **output:**
  - *Type:* noun

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- num_epochs: 5

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.1725        | 1.0   | 1000 | 0.0640          |
| 0.1250        | 2.0   | 2000 | 0.0535          |
| 0.1040        | 3.0   | 3000 | 0.0469          |
| 0.0917        | 4.0   | 4000 | 0.0421          |
| 0.0830        | 5.0   | 5000 | 0.0384          |

```
@misc{akl2024dstillms4ol2024task,
      title={DSTI at LLMs4OL 2024 Task A: Intrinsic versus extrinsic knowledge for type classification},
      author={Hanna Abi Akl},
      year={2024},
      eprint={2408.14236},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.14236},
}
```
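
The input and output fields shown in the example section can be exercised with a short script. The following is a minimal sketch, not the author's inference code: `MODEL_ID` is a hypothetical placeholder for this model's Hub repository id, and the prompt template in `build_prompt` is an assumption that simply mirrors the example fields, since the exact prompt used during fine-tuning is not stated in this card. The helper names `build_prompt`, `normalize_type` and `predict_type` are illustrative.

```python
from typing import Optional

# Hypothetical placeholder: replace with the actual Hub repository id of this model.
MODEL_ID = "<hub-user>/<model-repo>"

# The four term types the model is described as generating.
TERM_TYPES = ("noun", "verb", "adjective", "adverb")


def build_prompt(term: str, sentence: Optional[str] = None) -> str:
    """Format a term and an optional context sentence as one input string.

    Assumption: this template mirrors the fields of the example section;
    the prompt actually used for fine-tuning may differ.
    """
    prompt = f"Lexical Term L: {term.strip()}"
    if sentence and sentence.strip():
        prompt += f"\nSentence Containing L: {sentence.strip()}"
    return prompt + "\nType:"


def normalize_type(generated: str) -> Optional[str]:
    """Map raw generated text to one of the four term types, or None."""
    cleaned = generated.strip().lower()
    return cleaned if cleaned in TERM_TYPES else None


def predict_type(
    term: str,
    sentence: Optional[str] = None,
    model_id: str = MODEL_ID,
) -> Optional[str]:
    """Generate a type for a term with a seq2seq checkpoint.

    The model is reloaded on every call to keep the sketch short;
    load it once and reuse it when classifying many terms.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(term, sentence), return_tensors="pt")
    # A type is a single short word, so a few new tokens are enough.
    output_ids = model.generate(**inputs, max_new_tokens=5)
    text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return normalize_type(text)
```

With a valid repository id, `predict_type("lodge", "Where are you lodging in Paris?")` would return one of the four types, or `None` if the generated text falls outside them. Returning `None` rather than guessing keeps the constraint of at most one type per term explicit.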