	---
	license: mit
	datasets:
	- web_nlg
	language:
	- en
	---
	# Model card for Inria-CEDAR/FactSpotter-DeBERTaV3-Large

	## Model description

	This model accompanies the paper "FactSpotter: Evaluating the Factual Faithfulness of Graph-to-Text Generation".

	Given a triple in the format "subject \| predicate \| object" and a text, the model determines whether the triple is present in the text.

	Unlike the model in the paper, which is based on ELECTRA, this model is fine-tuned from DeBERTaV3.
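
The input format described above can be assembled with a small helper. This is a minimal sketch: `format_triple` and `make_pair` are hypothetical names that are not part of the released code, and putting the text first and the triple second is an assumption taken from the usage example in this card. The helper leaves casing untouched, since this card does not say whether lowercasing is required.

```python
def format_triple(subject, predicate, obj):
    """Join the three parts of a triple as "subject | predicate | object"."""
    return " | ".join(part.strip() for part in (subject, predicate, obj))


def make_pair(text, triple):
    """Build one (text, triple-string) pair: text first, triple second (assumed order)."""
    return (text, format_triple(*triple))


pair = make_pair(
    "the aarhus is the airport of aarhus, denmark",
    ("aarhus airport", "city served", "aarhus, denmark"),
)
print(pair)
```

A list of such pairs can then be passed to the tokenizer as a batch.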

	## How to use the model

	```
	import torch
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	def sentence_cls_score(input_strings, predicate_cls_model, predicate_cls_tokenizer):
	    """Return softmax class probabilities for a batch of (text, triple) pairs."""
	    # Run on whichever device the model is on instead of hard-coding CUDA.
	    device = next(predicate_cls_model.parameters()).device
	    tokenized_cls_input = predicate_cls_tokenizer(input_strings, truncation=True, padding=True,
	                                                  return_token_type_ids=True, return_tensors="pt")
	    input_ids = tokenized_cls_input["input_ids"].to(device)
	    token_type_ids = tokenized_cls_input["token_type_ids"].to(device)
	    attention_mask = tokenized_cls_input["attention_mask"].to(device)
	    # Inference only, so gradients are not needed.
	    with torch.no_grad():
	        prev_cls_output = predicate_cls_model(input_ids, attention_mask=attention_mask,
	                                              token_type_ids=token_type_ids)
	    softmax_cls_output = torch.softmax(prev_cls_output.logits, dim=1)
	    return softmax_cls_output

	tokenizer = AutoTokenizer.from_pretrained("Inria-CEDAR/FactSpotter-DeBERTaV3-Large")
	model = AutoModelForSequenceClassification.from_pretrained("Inria-CEDAR/FactSpotter-DeBERTaV3-Large")
	# Use a GPU when one is available; otherwise fall back to the CPU.
	model.to(torch.device("cuda" if torch.cuda.is_available() else "cpu"))

	# pairs of texts (as premises) and triples (as hypotheses)
	cls_texts = [("the aarhus is the airport of aarhus, denmark", "aarhus airport | city served | aarhus, denmark"),
	             ("aarhus airport is 25.0 metres above the sea level", "aarhus airport | elevation above the sea level | 1174")]
	cls_scores = sentence_cls_score(cls_texts, model, tokenizer)
	# Dimensions: 0-entailment, 1-neutral, 2-contradiction
	label_names = ["entailment", "neutral", "contradiction"]
	```
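
The snippet above defines `label_names` but stops before using it. One way to turn the scores into readable labels is sketched below. `scores_to_labels` is a hypothetical helper, the label order is taken from the comment in the snippet, and the probabilities are placeholders for illustration rather than real model output.

```python
label_names = ["entailment", "neutral", "contradiction"]


def scores_to_labels(score_rows, names=label_names):
    """Pick the highest-probability label for each row of class scores."""
    results = []
    for row in score_rows:
        best = max(range(len(row)), key=lambda i: row[i])
        results.append((names[best], row[best]))
    return results


# Placeholder probabilities for illustration only, not real model output.
example_scores = [[0.90, 0.07, 0.03], [0.05, 0.15, 0.80]]
for label, prob in scores_to_labels(example_scores):
    print(f"{label}: {prob:.2f}")
```

With the tensor returned by `sentence_cls_score`, `cls_scores.tolist()` produces rows in the shape this helper expects.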
	## Citation
	If you find the model useful, please cite the paper:

	```
	@inproceedings{zhang:hal-04257838,
	TITLE = {{FactSpotter: Evaluating the Factual Faithfulness of Graph-to-Text Generation}},
	AUTHOR = {Zhang, Kun and Balalau, Oana and Manolescu, Ioana},
	URL = {https://hal.science/hal-04257838},
	BOOKTITLE = {{Findings of EMNLP 2023 - Conference on Empirical Methods in Natural Language Processing}},
	ADDRESS = {Singapore, Singapore},
	YEAR = {2023},
	MONTH = Dec,
	KEYWORDS = {Graph-to-Text Generation ; Factual Faithfulness ; Constrained Text Generation},
	PDF = {https://hal.science/hal-04257838/file/_EMNLP_2023__Evaluating_the_Factual_Faithfulness_of_Graph_to_Text_Generation_Camera.pdf},
	HAL_ID = {hal-04257838},
	HAL_VERSION = {v1},
	}
	```

	## Questions
	If you have any questions, please contact me by email at [email protected] or [email protected].