	---
	language: fr
	pipeline_tag: "token-classification"
	widget:
	- text: "je voudrais réserver une chambre à paris pour demain et lundi"
	- text: "d'accord pour l'hôtel à quatre vingt dix euros la nuit"
	- text: "deux nuits s'il vous plait"
	- text: "dans un hôtel avec piscine à marseille"
	tags:
	- bert
	- flaubert
	- natural language understanding
	- NLU
	- spoken language understanding
	- SLU
	- understanding
	- MEDIA
	---

	# vpelloin/MEDIA_NLU_flaubert_finetuned (FT)

	This is a Natural Language Understanding (NLU) model for the French [MEDIA benchmark](https://catalogue.elra.info/en-us/repository/browse/ELRA-S0272/).
	It maps each input word to an output concept tag (76 tags available).

	This model is a fine-tuned version of [`flaubert-oral-ft`](https://huggingface.co/nherve/flaubert-oral-ft) (FlauBERT fine-tuned on ASR data).


	## Usage with Pipeline
	```python
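	from transformers import pipeline

	# Build a token-classification pipeline from the fine-tuned checkpoint.
	generator = pipeline(model="vpelloin/MEDIA_NLU_flaubert_finetuned", task="token-classification")

	# Tag a sample sentence: the result is a list of per-token predictions.
	print(generator("je voudrais réserver une chambre à paris pour demain et lundi"))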
	```
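
Calling a token-classification pipeline on a sentence returns a list of dicts, one per token, with keys such as `word`, `entity` and `score`. As a minimal sketch, the hypothetical helper `format_predictions` below turns such a list into a compact `word/TAG` string. The sample predictions and tag names are illustrative placeholders, not real output from this model.

```python
def format_predictions(predictions, min_score=0.0):
    """Render token-classification pipeline output as 'word/TAG' pairs."""
    # Keep only predictions whose confidence reaches the threshold.
    kept = [p for p in predictions if p["score"] >= min_score]
    return " ".join(f"{p['word']}/{p['entity']}" for p in kept)


# Illustrative stand-in for pipeline output (hypothetical tags and scores).
sample = [
    {"word": "chambre", "entity": "B-objet", "score": 0.98},
    {"word": "à", "entity": "O", "score": 0.40},
    {"word": "paris", "entity": "B-ville", "score": 0.95},
]

print(format_predictions(sample))
print(format_predictions(sample, min_score=0.5))
```

With the real pipeline, the list returned for one sentence could be passed in place of `sample`.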

	## Usage with AutoTokenizer/AutoModel
	```python
	from transformers import (
	    AutoTokenizer,
	    AutoModelForTokenClassification,
	)

	# Load the tokenizer and the fine-tuned token-classification model.
	tokenizer = AutoTokenizer.from_pretrained("vpelloin/MEDIA_NLU_flaubert_finetuned")
	model = AutoModelForTokenClassification.from_pretrained("vpelloin/MEDIA_NLU_flaubert_finetuned")

	sentences = [
	    "je voudrais réserver une chambre à paris pour demain et lundi",
	    "d'accord pour l'hôtel à quatre vingt dix euros la nuit",
	    "deux nuits s'il vous plait",
	    "dans un hôtel avec piscine à marseille",
	]
	# Tokenize the sentences as one padded batch of PyTorch tensors.
	inputs = tokenizer(sentences, padding=True, return_tensors='pt')

	# Logits have shape (batch, sequence_length, num_labels).
	outputs = model(**inputs).logits

	# Take the best-scoring label id for each token and map it to its concept tag.
	print([[model.config.id2label[i] for i in b] for b in outputs.argmax(dim=-1).tolist()])
	```
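
Because the batch is padded, the argmax above also yields a label for every padding position. One way to drop those, sketched here under the assumption that you keep the tokenizer's `attention_mask` (1 for real tokens, 0 for padding), is to filter each row with its mask. The helper name `labels_without_padding` and the toy ids and label names are hypothetical.

```python
def labels_without_padding(pred_ids, attention_mask, id2label):
    """Map predicted ids to label names, skipping padded positions."""
    return [
        [id2label[i] for i, keep in zip(row, mask) if keep]
        for row, mask in zip(pred_ids, attention_mask)
    ]


# Toy values standing in for model output (hypothetical label names).
id2label = {0: "O", 1: "B-ville", 2: "B-date"}
pred_ids = [[0, 1, 0, 2], [0, 2, 0, 0]]
attention_mask = [[1, 1, 1, 1], [1, 1, 0, 0]]

print(labels_without_padding(pred_ids, attention_mask, id2label))
```

With the real model, `pred_ids` would be the logits' `argmax(dim=-1).tolist()`, `attention_mask` would be `inputs["attention_mask"].tolist()`, and `id2label` would be `model.config.id2label`. Any special tokens added by the tokenizer are still included, since the mask only marks padding.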