NaverHustQA/viLegal_cross

coang committed 14 days ago

Commit 30d0306 · verified · 1 Parent(s): 7c9a77f

Create README.md

Files changed (1)

README.md +54 -0

README.md ADDED

	@@ -0,0 +1,54 @@

+---
+pipeline_tag: cross-encoder
+tags:
+- cross-encoder
+- sentence-similarity
+- transformers
+- legal
+- reranker
+library_name: generic
+language:
+- vi
+---
+# NaverHustQA/viLegal_cross_Quang
+This is a cross-encoder model for the Vietnamese legal domain. Given a query-context pair, it returns a relevance score and can be used for information retrieval.
+We use [vinai/phobert-base-v2](https://huggingface.co/vinai/phobert-base-v2) as the pre-trained backbone.
+<!--- Describe your model here -->
+## Usage (HuggingFace Transformers)
+The example below scores a single query-context pair. Word-segment both inputs first, as PhoBERT expects (note the underscore-joined tokens such as `lái_xe`):
+```python
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+import torch
+# Load cross-encoder
+model_name = "NaverHustQA/viLegal_cross_Quang"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+# Define query and context
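+# English: "How much is the fine for drinking alcohol and driving?"
+query = "Uống rượu lái_xe bị phạt bao_nhiêu tiền ?"
+# English: "Drinking alcohol and driving is fined 500,000 dong."
+context = "Uống rượu lái_xe bị phạt 500,000 đồng ."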
+# Tokenize input (Cross-encoder format: query and context as a single input)
+inputs = tokenizer(query, context, return_tensors="pt", padding=True, truncation=True)
+# Run through model
+with torch.no_grad():
+    outputs = model(**inputs)
+    score = outputs.logits.item()  # The single output logit is the relevance score for this pair
+print(f"Relevance Score: {score}")
+```
+## Training
+Full details of our training methods and datasets are given in our reports.
+## Authors
+Le Thanh Huong, Nguyen Nhat Quang.
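
The Usage snippet in the README scores one query-context pair. For retrieval, the same call would typically be applied to several candidate contexts, which are then sorted by score. The following is a minimal sketch of that idea, not the authors' pipeline. The helper names `load_scorer` and `rank_contexts` are hypothetical, and the sketch assumes that the model emits one logit per pair (as the `.item()` call in the README implies) and that a higher logit means a more relevant context.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer maps (query, contexts) to one float per context.
ScoreFn = Callable[[str, Sequence[str]], List[float]]


def load_scorer(model_name: str = "NaverHustQA/viLegal_cross_Quang") -> ScoreFn:
    """Hypothetical helper: load the cross-encoder and return a batch scorer."""
    # Imported lazily so rank_contexts can be used (and tested) without torch.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    def score(query: str, contexts: Sequence[str]) -> List[float]:
        # Pair the same query with every context and run one batched forward pass.
        inputs = tokenizer(
            [query] * len(contexts),
            list(contexts),
            return_tensors="pt",
            padding=True,
            truncation=True,
        )
        with torch.no_grad():
            logits = model(**inputs).logits
        # Assumption: logits has shape (batch, 1), i.e. one score per pair.
        return logits.squeeze(-1).tolist()

    return score


def rank_contexts(
    query: str, contexts: Sequence[str], score_fn: ScoreFn
) -> List[Tuple[str, float]]:
    """Return (context, score) pairs sorted from highest to lowest score."""
    scores = score_fn(query, contexts)
    # Assumption: a higher score means the context is more relevant.
    return sorted(zip(contexts, scores), key=lambda pair: pair[1], reverse=True)
```

Calling `rank_contexts(query, contexts, load_scorer())` with word-segmented strings returns the contexts ordered by the model's score. Passing the scorer as an argument keeps the sorting logic independent of the model, so it can be exercised with a stub scorer.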