	---
	language: en
	tags:
	- bert
	- regression
	- biencoder
	- similarity
	pipeline_tag: text-similarity
	---

	# BiEncoder Regression Model

	This model uses a BiEncoder architecture to output a similarity score for each pair of texts.

	## Model Details
	- Base Model: bert-base-uncased
	- Task: Regression
	- Architecture: BiEncoder with cosine similarity
	- Loss Function: Mean Absolute Error (MAE)
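
As a rough illustration of what "BiEncoder with cosine similarity" trained with an MAE loss amounts to, the sketch below scores pairs of already-computed embeddings and measures the error against target scores. It is not the model's actual code: the embedding vectors, the target values, and the helper names `cosine_similarity` and `mae_loss` are hypothetical, and plain Python lists stand in for the BERT outputs (how the real model pools token embeddings is not stated in this card).

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mae_loss(predictions, targets):
    """Mean absolute error between predicted and target scores."""
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)

# Hypothetical embeddings standing in for the two encoder outputs,
# one row per text in the batch.
emb1 = [[1.0, 0.0], [1.0, 1.0]]
emb2 = [[1.0, 0.0], [1.0, -1.0]]
targets = [1.0, 0.5]  # made-up gold similarity scores

# Each pair is scored independently from its two embeddings.
scores = [cosine_similarity(u, v) for u, v in zip(emb1, emb2)]
loss = mae_loss(scores, targets)
```

Because each text is embedded on its own, embeddings can in principle be computed once and reused across many comparisons, which is the usual motivation for a bi-encoder design.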

	## Usage

	```python
	import torch
	from transformers import AutoTokenizer, AutoModel
	# BiEncoderModelRegression must be importable from a local modeling.py
	from modeling import BiEncoderModelRegression

	# Load model components
	tokenizer = AutoTokenizer.from_pretrained("minoosh/bert-reg-biencoder-mae")
	base_model = AutoModel.from_pretrained("bert-base-uncased")
	model = BiEncoderModelRegression(base_model, loss_fn="mae")

	# Load weights from a local pytorch_model.bin
	state_dict = torch.load("pytorch_model.bin", map_location="cpu")
	model.load_state_dict(state_dict)
	model.eval()  # disable dropout for inference

	# Prepare inputs
	texts1 = ["first text"]
	texts2 = ["second text"]
	inputs = tokenizer(
	    texts1, texts2,
	    padding=True,
	    truncation=True,
	    return_tensors="pt"
	)

	# Get similarity scores (no gradients needed at inference time)
	with torch.no_grad():
	    outputs = model(**inputs)
	similarity_scores = outputs["logits"]
	```

	## Metrics
	The model was trained with MAE loss and evaluated using:
	- Mean Squared Error (MSE)
	- Mean Absolute Error (MAE)
	- Pearson Correlation
	- Spearman Correlation
	- Cosine Similarity
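
For readers who want to reproduce this kind of evaluation, here is a minimal sketch of the first four metrics in plain Python. The function names and the `pred` / `gold` values are made up for illustration and are not results from this model; in practice a library such as SciPy or scikit-learn would normally be used instead.

```python
import math

def mse(pred, gold):
    """Mean squared error."""
    return sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(gold)

def mae(pred, gold):
    """Mean absolute error."""
    return sum(abs(p - g) for p, g in zip(pred, gold)) / len(gold)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up predicted and gold similarity scores.
pred = [0.1, 0.4, 0.35, 0.8]
gold = [0.0, 0.5, 0.3, 1.0]

results = {
    "mse": mse(pred, gold),
    "mae": mae(pred, gold),
    "pearson": pearson(pred, gold),
    "spearman": spearman(pred, gold),
}
```

MSE and MAE measure how far the predicted scores are from the gold scores, while the two correlations only measure how well the predictions track the gold values linearly (Pearson) or by ranking (Spearman).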