StyleDistance/styledistance


AjayP13 committed on Sep 2, 2024

Commit fffe342 (verified) · 1 parent: d21a862

Update README.md

Files changed (1)

README.md +2 -0

@@ -51,6 +51,8 @@ language:
 StyleDistance is a **style embedding model** that aims to embed texts with similar writing styles close together and texts with different styles far apart, regardless of content. You may find this model useful for stylistic analysis of text, clustering, authorship identification and verification tasks, and automatic style transfer evaluation.
+## Training Data and Variants of StyleDistance
 StyleDistance was contrastively trained on [SynthSTEL](https://huggingface.co/datasets/StyleDistance/synthstel), a synthetically generated dataset of positive and negative examples of 40 style features being used in text. By utilizing this synthetic dataset, StyleDistance is able to achieve stronger content-independence than other style embedding models currently available. This particular model was trained using a combination of the synthetic dataset and a [real dataset that makes use of authorship datasets from Reddit to train style embeddings](https://aclanthology.org/2022.repl4nlp-1.26/). For a version that is purely trained on synthetic data, see this other version of [StyleDistance](https://huggingface.co/StyleDistance/styledistance_synthetic_only).
 ## Example Usage
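
The hunk ends before the README's own usage example. As a minimal sketch of the idea described above (similar styles embed close together, different styles far apart), the snippet below compares toy vectors with cosine similarity. The vectors and the names `formal_a`, `formal_b`, and `casual` are hypothetical stand-ins, not output from StyleDistance, and the README may use a different similarity function.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "style embeddings" (assumed for illustration only;
# real embeddings from a model would have many more dimensions).
formal_a = [0.9, 0.1, 0.0]  # a formal text
formal_b = [0.8, 0.2, 0.1]  # another formal text, different content
casual = [0.1, 0.9, 0.3]    # a casual text

same_style = cosine_similarity(formal_a, formal_b)
diff_style = cosine_similarity(formal_a, casual)

# A style embedding model is doing its job when texts that share a style
# score higher than texts that do not.
print(f"same style: {same_style:.3f}, different style: {diff_style:.3f}")
```

With real embeddings, the same comparison would be applied to the vectors the model returns for each text.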