arxyzan committed on
Commit 5565053 · 1 Parent(s): 0f2eda9

Create README.md

Files changed (1)
  1. README.md +25 -0
README.md ADDED
@@ -0,0 +1,25 @@
+ ---
+ language:
+ - fa
+ pipeline_tag: feature-extraction
+ ---
+ This is the original fastText embedding model for Persian from the [fastText crawl vectors page](https://fasttext.cc/docs/en/crawl-vectors.html#models), loaded and converted using Gensim and exported to a Hezar-compatible format.
+ For more info, see the [fastText support page](https://fasttext.cc/docs/en/support.html).
+
+ To use this model in Hezar, install the package and then load the model:
+ ```bash
+ pip install hezar
+ ```
+ ```python
+ from hezar import Embedding
+
+ fasttext = Embedding.load("hezarai/fasttext-fa-300")
+ # Get the embedding vector for a word ("هزار" means "thousand")
+ vector = fasttext("هزار")
+ # Find the word that doesn't match the rest ("house", "room", "car")
+ doesnt_match = fasttext.doesnt_match(["خانه", "اتاق", "ماشین"])
+ # Find the top-n most similar words to the given word
+ most_similar = fasttext.most_similar("هزار", top_n=5)
+ # Compute the cosine similarity between two words ("engineer", "doctor")
+ similarity = fasttext.similarity("مهندس", "دکتر")
+ ```
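
For readers unfamiliar with what `similarity`, `most_similar`, and `doesnt_match` compute, here is a minimal sketch in plain Python. It is not part of the commit and does not use Hezar or the real 300-dimensional fastText vectors: the toy 3-dimensional vocabulary, the helper names, and the mean-vector rule for picking the odd word out are all assumptions made for illustration, and Hezar's actual implementation may differ.

```python
import math

# Made-up 3-dimensional vectors for illustration only.
# The real model maps Persian words to 300-dimensional vectors.
TOY_VECTORS = {
    "house": [0.9, 0.1, 0.0],
    "room": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "engineer": [0.2, 0.1, 0.9],
    "doctor": [0.3, 0.2, 0.8],
}


def cosine_similarity(u, v):
    """Cosine of the angle between u and v: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, top_n=5):
    """Rank every other word by cosine similarity to `word`, highest first."""
    query = vectors[word]
    scores = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


def doesnt_match(words, vectors):
    """Return the word least similar to the mean of the unit-length vectors.

    Assumption: this mean-vector rule is one common way to pick the
    odd word out; it is not taken from Hezar's source code.
    """
    unit = {}
    for w in words:
        norm = math.sqrt(sum(a * a for a in vectors[w]))
        unit[w] = [a / norm for a in vectors[w]]
    mean = [sum(column) / len(words) for column in zip(*unit.values())]
    return min(words, key=lambda w: cosine_similarity(unit[w], mean))
```

With these toy vectors, `doesnt_match(["house", "room", "car"], TOY_VECTORS)` picks `"car"`, and `most_similar("house", TOY_VECTORS, top_n=1)` ranks `"room"` first. The results depend entirely on the made-up numbers and say nothing about the real model's output.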