---
tags:
- spacy
- floret
- token-classification
language:
- bg
license: mit
---

Bulgarian word vectors for a Bulgarian spaCy model.

The floret vectors were trained on the OSCAR 21.09 corpus and Bulgarian Wikipedia pages with the following hyperparameters: `floret cbow -dim 300 -mode floret -bucket 200000 -minn 4 -maxn 5 -minCount 20 -neg 10 -hashCount 2 -lr 0.05 -thread 8`
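
To make the subword settings in that command more concrete, here is a rough sketch of how `-minn 4 -maxn 5 -bucket 200000 -hashCount 2` could translate into table lookups: a word is wrapped in boundary markers, split into character n-grams, and each entry (the word and its n-grams) is hashed to two rows of a 200000-row table. The function names `subwords` and `rows_for` are hypothetical, and the hash is a standard-library stand-in rather than the hash floret actually uses, so the row indices are illustrative only.

```python
import hashlib

# Values taken from the training command above.
MINN, MAXN = 4, 5
BUCKET = 200_000
HASH_COUNT = 2


def subwords(word, minn=MINN, maxn=MAXN):
    """Return the marked word followed by its character n-grams."""
    marked = f"<{word}>"  # fastText-style word boundary markers
    grams = [marked]
    for n in range(minn, maxn + 1):
        for i in range(len(marked) - n + 1):
            gram = marked[i:i + n]
            if gram != marked:  # avoid listing the full word twice
                grams.append(gram)
    return grams


def rows_for(entry, bucket=BUCKET, hash_count=HASH_COUNT):
    """Map one entry to `hash_count` row indices in the table.

    ASSUMPTION: salted BLAKE2b is a stand-in for floret's own hash
    function, so these indices will not match the released vectors.
    """
    rows = []
    for seed in range(hash_count):
        digest = hashlib.blake2b(
            entry.encode("utf-8"), digest_size=8, salt=bytes([seed])
        ).digest()
        rows.append(int.from_bytes(digest, "little") % bucket)
    return rows


if __name__ == "__main__":
    for gram in subwords("дума"):
        print(gram, rows_for(gram))
```

Because every entry resolves to rows in the same fixed-size table, a word that never appeared in the training data still gets a vector from its n-grams, which is the main reason to prefer floret vectors for a morphologically rich language such as Bulgarian.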

| Feature | Description |
| --- | --- |
| **Name** | `bg_floret_vectors_lg` |
| **Version** | `1.0` |
| **Vectors** | 200000 keys (300 dimensions) |
| **Sources** | OSCAR Corpus 21.09 (Julien Abadji, Pedro Ortiz Suarez), Wikipedia (bgwiki-latest-pages-articles from June 11th) |
| **License** | `MIT` |
| **Author** | Ivaylo Sakelariev |
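
As a quick sanity check on the vector table size listed above, the following back-of-the-envelope calculation estimates its in-memory footprint. It assumes 4-byte `float32` values, which this card does not state, so treat the result as an estimate rather than a measured figure.

```python
# Figures from the table above.
ROWS = 200_000
DIMS = 300

# ASSUMPTION: 4 bytes per value (float32); the card does not state the dtype.
BYTES_PER_VALUE = 4

table_bytes = ROWS * DIMS * BYTES_PER_VALUE
print(f"{table_bytes / 1e6:.0f} MB")  # prints "240 MB"
```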