	---
	license: cc-by-sa-4.0
	library_name: pytorch
	language:
	- ru
	- vep
	datasets:
	- Lynxpda/back-translated-veps-russian
	pipeline_tag: translation
	---

	# Model Card for Veps - Russian version 1.0

	A translation model from Vepsian to Russian.
	The archive contains the original weights of the model trained with OpenNMT-py (Locomotive).
	The model has 457M parameters and was trained from scratch.
	Also provided are the model weights converted for CTranslate2 and a package for installation and use with Argos Translate/LibreTranslate.

	## Model Architecture and Objective

	```
	dec_layers: 20
	decoder_type: transformer
	enc_layers: 20
	encoder_type: transformer
	heads: 8
	hidden_size: 512
	max_relative_positions: 20
	model_dtype: fp16
	pos_ffn_activation_fn: gated-gelu
	position_encoding: false
	share_decoder_embeddings: true
	share_embeddings: true
	share_vocab: true
	src_vocab_size: 32000
	tgt_vocab_size: 32000
	transformer_ff: 6144
	word_vec_size: 512
	```
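
As a rough sanity check, the 457M figure can be reproduced from the hyperparameters above. The sketch below assumes that the gated-GELU feed-forward block uses three weight matrices (gate, up, down) and ignores biases, layer norms, and relative-position embeddings, so it is an estimate rather than an exact count.

```python
# Hyperparameters copied from the configuration above.
hidden_size = 512
transformer_ff = 6144
enc_layers = 20
dec_layers = 20
vocab_size = 32000  # shared source/target vocabulary

# One attention block has four hidden_size x hidden_size projections (Q, K, V, output).
attention = 4 * hidden_size * hidden_size

# Assumption: the gated-GELU feed-forward block has three weight matrices
# (gate, up, down), each of size hidden_size x transformer_ff.
feed_forward = 3 * hidden_size * transformer_ff

encoder_layer = attention + feed_forward      # self-attention + feed-forward
decoder_layer = 2 * attention + feed_forward  # self-attention + cross-attention + feed-forward

# With share_embeddings and share_decoder_embeddings, one embedding matrix is counted once.
embeddings = vocab_size * hidden_size

total = enc_layers * encoder_layer + dec_layers * decoder_layer + embeddings
print(f"{total / 1e6:.1f}M parameters")  # prints 456.8M parameters
```
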
	# How to Use

	## Using the Model with OpenNMT-py

	To fine-tune the Vepsian-to-Russian translation model with [OpenNMT-py](https://github.com/OpenNMT/OpenNMT-py), start from the example configuration file in this repository, `config.yml`, and adapt it to your data.
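
A minimal sketch of launching training from Python is shown here. It assumes that OpenNMT-py is installed and exposes its `onmt_train` command, and that your edited `config.yml` already points to your own data and to the checkpoint you want to continue from (for example through OpenNMT-py's `train_from` option). The helper names `build_train_command` and `run_training` are illustrative and are not part of this repository.

```python
import shutil
import subprocess


def build_train_command(config_path="config.yml"):
    """Return the command line that starts OpenNMT-py training with the given config."""
    return ["onmt_train", "-config", str(config_path)]


def run_training(config_path="config.yml"):
    """Run training and raise an error if OpenNMT-py is missing or training fails."""
    if shutil.which("onmt_train") is None:
        raise RuntimeError("onmt_train not found; install OpenNMT-py first")
    subprocess.run(build_train_command(config_path), check=True)


if __name__ == "__main__":
    # Show the command without starting a long training run.
    print(" ".join(build_train_command("config.yml")))
```
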


	## Using the Model with LibreTranslate and Argos Translate

	To use the Vepsian to Russian translation model with [LibreTranslate](https://github.com/LibreTranslate/LibreTranslate) and [Argos Translate](https://github.com/argosopentech/argos-translate), follow these steps:

	* Download the Model Archive: Make sure you have the `translate-vep_ru-1_0.argosmodel` file.
	* Locate the Packages Folder:
	  * On Linux/macOS: `~/.local/share/argos-translate/packages`
	  * On Windows: `%userprofile%\.local\share\argos-translate\packages`
	* Create the Language Pair Folder:
	  * Create a folder named `vep_ru` in the packages directory. If a folder with that name already exists, delete or move it first.
	* Extract the Model Archive:
	  * Change the extension of the `.argosmodel` file to `.zip`.
	  * Extract the contents of the `.zip` file into the `vep_ru` folder.
	* Restart LibreTranslate:
	  * Restart the LibreTranslate application to load the new model.
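
The manual steps above can also be sketched with the Python standard library. This is an illustration under stated assumptions, not an official installer: the function name `install_argos_model` is hypothetical, the sketch deletes an existing `vep_ru` folder instead of moving it, and it extracts the archive contents as-is without checking the internal layout.

```python
import shutil
import zipfile
from pathlib import Path

# Default packages folder from the steps above.
# On Windows, Path.home() resolves to the user profile directory.
PACKAGES_DIR = Path.home() / ".local" / "share" / "argos-translate" / "packages"


def install_argos_model(archive, packages_dir=PACKAGES_DIR, pair="vep_ru"):
    """Extract an .argosmodel archive into <packages_dir>/<pair> and return that path."""
    archive = Path(archive)
    target = Path(packages_dir) / pair

    # Remove a previously installed copy so stale files do not remain.
    if target.exists():
        shutil.rmtree(target)
    target.mkdir(parents=True)

    # An .argosmodel file is a zip archive, so it can be read without renaming it.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target
```

A call such as `install_argos_model("translate-vep_ru-1_0.argosmodel")` would then be followed by restarting LibreTranslate so that it loads the new model.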


	# Citing & Authors

	```
	@misc{vep_ru,
	  title={Model for Veps - Russian translation},
	  author={Maksim Migukin and Maksim Kuznetsov and Alexey Kutashov},
	  year={2024}
	}
	```

	## Credits

	Data compiled by [OPUS](https://opus.nlpl.eu/).

	Includes pretrained models from [Stanza](https://github.com/stanfordnlp/stanza/).

	Data from the Vepsian [Wikipedia](https://vep.wikipedia.org/wiki/).

	Data from [Lehme No 2051 // Open corpus of Vepsian and Karelian languages VepKar.](http://dictorpus.krc.karelia.ru/)

	Data from [OMAMEDIA](https://omamedia.ru/)

	[CCMatrix](http://opus.nlpl.eu/CCMatrix-v1.php)

	If you use the dataset or code, please cite the CCMatrix paper and acknowledge OPUS as well for this release.

	This corpus was extracted from web crawls using margin-based bitext mining techniques. The original distribution is available from http://data.statmt.org/cc-matrix/

	[OpenSubtitles](http://opus.nlpl.eu/OpenSubtitles-v2018.php)

	Please cite the following article if you use any part of the corpus in your own work: P. Lison and J. Tiedemann, 2016, OpenSubtitles2016: Extracting Large Parallel Corpora from Movie and TV Subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016).