---
license: apache-2.0
---

This model is the original Mistral AI 7B v0.1 model converted to the OpenNMT-py format. "Original" means it uses interleaved rotary position embeddings (option: `rotary_interleave=True`).

You need to install OpenNMT-py; instructions are here: https://github.com/OpenNMT/OpenNMT-py
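
A minimal install sketch, assuming a recent Python environment (OpenNMT-py is published on PyPI; check the repository README for the supported Python/PyTorch and CUDA versions):

```bash
# Install OpenNMT-py from PyPI (see the repo README for version requirements).
pip install OpenNMT-py
```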

Running inference: create a text file with one prompt per line (e.g. "Show me some attractions in Boston"), then run: `onmt_translate --config mistral-inference.yaml --src input.txt --output output.txt`
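
A minimal end-to-end sketch, assuming the converted checkpoint and `mistral-inference.yaml` are in the working directory (the prompt and file names are illustrative):

```bash
# One prompt per line in the source file.
echo "Show me some attractions in Boston" > input.txt

# Generate with the provided inference config; generations are written to output.txt.
onmt_translate --config mistral-inference.yaml --src input.txt --output output.txt

cat output.txt
```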

Running MMLU evaluation: if you clone the OpenNMT-py repo, you can run: `python eval_llm/MMLU/run_mmlu_opennmt.py --config mistral-inference.yaml`. For this use case, make sure `max_length` is set to 1 in the config file.
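
A sketch of the evaluation run, assuming you work from the root of a cloned OpenNMT-py checkout and that `mistral-inference.yaml` points to this model:

```bash
# The MMLU evaluation script ships with the OpenNMT-py repository.
git clone https://github.com/OpenNMT/OpenNMT-py.git
cd OpenNMT-py

# Set max_length to 1 in mistral-inference.yaml so only the answer letter is
# generated, then run the harness.
python eval_llm/MMLU/run_mmlu_opennmt.py --config mistral-inference.yaml
```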

Finetuning: read this tutorial: https://forum.opennmt.net/t/finetuning-llama-7b-13b-or-mosaicml-mpt-7b-reproduce-vicuna-alpaca/5272/56 then run: `onmt_train --config mistral-finetuning.yaml`
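
Finetuning uses the standard OpenNMT-py training entry point; a minimal sketch, assuming `mistral-finetuning.yaml` is prepared as described in the tutorial (data paths, LoRA/quantization and GPU options come from there):

```bash
# Launch finetuning with the config prepared per the tutorial above.
onmt_train --config mistral-finetuning.yaml
```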