nbeerbower/Mistral-Nemo-Prism-12B-v7

	---
	library_name: transformers
	license: apache-2.0
	base_model:
	- nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated
	datasets:
	- nbeerbower/Arkhaios-DPO
	- nbeerbower/Purpura-DPO
	---

	![image/png](https://huggingface.co/nbeerbower/Mistral-Nemo-Prism-12B/resolve/main/prism-cover.png?download=true)

	> 🧪 Just Another Model Experiment
	>
	> This is one of many experimental iterations I'm sharing publicly while I mess around with training parameters and ideas. It's not a "real" release - just me being transparent about my learning process. Feel free to look under the hood, but don't expect anything production-ready!

	# Mistral-Nemo-Prism-12B-v7

	[Mahou-1.5-mistral-nemo-12B-lorablated](https://huggingface.co/nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated) fine-tuned on [Arkhaios-DPO](https://huggingface.co/datasets/nbeerbower/Arkhaios-DPO) and [Purpura-DPO](https://huggingface.co/datasets/nbeerbower/Purpura-DPO).

	The goal was to reduce archaic language and purple prose in a completely uncensored model.

	### Method

	[ORPO-tuned](https://mlabonne.github.io/blog/posts/2024-04-19_Fine_tune_Llama_3_with_ORPO.html) on 8x A40 GPUs for 10 epochs.

	For this version, beta was increased to 2.
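
As a rough illustration of what raising beta does, the sketch below computes the ORPO loss for a single preference pair in plain Python. It assumes beta plays the role of the λ weight on the odds-ratio term in the ORPO objective (which is how TRL's `ORPOConfig` documents it). The token log-probabilities are made-up numbers, and `orpo_loss` is a hypothetical helper, not the training code used for this model.

```python
import math


def avg_logprob(token_logprobs):
    """Length-normalized sequence log-likelihood: mean of per-token log-probs."""
    return sum(token_logprobs) / len(token_logprobs)


def log_odds(logp):
    """log(p / (1 - p)) computed from log p (requires logp < 0)."""
    return logp - math.log1p(-math.exp(logp))


def orpo_loss(chosen_token_logprobs, rejected_token_logprobs, beta):
    """Return (total, nll, odds_ratio_term) for one preference pair.

    total = nll(chosen) + beta * -log(sigmoid(log_odds(chosen) - log_odds(rejected)))
    """
    logp_w = avg_logprob(chosen_token_logprobs)    # preferred response
    logp_l = avg_logprob(rejected_token_logprobs)  # dispreferred response
    nll = -logp_w                                  # supervised fine-tuning term
    log_odds_ratio = log_odds(logp_w) - log_odds(logp_l)
    # -log(sigmoid(x)) == log(1 + exp(-x))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll + beta * odds_ratio_term, nll, odds_ratio_term


# Illustrative (invented) per-token log-probs for a chosen and a rejected reply.
chosen = [-0.2, -0.4, -0.3]
rejected = [-0.5, -0.9, -0.7]

for beta in (0.1, 2.0):
    total, nll, or_term = orpo_loss(chosen, rejected, beta)
    print(f"beta={beta}: nll={nll:.3f} odds_ratio_term={or_term:.3f} total={total:.3f}")
```

The odds-ratio term itself does not depend on beta; a larger beta only makes it count for more relative to the plain likelihood term, so the optimizer is pushed harder to separate chosen from rejected responses.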

	In conclusion, LoRA does not seem able to fully remove some of the language issues deeply embedded in the model.