---
base_model: unsloth/Meta-Llama-3.1-8B-bnb-4bit
language:
- en
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- sft
- dpo
license: llama3.1
datasets:
- reciperesearch/dolphin-sft-v0.1-preference
pipeline_tag: text-generation
license_name: llama3.1
license_link: LICENSE
model_creator: EpistemeAI
quantized_by: EpistemeAI
---

**GGUF quantizations:**

- q4_k_m
- 16-bit
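
As a quick usage sketch, the q4_k_m build can be loaded with llama-cpp-python; the GGUF file name below is a hypothetical placeholder, since the exact artifact names are not listed in this card:

```python
# Minimal sketch: run the q4_k_m GGUF with llama-cpp-python.
# The model_path is a hypothetical placeholder -- point it at the
# actual GGUF file shipped with this repository.
from llama_cpp import Llama

llm = Llama(
    model_path="Meta-Llama-3.1-8B-q4_k_m.gguf",  # hypothetical file name
    n_ctx=4096,  # context window; raise for longer prompts
)

output = llm("Write one sentence about llamas.", max_tokens=64)
print(output["choices"][0]["text"])
```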

This model is based on Meta Llama 3.1 8B and is governed by the Llama 3.1 Community License.
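
As a usage sketch, the model can also be run with the transformers text-generation pipeline. The repository id below is a hypothetical placeholder for this model's Hugging Face repo, and the generation settings are illustrative only:

```python
# Minimal text-generation sketch with transformers.
# The repo id is a hypothetical placeholder; substitute this model's
# actual Hugging Face repository.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="EpistemeAI/<this-model-repo>",  # hypothetical repo id
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

print(pipe("Explain ORPO in one sentence.", max_new_tokens=64)[0]["generated_text"])
```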

Fine-tuned using ORPO (Odds Ratio Preference Optimization).

## Training Details

### Training Data

- Dataset: [reciperesearch/dolphin-sft-v0.1-preference](https://huggingface.co/datasets/reciperesearch/dolphin-sft-v0.1-preference)

### Training Procedure

The model was fine-tuned with ORPO (Odds Ratio Preference Optimization), which folds preference alignment directly into the supervised fine-tuning loss through an odds-ratio penalty that favors chosen over rejected responses, so no separate reference model is required. A reproduction sketch follows.
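
For reference, ORPO (Hong et al., 2024) augments the supervised fine-tuning loss with an odds-ratio term over preference pairs:

$$\mathcal{L}_{\mathrm{ORPO}} = \mathcal{L}_{\mathrm{SFT}} + \lambda \cdot \mathcal{L}_{\mathrm{OR}}, \qquad \mathcal{L}_{\mathrm{OR}} = -\log \sigma\!\left(\log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)}\right),$$

where $\mathrm{odds}_\theta(y \mid x) = P_\theta(y \mid x) / \big(1 - P_\theta(y \mid x)\big)$, $y_w$ is the chosen response, and $y_l$ the rejected one.

The sketch below shows roughly how such a run could be set up with Unsloth and TRL. Every hyperparameter is an illustrative assumption, not the configuration actually used for this model, and the tokenizer keyword of `ORPOTrainer` varies across TRL versions:

```python
# Illustrative ORPO fine-tuning sketch with Unsloth + TRL.
# All hyperparameters are assumptions for demonstration, not the
# values used to train this model.
from datasets import load_dataset
from unsloth import FastLanguageModel
from trl import ORPOConfig, ORPOTrainer

# Load the 4-bit base model this fine-tune starts from.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Preference data; assumed to expose prompt/chosen/rejected columns.
dataset = load_dataset(
    "reciperesearch/dolphin-sft-v0.1-preference", split="train"
)

trainer = ORPOTrainer(
    model=model,
    args=ORPOConfig(
        output_dir="orpo-out",  # hypothetical output path
        beta=0.1,               # weight of the odds-ratio term (lambda)
        max_steps=30,           # mirrors the 30-step TrainOutput below
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=8e-6,
    ),
    train_dataset=dataset,
    tokenizer=tokenizer,  # `processing_class=` in newer TRL releases
)
trainer.train()
```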

#### Training Hyperparameters

- **Training regime:** [More Information Needed]

Final `TrainOutput` of the 30-step run (covering only ~1.5% of one epoch):

- global_step: 30
- training_loss: 4.25380277633667
- train_runtime: 679.3467 s
- train_samples_per_second: 0.353
- train_steps_per_second: 0.044
- total_flos: 0.0
- epoch: 0.015