R1-Distill-Llama-8B-Anima10

This model is a work in progress.

This model is the result of 10 epochs of finetuning deepseek-ai/DeepSeek-R1-Distill-Llama-8B on a private corpus of 11 megabytes of hand-selected raw text, using a low learning rate and short token sequences.

The original intention was to influence the style of the model's thinking text, but the training seems to have led to other unintended results.

It was originally trained for 3 epochs.

In testing, when asked "What is the fastest way to get around Europe?", it fell into an endless loop of recursive (but relevant) thinking.
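One model-agnostic way to catch this kind of generation loop at inference time is to check whether the tail of the output keeps repeating the same chunk of text. The sketch below is purely illustrative; the function name and thresholds are made up and are not part of this model's training or inference setup:

```python
def looks_stuck(text, window=200, min_repeats=3):
    """Heuristic: flag text whose tail repeats the same chunk back-to-back.

    `window` (how much of the tail to inspect) and `min_repeats` are
    arbitrary illustrative defaults, not values used by this model.
    """
    tail = text[-window:]
    # Try chunk sizes from large to small; a chunk repeated
    # `min_repeats` times in a row suggests a generation loop.
    for size in range(len(tail) // min_repeats, 4, -1):
        chunk = tail[-size:]
        if tail.endswith(chunk * min_repeats):
            return True
    return False
```

A streaming generation loop could call such a check periodically and stop early instead of producing unbounded recursive thinking text.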

Also noteworthy was how slowly the training loss descended once it reached around 3.5.

To further explore these observations, an additional 7 epochs of training were scheduled, and this model is the result.

It not only resolved the thinking loop on the Europe question but also broke past some of the 'hard stops' originally trained into it.

The model is currently undergoing additional training.

Model size: 8.03B params (Safetensors, BF16)

Model tree: Envoid/R1-Distill-Llama-8B-Anima10 (quantizations: 2 models)