
Model Card for Rasphi

Rasphi (pronounced "rasp-fee"; the name may change) is a work-in-progress architecture derived from Microsoft's Phi 3.5 MoE / GRIN model. It aims to improve reasoning by adding a dedicated reasoning stream, to which half of all experts are allocated. Because the experts are split directly in half, there is a high chance of instability or overall incoherence in both streams.
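To make the expert split concrete, here is a minimal sketch of how routing might look when each stream only sees its own half of the experts. This is an illustration, not Rasphi's actual code: the names `STREAM_EXPERTS` and `route` are hypothetical, the choice of which half serves which stream is an assumption, and plain softmax top-2 routing stands in for the router used by Phi 3.5 MoE / GRIN. The 16-expert, top-2 layout is taken from the Phi 3.5 MoE base model.

```python
import math

NUM_EXPERTS = 16  # Phi 3.5 MoE has 16 experts per MoE layer
TOP_K = 2         # and activates 2 of them per token

# Assumption: the lower half serves the main stream and the upper half
# serves the reasoning stream. The real assignment may differ.
STREAM_EXPERTS = {
    "main": range(0, NUM_EXPERTS // 2),
    "reasoning": range(NUM_EXPERTS // 2, NUM_EXPERTS),
}


def route(router_logits, stream, top_k=TOP_K):
    """Pick top_k experts for one token, restricted to the stream's half.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    allowed = STREAM_EXPERTS[stream]
    # Rank only the experts this stream is allowed to use.
    ranked = sorted(allowed, key=lambda i: router_logits[i], reverse=True)[:top_k]
    # Softmax over the selected logits, shifted by the max for stability.
    peak = max(router_logits[i] for i in ranked)
    exps = [math.exp(router_logits[i] - peak) for i in ranked]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(ranked, exps)]


if __name__ == "__main__":
    logits = [float(i) for i in range(NUM_EXPERTS)]
    print("main:", route(logits, "main"))
    print("reasoning:", route(logits, "reasoning"))
```

Each stream here chooses 2 experts out of 8 rather than 2 out of 16, so every token has a smaller pool of specialists to draw on. That reduced pool is one plausible source of the instability mentioned above.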

Model Details

Uses

Rasphi can be used for research and/or finetuning to gauge the performance of the new architecture. However, in its current state it is strongly discouraged for user-facing applications, or really for anything beyond experimentation.
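For research use, a loading sketch with the Hugging Face `transformers` library might look like the following. It assumes the weights live in the `QuietImpostor/Rasphi-MoE-Instruct-Unfinetuned` repository, that the repository's custom modeling code is compatible with `AutoModelForCausalLM`, and that enough memory is available for roughly 21B parameters in BF16. The helper name `load_rasphi` is hypothetical.

```python
REPO_ID = "QuietImpostor/Rasphi-MoE-Instruct-Unfinetuned"


def load_rasphi(repo_id=REPO_ID):
    """Load the tokenizer and model for experimentation.

    Imports are local so that this module can be imported without
    torch or transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True runs Python code shipped in the repository,
    # so review that code before enabling it.
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # matches the published tensor type
        trust_remote_code=True,
    )
    return tokenizer, model
```

Calling `tokenizer, model = load_rasphi()` downloads the full checkpoint, so it is best tried on a machine set up for large-model experiments.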

Format: Safetensors
Model size: 21.1B params
Tensor type: BF16

Model repository: QuietImpostor/Rasphi-MoE-Instruct-Unfinetuned