
Model performance

#1
by skdrx - opened

From what I can tell, testing by hand, the model is still bad. It can do math somewhat better and sometimes succeeds at reasoning, but I think the dataset's reasoning chains were too long for a small LM like this to learn from. Also, the ChatML format is a bit confusing, and I don't think LM Studio fully supports it. Regardless, fine-tuning and chat formats and all that are a complete joke and mess right now; someone needs to fix that (maybe I'll do that later when I have the time). I think I'm just going to rerun this with some basic instruct format or something and make the model really good (relatively) at instruction following. Any thoughts?
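For reference, ChatML wraps every turn in `<|im_start|>`/`<|im_end|>` delimiters, which is what some local runners handle inconsistently. Here is a minimal sketch of building such a prompt by hand; the `format_chatml` helper and the example messages are illustrative, not the model's actual template:

```python
# Sketch of the ChatML layout -- format_chatml is a hypothetical helper,
# not part of any library; the delimiter tokens are the standard ChatML ones.
def format_chatml(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    # A trailing open assistant turn cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2 + 2?"},
])
print(prompt)
```

If a runner doesn't recognize those special tokens, they leak into the output as literal text, which matches the "not fully supported" behavior described above.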

This needs to be tuned to do basic completion; I unironically think that's the only thing it can pull off.
