Reithan

AI & ML interests

None yet

Organizations

None yet

Reithan's activity

New activity in sometimesanotion/Lamarck-14B-v0.7 1 day ago

Censored (4)
#2 opened 1 day ago by jongames
replied to sometimesanotion's post 2 days ago

Sorry if this comment/ask is out of line or out of place, but I've been loving Lamarck and evangelizing it all over. One thing I'd love to see, given that Lamarck has R1 as an element, is a bit more consistency in its output of <think>: without a <think> prefill it's not very consistent about adding one, and sometimes even with the prefill it forgets to close it with </think>. (See the prefill sketch below this reply.)

Its use of its think block, though, and the way it incorporates its thoughts into the post-think output, is honestly better than I've seen even with base R1. It makes logical and mathematical jumps from one step to another that I haven't seen R1 make, and it's better at error-checking itself without vomiting out five paragraphs of "well, let me double-check". Not to mention its superior prose means the output actually explains what it thought up far better and more clearly.
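For anyone who wants to try the prefill described above, here is a minimal sketch assuming a standard transformers chat template; the example prompt and generation settings are illustrative assumptions, not anything specified in the reply.

```python
# Minimal sketch of prefilling <think> so generation starts inside the
# reasoning block. Prompt and generation settings are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sometimesanotion/Lamarck-14B-v0.7"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "How many primes are there below 50?"}]

# Render the chat template up to the assistant turn, then append the <think>
# prefill so the model continues from inside an open reasoning block.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
) + "<think>\n"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

Even with the prefill, it is worth checking the output for a closing </think> before splitting the reasoning from the final answer, since (as noted above) the tag is sometimes left open.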

reacted to sometimesanotion's post with ❤️👍🚀🔥 3 days ago
I've managed a #1 average score of 41.22% for 14B-parameter models on the Open LLM Leaderboard. As of this writing, sometimesanotion/Lamarck-14B-v0.7 is #8 among all models up to 70B parameters.

It took a custom toolchain around Arcee AI's mergekit to manage the complex merges, gradients, and LoRAs required to make this happen. I really like seeing features of many quality finetunes in one solid generalist model.
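The custom toolchain itself isn't shared in the post, but as a rough, hypothetical sketch of what scripting mergekit can look like, the snippet below builds a SLERP config with per-layer gradients and hands it to the mergekit-yaml CLI. The donor models, layer ranges, and gradient values are illustrative assumptions, not Lamarck's actual recipe.

```python
# Hypothetical sketch of scripting a mergekit run: build a SLERP config with
# per-layer interpolation gradients, write it to YAML, and invoke the
# mergekit-yaml CLI. Donor models, layer count, and gradients are
# illustrative assumptions, not the actual Lamarck recipe.
import subprocess
import yaml  # PyYAML

config = {
    "merge_method": "slerp",
    "base_model": "arcee-ai/Virtuoso-Small",
    "slices": [
        {
            "sources": [
                # layer_range [0, 48] assumes a 48-layer 14B Qwen-family model
                {"model": "arcee-ai/Virtuoso-Small", "layer_range": [0, 48]},
                {"model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B", "layer_range": [0, 48]},
            ]
        }
    ],
    "parameters": {
        # Gradients: the interpolation weight t sweeps across the model depth
        # and can differ per parameter group (attention vs. MLP).
        "t": [
            {"filter": "self_attn", "value": [0.0, 0.3, 0.5, 0.7, 1.0]},
            {"filter": "mlp", "value": [1.0, 0.7, 0.5, 0.3, 0.0]},
            {"value": 0.5},
        ]
    },
    "dtype": "bfloat16",
}

with open("merge-config.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# mergekit installs a mergekit-yaml entry point that consumes this config.
subprocess.run(["mergekit-yaml", "merge-config.yaml", "./merged-model"], check=True)
```

Gradient schedules like the t values above are what let different depths of the network lean toward different donor models, which is the kind of fine-grained control the post refers to.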
reacted to sometimesanotion's post with 👍 3 days ago
**Update:** Either I had some wrong numbers plugged in when estimating benchmark numbers from the comparator, or the benchmark changed. Virtuoso Small v2 at a 41.07 average is still very impressive, especially for writing draft copy for business purposes, while Lamarck remains a chatty generalist-reasoning model.

I've felt confident that 14B Qwen finetunes and merges could break the 42.0 average, and Arcee **came close** with https://huggingface.co./arcee-ai/Virtuoso-Small-2. Congratulations to @arcee-ai!

Just two months ago, it was easy to think that 14B had plateaued, that you could have high IFEVAL or high MUSR/MATH/GPQA at 14B, but not both. That barrier is completely shattered. I see a pathway to even better, and Virtuoso Small 2 is a big part of why. Very impressive work. This community would expect no less from Arcee.

Just look at this graph! Keep in mind, my merges here build on the first Virtuoso Small, and *-DS merges build on DeepSeek R1. There are some impressive merges in the pipe!