t1u1

AI & ML interests

None yet

Organizations

None yet

t1u1's activity

liked a model 4 days ago

katanemo/Arch-Function-3B

Text Generation • Updated 13 days ago • 556 • 109

liked a model 5 days ago

nomic-ai/nomic-embed-text-v2-moe

liked a model 7 days ago

bartowski/simplescaling_s1.1-32B-GGUF

Text Generation • Updated 7 days ago • 170k • 4

upvoted 2 papers 7 days ago

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

Paper • 2502.05171 • Published 11 days ago • 107

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published 8 days ago • 128

reacted to schuler's post with 👍 8 days ago

Post

7184

📢 New Research Alert: Making Language Models Smaller & Smarter!

Thrilled to share the latest technical report demonstrating how to reduce language model parameters by 77% while maintaining performance.

The secret? Grouped pointwise convolutions. Yes. We brought a method from computer vision to the transformers arena.

🔑 Key Findings:
• 77% parameter reduction.
• Maintained model capabilities.
• Improved generalization.
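A rough sketch of why grouping a pointwise (kernel size 1) layer cuts parameters: each output channel connects only to the input channels in its own group, so the weight count shrinks by the number of groups. The layer sizes and group count below are hypothetical, chosen for illustration; they are not the configuration from the report, and the 77% figure in the post is the report's own result, not something this sketch reproduces.

```python
def pointwise_params(c_in: int, c_out: int, groups: int = 1) -> int:
    """Weight count of a bias-free pointwise (1x1) convolution.

    With groups == 1 this matches a dense layer: c_in * c_out.
    With groups > 1 each output channel sees only c_in / groups inputs.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out


# Hypothetical sizes (assumption, not taken from the report).
dense = pointwise_params(1024, 4096)
grouped = pointwise_params(1024, 4096, groups=4)
reduction = 1 - grouped / dense

print(f"dense: {dense}, grouped: {grouped}, reduction: {reduction:.0%}")
```

In this toy case four groups remove 75% of one layer's weights. Grouping alone also stops channels in different groups from mixing, so how a full model recovers that mixing is a design question the report itself should be consulted on.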

Paper: https://www.researchgate.net/publication/388835829_SAVING_77_OF_THE_PARAMETERS_IN_LARGE_LANGUAGE_MODELS_TECHNICAL_REPORT
Code: https://github.com/joaopauloschuler/less-parameters-llm

2 replies

liked a model 8 days ago

agentica-org/DeepScaleR-1.5B-Preview

Updated 8 days ago • 10.6k • 431

upvoted a collection 10 days ago

AceCoder

Collection

13 items • Updated 6 days ago • 6

upvoted a paper 14 days ago

ACECODER: Acing Coder RL via Automated Test-Case Synthesis

Paper • 2502.01718 • Published 15 days ago • 27

liked a model 21 days ago

open-thoughts/OpenThinker-7B

Text Generation • Updated 7 days ago • 5.24k • 103

liked a model 22 days ago

unsloth/DeepSeek-R1-GGUF

Text Generation • Updated 6 days ago • 2.33M • 837

reacted to mitkox's post with 👍 23 days ago

Post

2322

llama.cpp is 26.8% faster than ollama.
I upgraded both and, using the same settings, ran the same DeepSeek R1 Distill 1.5B on the same hardware. It's an apples-to-apples comparison.

Total duration:
llama.cpp 6.85 sec <- 26.8% faster
ollama 8.69 sec

Breakdown by phase:
Model loading
llama.cpp 241 ms <- 2x faster
ollama 553 ms

Prompt processing
llama.cpp 416.04 tokens/s with an eval time of 45.67 ms <- 10x faster
ollama 42.17 tokens/s with an eval time of 498 ms

Token generation
llama.cpp 137.79 tokens/s with an eval time of 6.62 sec <- 13% faster
ollama 122.07 tokens/s with an eval time of 7.64 sec
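
The ratios quoted above can be re-derived from the raw numbers. This is a minimal sketch that only does arithmetic on the figures in the post; nothing is measured here, and the variable names are illustrative.

```python
def ratio(a: float, b: float) -> float:
    """Return how many times larger a is than b."""
    return a / b


# Durations: lower is better, so divide the slower time by the faster one.
total_speedup = ratio(8.69, 6.85)      # total duration, seconds
load_speedup = ratio(553, 241)         # model loading, milliseconds

# Throughputs: higher is better, so divide the faster rate by the slower one.
prompt_speedup = ratio(416.04, 42.17)  # prompt processing, tokens/s
gen_speedup = ratio(137.79, 122.07)    # token generation, tokens/s

# Token counts implied by throughput x eval time for the generation phase.
llama_tokens = 137.79 * 6.62
ollama_tokens = 122.07 * 7.64

for name, value in [
    ("total", total_speedup),
    ("model loading", load_speedup),
    ("prompt processing", prompt_speedup),
    ("token generation", gen_speedup),
]:
    print(f"{name}: {value:.2f}x")
print(f"implied tokens generated: llama.cpp {llama_tokens:.0f}, ollama {ollama_tokens:.0f}")
```

The ratios come out to roughly 1.27x, 2.29x, 9.87x, and 1.13x, in line with the rounded claims in the post. The implied generation lengths differ (about 912 versus 933 tokens), so the tokens/s figures are a fairer basis for comparison than total wall time.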

llama.cpp is LLM inference in C/C++; ollama adds abstraction layers and marketing.

Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.

7 replies

reacted to lewtun's post with 🚀 23 days ago

Post

10114

We are reproducing the full DeepSeek R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret we can do it together in the open!

🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.

🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.

🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.

Follow along: https://github.com/huggingface/open-r1