Self-Training Large Language Models for Tool-Use Without Demonstrations Paper • 2502.05867 • Published 19 days ago
Beyond Release: Access Considerations for Generative AI Systems Paper • 2502.16701 • Published 5 days ago • 9
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning Paper • 2502.17407 • Published 4 days ago • 22
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 24 days ago • 195
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 134
view post Post 11863 Google drops Gemini 2.0 Flash Thinkinga new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and morenow available in anychat, try it out: akhaliq/anychat See translation 2 replies · 🚀 9 9 🔥 5 5 👀 2 2 👍 2 2 + Reply
Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models Paper • 2412.02980 • Published Dec 4, 2024 • 13
view post Post 12509 QwQ-32B-Preview is now available in anychatA reasoning model that is competitive with OpenAI o1-mini and o1-previewtry it out: akhaliq/anychat See translation 1 reply · ❤️ 3 3 👀 2 2 + Reply
view post Post 4152 New model drop in anychatallenai/Llama-3.1-Tulu-3-8B is now availabletry it here: akhaliq/anychat See translation 🔥 3 3 👍 1 1 + Reply
view post Post 3064 anychatsupports chatgpt, gemini, perplexity, claude, meta llama, grok all in one apptry it out there: akhaliq/anychat ❤️ 7 7 🚀 3 3 🔥 2 2 + Reply