Getting WebRTC and WebSockets right in Python is very tricky. If you've tried to wrap an LLM in a real-time audio layer, then you know what I'm talking about.
That's where FastRTC comes in! It makes WebRTC and WebSocket streams easy to set up, with minimal code and overhead.
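A minimal sketch of what that looks like, along the lines of FastRTC's echo-style quickstart (the handler below is a placeholder where an STT → LLM → TTS pipeline would go):

```python
from fastrtc import Stream, ReplyOnPause
import numpy as np

def respond(audio: tuple[int, np.ndarray]):
    # Placeholder: echo the caller's audio straight back.
    # In a real app this is where you'd run STT -> LLM -> TTS,
    # yielding synthesized audio chunks as they become available.
    yield audio

# ReplyOnPause invokes the handler whenever the speaker pauses;
# FastRTC handles all the WebRTC/WebSocket plumbing.
stream = Stream(ReplyOnPause(respond), modality="audio", mode="send-receive")
stream.ui.launch()  # spins up a Gradio UI for quick local testing
```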
✨ Apache 2.0
✨ 8.19GB VRAM, runs on most GPUs
✨ Multi-Tasking: T2V, I2V, Video Editing, T2I, V2A
✨ Text Generation: Supports Chinese & English
✨ Powerful Video VAE: Encode/decode 1080P with temporal precision
✨ TODAY: DeepSeek unveiled FlashMLA: an efficient MLA decoding kernel for NVIDIA Hopper GPUs, optimized for variable-length sequences. https://github.com/deepseek-ai/FlashMLA
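Usage, per the repo README, follows roughly this pattern (the inputs come from your attention module and paged KV cache; treat the exact values as indicative):

```python
from flash_mla import get_mla_metadata, flash_mla_with_kvcache

# cache_seqlens: per-sequence KV-cache lengths (this is where the
# variable-length optimization comes in); s_q: query length;
# h_q / h_kv: number of query / KV heads
tile_scheduler_metadata, num_splits = get_mla_metadata(
    cache_seqlens, s_q * h_q // h_kv, h_kv
)

# The kernel is then invoked once per layer during decoding:
# q_i is the layer's query tensor, kvcache_i its paged KV cache,
# block_table maps sequences to cache blocks, dv is the value head dim.
for i in range(num_layers):
    o_i, lse_i = flash_mla_with_kvcache(
        q_i, kvcache_i, block_table, cache_seqlens, dv,
        tile_scheduler_metadata, num_splits, causal=True,
    )
```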
Moonshot AI introduces Moonlight: a 16B-parameter MoE (3B active) trained on 5.7T tokens using the Muon optimizer, pushing the Pareto frontier with fewer training FLOPs. moonshotai/Moonlight-16B-A3B
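Muon's core trick is orthogonalizing the momentum-smoothed gradient of each 2D weight matrix with a few Newton-Schulz iterations. A minimal sketch of that step (coefficients from Keller Jordan's reference implementation; Moonlight's recipe adds weight decay and per-matrix update scaling on top):

```python
import torch

@torch.no_grad()
def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5,
                                eps: float = 1e-7) -> torch.Tensor:
    """Approximately map a 2D matrix to the nearest (semi-)orthogonal matrix."""
    # Quintic-iteration coefficients from the reference Muon implementation
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + eps)  # normalize so the iteration converges
    transposed = G.size(0) > G.size(1)
    if transposed:
        X = X.T  # iterate on the wide orientation
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X
```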
🚀 StepFun (阶跃星辰) is making BIG open moves!
Last year, their GOT-OCR 2.0 took the community by storm 🔥 but many didn't know they were also building some amazing models. Now, they've just dropped something huge on the Hub!
📺 Step-Video-T2V: a 30B bilingual open video model that generates 204 frames (8-10s) at 540P resolution with high information density & consistency. stepfun-ai/stepvideo-t2v
Ovis2 🔥 a multimodal LLM series released by the Alibaba AIDC team. AIDC-AI/ovis2-67ab36c7e497429034874464
✨ 1B/2B/4B/8B/16B/34B sizes
✨ Strong CoT for deeper problem solving
✨ Multilingual OCR: expanded beyond English & Chinese, with better data extraction
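The checkpoints ship custom modeling code, so loading goes through `trust_remote_code`. A sketch, assuming the 8B repo id from the collection above (check the model card for the exact chat and preprocessing helpers):

```python
import torch
from transformers import AutoModelForCausalLM

# Repo id is an assumption based on the collection; pick the size you need
model = AutoModelForCausalLM.from_pretrained(
    "AIDC-AI/Ovis2-8B",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # Ovis2 relies on custom modeling code
).cuda()
```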
The community has been busy distilling DeepSeek-R1 from inference providers, but we decided to have a go at doing it ourselves from scratch 💪
What's new compared to existing reasoning datasets?
♾ Based on AI-MO/NuminaMath-1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.
🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek-R1. The filtered dataset contains 220k problems with correct reasoning traces.
🛠 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day (see the vLLM sketch after this list).
⏳ Automated filtering: We apply Math Verify to retain only problems with at least one correct answer. We also leverage Llama-3.3-70B-Instruct as a judge to recover more correct examples (e.g., cases with malformed answers that can't be verified with a rules-based parser). A minimal Math Verify example follows below.
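A sketch of the generation setup described above, using vLLM (the sampling values are illustrative, and the real pipeline shards R1 across far more GPUs and batches hundreds of thousands of prompts; `n=2` mirrors the two answers per problem):

```python
from vllm import LLM, SamplingParams

# Illustrative settings; DeepSeek-R1 in practice needs a large multi-GPU setup
llm = LLM(model="deepseek-ai/DeepSeek-R1", tensor_parallel_size=8)
params = SamplingParams(n=2, temperature=0.6, max_tokens=16384)

prompts = ["Solve: If 3x + 5 = 20, find x. Show your reasoning."]
for request_output in llm.generate(prompts, params):
    for candidate in request_output.outputs:  # two candidates per problem
        print(candidate.text)
```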
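And the verification step, with the math-verify library (a minimal example; the real pipeline compares each generated final answer against the NuminaMath reference):

```python
from math_verify import parse, verify

# Gold answer from the dataset vs. an answer extracted from a generation
gold = parse("$\\frac{1}{2}$")
answer = parse("0.5")

# verify() checks mathematical equivalence, not string equality
print(verify(gold, answer))  # True
```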
📊 We match the performance of DeepSeek-R1-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset.
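A sketch of such a finetune with TRL's SFTTrainer (the dataset and model ids below are assumptions standing in for the released artifacts, and the hyperparameters are illustrative):

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed ids: the released trace dataset and the Qwen math model above
dataset = load_dataset("open-r1/OpenR1-Math-220k", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-Math-7B-Instruct",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="qwen-7b-math-r1-sft",
        max_seq_length=16384,  # R1 reasoning traces are long
        packing=True,          # pack short traces for throughput
    ),
)
trainer.train()
```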