✨Apache 2.0 ✨8.19GB VRAM, runs on most GPUs ✨Multi-Tasking: T2V, I2V, Video Editing, T2I, V2A ✨Text Generation: Supports Chinese & English ✨Powerful Video VAE: Encode/decode 1080P w/ temporal precision
✨ TODAY: DeepSeek unveiled Flash MLA: a efficient MLA decoding kernel for NVIDIA Hopper GPUs, optimized for variable-length sequences. https://github.com/deepseek-ai/FlashMLA
Moonshot AI introduces Moonlight: a 3B/16B MoE trained on 5.7T tokens using Muon, pushing the Pareto frontier with fewer FLOPs. moonshotai/Moonlight-16B-A3B
Last year, their GOT-OCR 2.0 took the community by storm 🔥but many didn’t know they were also building some amazing models. Now, they’ve just dropped something huge on the hub!
📺 Step-Video-T2V: a 30B bilingual open video model that generates 204 frames (8-10s) at 540P resolution with high information density & consistency. stepfun-ai/stepvideo-t2v
Ovis2 🔥 a multimodal LLM released by Alibaba AIDC team. AIDC-AI/ovis2-67ab36c7e497429034874464 ✨1B/2B/4B/8B/16B/34B ✨Strong CoT for deeper problem solving ✨Multilingual OCR – Expanded beyond English & Chinese, with better data extraction
Xwen 🔥 a series of open models based on Qwen2.5 models, developed by a brilliant research team of PhD students from the Chinese community. shenzhi-wang/xwen-chat-679e30ab1f4b90cfa7dbc49e ✨ 7B/72B ✨ Apache 2.0 ✨ Xwen-72B-Chat outperformed DeepSeek V3 on Arena Hard Auto
✨ Launched All-Scenario Reasoning Model (language, visual, and search reasoning capabilities) , with medical expertise as one of its key highlights. https://ying.baichuan-ai.com/chat
✨ Released Baichuan-M1-14B Medical LLM on the hub Available in both Base and Instruct versions, support English & Chinese.
What happened yesterday in the Chinese AI community? 🚀
T2A-01-HD 👉 https://hailuo.ai/audio MiniMax's Text-to-Audio model, now in Hailuo AI, offers 300+ voices in 17+ languages and instant emotional voice cloning.
Tare 👉 https://www.trae.ai/ A new coding tool by Bytedance for professional developers, supporting English & Chinese with free access to Claude 3.5 and GPT-4 for a limited time.
Kimi K 1.5 👉 https://github.com/MoonshotAI/Kimi-k1.5 | https://kimi.ai/ An O1-level multi-modal model by MoonShot AI, utilizing reinforcement learning with long and short-chain-of-thought and supporting up to 128k tokens.
And today…
Hunyuan 3D-2.0 👉 tencent/Hunyuan3D-2 A SoTA 3D synthesis system for high-res textured assets by Tencent Hunyuan , with open weights and code!
✨ MIT License : enabling distillation for custom models ✨ 32B & 70B models match OpenAI o1-mini in multiple capabilities ✨ API live now! Access Chain of Thought reasoning with model='deepseek-reasoner'