m-ric
posted an update Sep 12
๐—ข๐—ฝ๐—ฒ๐—ป ๐—Ÿ๐—Ÿ๐— ๐˜€ ๐—ฎ๐—ฟ๐—ฒ ๐—ผ๐—ป ๐—ณ๐—ถ๐—ฟ๐—ฒ ๐—ฟ๐—ถ๐—ด๐—ต๐˜ ๐—ป๐—ผ๐˜„! ๐Ÿ”ฅ ๐——๐—ฒ๐—ฒ๐—ฝ๐—ฆ๐—ฒ๐—ฒ๐—ธ-๐—ฉ๐Ÿฎ.๐Ÿฑ ๐—ฎ๐—ป๐—ฑ ๐—ผ๐˜๐—ต๐—ฒ๐—ฟ ๐˜๐—ผ๐—ฝ ๐—ฟ๐—ฒ๐—น๐—ฒ๐—ฎ๐˜€๐—ฒ๐˜€

Mistral AI just released Pixtral-12B, a vision model that seems to perform extremely well! On Mistral's own benchmarks, it beats the great Qwen2-7B and Llava-OV.

🤔 But Mistral's benchmarks evaluate in Chain-of-Thought (CoT), and even in CoT they report lower scores for other models than those models' already-published non-CoT scores, which is very strange… Evaluation is not a settled science!
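To see why CoT and non-CoT numbers can't be compared directly, here is a minimal sketch of the two prompting protocols. The templates and the toy question are purely illustrative assumptions, not Mistral's (or anyone's) actual evaluation harness:

```python
# Illustrative sketch: the same question evaluated under two protocols.
# Both templates below are hypothetical, not a real lab's harness.

def direct_prompt(question: str) -> str:
    # Non-CoT protocol: ask for the answer immediately.
    return f"{question}\nAnswer with a single letter."

def cot_prompt(question: str) -> str:
    # CoT protocol: ask the model to reason first, then answer.
    return f"{question}\nThink step by step, then give your final answer."

question = "Q: What is 2 + 2? (A) 3 (B) 4"

# The same model scored under these two prompts can produce different
# accuracy numbers, so comparing a CoT score against a published
# non-CoT score is apples to oranges.
print(direct_prompt(question))
print(cot_prompt(question))
```

The point is simply that the prompt template is part of the benchmark: switching protocols changes the number, independently of the model.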

But it's only the latest in a flurry of great models. Here are the ones currently squatting the top of the Models Hub page:

โถ ๐Ÿ”Š ๐‹๐ฅ๐š๐ฆ๐š-๐Ÿ‘.๐Ÿ-๐Ÿ–๐ ๐Ž๐ฆ๐ง๐ข, a model built upon Llama-3.1-8B-Instruct, that simultaneously generates text and speech response with an extremely low latency of 250ms (Moshi, Kyutaiโ€™s 8B, did 140ms)

โท ๐ŸŸ๐Ÿ—ฃ๏ธ ๐…๐ข๐ฌ๐ก ๐’๐ฉ๐ž๐ž๐œ๐ก ๐ฏ๐Ÿ.๐Ÿ’, text-to-speech model that supports 8 languages ๐Ÿ‡ฌ๐Ÿ‡ง๐Ÿ‡จ๐Ÿ‡ณ๐Ÿ‡ฉ๐Ÿ‡ช๐Ÿ‡ฏ๐Ÿ‡ต๐Ÿ‡ซ๐Ÿ‡ท๐Ÿ‡ช๐Ÿ‡ธ๐Ÿ‡ฐ๐Ÿ‡ท๐Ÿ‡ธ๐Ÿ‡ฆ with extremely good quality for a light size (~1GB weights) and low latency

โธ ๐Ÿณ ๐ƒ๐ž๐ž๐ฉ๐’๐ž๐ž๐ค-๐•๐Ÿ.๐Ÿ“, a 236B model with 128k context length that combines the best of DeepSeek-V2-Chat and the more recent DeepSeek-Coder-V2-Instruct. Depending on benchmarks, it ranks just below Llama-3.1-405B. Released with custom โ€˜deepseekโ€™ license, quite commercially permissive.

โน ๐’๐จ๐ฅ๐š๐ซ ๐๐ซ๐จ published by Upstage: a 22B model (so inference fits on a single GPU) that comes just under Llama-3.1-70B performance : MMLU: 79, GPQA: 36, IFEval: 84

โบ ๐Œ๐ข๐ง๐ข๐‚๐๐Œ๐Ÿ‘-๐Ÿ’๐, a small model that claims very impressive scores, even beating much larger models like Llama-3.1-8B. Let's wait for more scores because these look too good!

Let's keep looking, more good stuff is coming our way 🔭