@nicolay-r on Hugging Face: "📢 Qwen so far released the 2.5-MAX that claims to outperform DeepSeek-V3…"

Join the conversation

Join the community of Machine Learners and AI enthusiasts.

nicolay-r

posted an update 2 days ago

Post

2028

📢 Qwen so far released the 2.5-MAX that claims to outperform DeepSeek-V3 [Edited: not R1].
And here is how you can start applying it for handling CSV / JSONL data.
The model is compatible with OpenAI API so here is my wrapper for it:
🌌 https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/openai_156.py

🚀 All you have to do is to set
base-url: https://dashscope-intl.aliyuncs.com/compatible-mode/v1
and API key of the platform.

↗️ Below is the link to the complete example (see screenshot):
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_qwen_25_max_chat.sh

📰 Source: https://www.alibabacloud.com/help/en/model-studio/developer-reference/what-is-qwen-llm
📺 Official Sandbox Demo: Qwen/Qwen2.5-Max-Demo
📜 Paper: https://arxiv.org/abs/2412.15115

1 day ago

It would supposedly be better than DeepSeek V3, not R1. To compare to R1, we would need to see a new version of QvQ.

·

1 day ago

•

edited 1 day ago

@claudiohgdotta , thanks edited!
That would be too much from the Qwen-2.5-MAX.
Especially counting on how fast the demo inference of the Qwen.

In this post