Loser Cheems
JingzeShi
AI & ML interests
I like training small language models.
Recent Activity
updated a model about 9 hours ago: SmallDoge/Doge-160M-Reason-Distill
published a model about 9 hours ago: SmallDoge/Doge-160M-Reason-Distill
liked a model about 21 hours ago: SmallDoge/Doge-160M-Instruct
Organizations
JingzeShi's activity

reacted to prithivMLmods's post with 🔥 9 days ago

reacted to prithivMLmods's post with 🤗 9 days ago
Post
QwQ Edge Gets a Small Update..! 💬
try now: prithivMLmods/QwQ-Edge
Now, you can use the following commands for different tasks:
🖼️ @image 'prompt...' → Generates an image
@tts1 'prompt...' → Generates speech in a female voice
@tts2 'prompt...' → Generates speech in a male voice
🅰️ @text 'prompt...' → Enables textual conversation (if no command is specified, text-to-text generation is the default mode)
💬 Multimodality support: prithivMLmods/Qwen2-VL-OCR-2B-Instruct
💬 For text generation, the FastThink-0.5B model ensures quick and efficient responses: prithivMLmods/FastThink-0.5B-Tiny
💬 Image generation: SDXL Lightning model, SG161222/RealVisXL_V4.0_Lightning
GitHub: https://github.com/PRITHIVSAKTHIUR/QwQ-Edge
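For the image path, here is a minimal diffusers sketch of how an SDXL Lightning-style checkpoint such as SG161222/RealVisXL_V4.0_Lightning could be called. This is illustrative only and assumes the checkpoint is available in diffusers format; the step count and guidance scale are typical Lightning-style values, not the exact settings used by the QwQ-Edge Space.

```python
# Illustrative sketch: load an SDXL Lightning-style checkpoint with diffusers.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "SG161222/RealVisXL_V4.0_Lightning",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    "a cozy cabin in the mountains at sunset",
    num_inference_steps=6,   # Lightning-distilled checkpoints need only a few steps
    guidance_scale=2.0,      # low CFG is typical for Lightning-distilled models
).images[0]
image.save("cabin.png")
```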
graph TD
A[User Interface] --> B[Chat Logic]
B --> C{Command Type}
C -->|Text| D[FastThink-0.5B]
C -->|Image| E[Qwen2-VL-OCR-2B]
C -->|@image| F[Stable Diffusion XL]
C -->|@tts| G[Edge TTS]
D --> H[Response]
E --> H
F --> H
G --> H
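Reading the diagram above, the routing boils down to a prefix check on the incoming message. Below is a minimal sketch of that idea; the handler functions are hypothetical stand-ins, not the actual QwQ-Edge code.

```python
# Minimal sketch of prefix-based command routing, as described in the post.
# The handlers below are hypothetical stand-ins for the real backends.

def handle_text(prompt: str) -> str:
    return f"[FastThink-0.5B would answer]: {prompt}"

def handle_image(prompt: str) -> str:
    return f"[Stable Diffusion XL would render]: {prompt}"

def handle_tts(prompt: str, voice: str) -> str:
    return f"[Edge TTS would speak in a {voice} voice]: {prompt}"

HANDLERS = {
    "@image": handle_image,
    "@tts1": lambda p: handle_tts(p, "female"),
    "@tts2": lambda p: handle_tts(p, "male"),
    "@text": handle_text,
}

def dispatch(message: str) -> str:
    """Route a chat message to the right backend based on its @command prefix."""
    command, _, rest = message.partition(" ")
    if command in HANDLERS:
        return HANDLERS[command](rest.strip())
    # No recognized command: fall back to text-to-text generation (the default mode).
    return handle_text(message)

print(dispatch("@image a corgi wearing sunglasses"))
print(dispatch("hello there"))
```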

posted an update 17 days ago
Post
Welcome to the Doge Face Open Source Community!
Our goal for the next two years is to explore the indispensable foundation of embodied intelligence: small language models.
We aim to open-source code and documentation to give everyone more time to slack off while working or studying!
Repository on GitHub: https://github.com/SmallDoges/small-doge
Organization on Hugging Face: https://huggingface.co./SmallDoge

replied to their post 29 days ago
But you are an internet celebrity rapper

replied to their post 29 days ago
The process is always hard, but the result is always good.

posted an update 29 days ago
Post
🤩 warmup -> stable -> decay learning rate scheduler:
Use the stable-phase checkpoints to continue training the model on any new dataset without loss spikes during training!
SmallDoge/Doge-20M-checkpoint
SmallDoge/Doge-60M-checkpoint
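For context, a warmup-stable-decay (WSD) schedule holds the learning rate flat after warmup and only decays it at the very end, which is why a checkpoint saved during the stable phase can be resumed on new data without a loss spike. Below is a minimal PyTorch sketch of such a schedule; it is illustrative, not the exact scheduler used to train the Doge checkpoints.

```python
import math
import torch
from torch.optim.lr_scheduler import LambdaLR

def get_wsd_schedule(optimizer, num_warmup_steps, num_stable_steps,
                     num_decay_steps, min_lr_ratio=0.0):
    """Warmup -> stable -> decay (WSD) learning-rate schedule.

    - warmup: LR rises linearly from 0 to the peak value
    - stable: LR stays at the peak (checkpoints saved here can be resumed
      on a new dataset, since no decay has happened yet)
    - decay: LR anneals from the peak down to min_lr_ratio * peak (cosine here)
    """
    def lr_lambda(step):
        if step < num_warmup_steps:
            return step / max(1, num_warmup_steps)
        if step < num_warmup_steps + num_stable_steps:
            return 1.0
        progress = (step - num_warmup_steps - num_stable_steps) / max(1, num_decay_steps)
        progress = min(progress, 1.0)
        cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
        return min_lr_ratio + (1.0 - min_lr_ratio) * cosine
    return LambdaLR(optimizer, lr_lambda)

# Example: 1k warmup, 8k stable, 1k decay steps on a dummy parameter.
opt = torch.optim.AdamW([torch.nn.Parameter(torch.zeros(1))], lr=3e-4)
sched = get_wsd_schedule(opt, 1_000, 8_000, 1_000)
```

To continue training, a stable-phase checkpoint can be loaded like any other Hugging Face model, e.g. AutoModelForCausalLM.from_pretrained("SmallDoge/Doge-20M-checkpoint", trust_remote_code=True) (trust_remote_code is an assumption here, since Doge ships custom model code), and trained at the stable-phase learning rate before entering the decay phase.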

reacted to anakin87's post 30 days ago
Post
New Italian Small Language Models: Gemma Neogenesis Collection 🇮🇹
I am happy to release two new language models for the Italian language!
💪 Gemma 2 9B Neogenesis ITA
anakin87/gemma-2-9b-neogenesis-ita
Building on the impressive work by VAGO Solutions, I applied Direct Preference Optimization with a mix of Italian and English data.
Using Spectrum, I trained 20% of the model's layers.
Evaluated on the Open ITA LLM leaderboard (mii-llm/open_ita_llm_leaderboard), this model achieves strong performance.
To beat it on this benchmark, you'd need a 27B model.
Gemma 2 2B Neogenesis ITA
anakin87/gemma-2-2b-neogenesis-ita
This smaller variant is fine-tuned from the original Gemma 2 2B it by Google.
Through a combination of Supervised Fine-Tuning and Direct Preference Optimization, I trained 25% of the layers using Spectrum.
Compared to the original model, it shows improved Italian proficiency, which is good for its small size.
Both models were developed during the recent #gemma competition on Kaggle.
Training code: https://www.kaggle.com/code/anakin87/post-training-gemma-for-italian-and-beyond
Thanks to @FinancialSupport and mii-llm for the help during evaluation.
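Both models above were aligned with Direct Preference Optimization. As a rough illustration of that objective (not anakin87's actual training code, which is linked on Kaggle above), the core DPO loss over precomputed sequence log-probabilities looks like this:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is a tensor of per-example sequence log-probabilities:
    log p(chosen | prompt) and log p(rejected | prompt) under the policy
    being trained and under the frozen reference model.
    """
    # How much more the policy prefers chosen over rejected...
    policy_margin = policy_chosen_logps - policy_rejected_logps
    # ...relative to the reference model's preference.
    ref_margin = ref_chosen_logps - ref_rejected_logps
    # Maximize the sigmoid of the scaled advantage (minimize its negative log).
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Tiny shape check with dummy log-probabilities for a batch of 4 pairs.
logps = torch.randn(4)
print(dpo_loss(logps, logps - 1.0, torch.zeros(4), torch.zeros(4)))
```

Spectrum-style training then freezes everything except the most informative subset of layers (selected by signal-to-noise ratio, roughly 20-25% here) before optimizing this objective.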
Cool!!!

replied to their post 30 days ago
So slow

posted an update about 1 month ago
Post
Running model pre-training on only a single RTX 4090 is really slow, even for small language models! (https://huggingface.co./collections/JingzeShi/doge-slm-677fd879f8c4fd0f43e05458)