
Loser Cheems

JingzeShi

AI & ML interests

I like training small language models.

Recent Activity

updated a model about 9 hours ago
SmallDoge/Doge-160M-Reason-Distill
published a model about 9 hours ago
SmallDoge/Doge-160M-Reason-Distill
liked a model about 21 hours ago
SmallDoge/Doge-160M-Instruct

Organizations

Hugging Face Discord Community · Hugging Face Party @ PyTorch Conference · Nerdy Face · Doge Face

JingzeShi's activity

reacted to prithivMLmods's post with 🔥 9 days ago
view post
Post
5125
Deepswipe by
.
.
.
. Deepseek 🐬🗿

Everything is now in recovery. 📉📈
·
reacted to prithivMLmods's post with 🤗 9 days ago
view post
Post
4197
QwQ Edge Gets a Small Update..! 💬
try now: prithivMLmods/QwQ-Edge

🚀 Now you can use the following commands for different tasks:

🖼️ @image 'prompt...' → Generates an image
🔉 @tts1 'prompt...' → Generates speech in a female voice
🔉 @tts2 'prompt...' → Generates speech in a male voice
🅰️ @text 'prompt...' → Enables textual conversation (if not specified, text-to-text generation is the default mode)

💬 Multimodality support: prithivMLmods/Qwen2-VL-OCR-2B-Instruct
💬 For text generation, the FastThink-0.5B model ensures quick and efficient responses: prithivMLmods/FastThink-0.5B-Tiny
💬 Image generation: SDXL Lightning model, SG161222/RealVisXL_V4.0_Lightning

GitHub: https://github.com/PRITHIVSAKTHIUR/QwQ-Edge

graph TD
    A[User Interface] --> B[Chat Logic]
    B --> C{Command Type}
    C -->|Text| D[FastThink-0.5B]
    C -->|Image| E[Qwen2-VL-OCR-2B]
    C -->|@image| F[Stable Diffusion XL]
    C -->|@tts| G[Edge TTS]
    D --> H[Response]
    E --> H
    F --> H
    G --> H
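
A rough sketch of the command routing shown in the diagram above, assuming Python. The model IDs come from the post; the helper functions (generate_text, generate_image, synthesize_speech) are hypothetical placeholders, not the actual QwQ-Edge implementation.

# Hedged sketch of the @command dispatch described in the post above.
def generate_text(prompt: str) -> str:
    # Placeholder: the Space uses prithivMLmods/FastThink-0.5B-Tiny for text.
    return f"[text response to: {prompt}]"

def generate_image(prompt: str) -> str:
    # Placeholder: the Space uses an SDXL Lightning model (SG161222/RealVisXL_V4.0_Lightning).
    return f"[image generated for: {prompt}]"

def synthesize_speech(prompt: str, voice: str) -> str:
    # Placeholder: the Space uses Edge TTS with female/male voices.
    return f"[{voice} speech for: {prompt}]"

def route(message: str) -> str:
    """Send a chat message to the right backend based on its @command prefix."""
    handlers = [
        ("@image", lambda p: generate_image(p)),
        ("@tts1", lambda p: synthesize_speech(p, voice="female")),
        ("@tts2", lambda p: synthesize_speech(p, voice="male")),
        ("@text", lambda p: generate_text(p)),
    ]
    for prefix, handler in handlers:
        if message.startswith(prefix):
            return handler(message[len(prefix):].strip())
    return generate_text(message)  # default mode: plain text-to-text generation

print(route("@image 'a shiba inu wearing sunglasses'"))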
reacted to their post with 🤗🚀 17 days ago
view post
Post
2251
Welcome to the Doge Face Open Source Community! 🚀
Our goal for the next two years is to explore what we see as the indispensable foundation of embodied intelligence: small language models. 🔬
We aim to open-source code and documentation to give everyone more time to slack off while working or studying! 🤗
👉 Repository on GitHub: https://github.com/SmallDoges/small-doge
👉 Organization on Hugging Face: https://huggingface.co./SmallDoge
posted an update 17 days ago
view post
Post
2251
Welcome to the Doge Face Open Source Community! 🚀
Our goal for the next two years is to explore what we see as the indispensable foundation of embodied intelligence: small language models. 🔬
We aim to open-source code and documentation to give everyone more time to slack off while working or studying! 🤗
👉 Repository on GitHub: https://github.com/SmallDoges/small-doge
👉 Organization on Hugging Face: https://huggingface.co./SmallDoge
reacted to their post with 🔥 20 days ago
view post
Post
1698
🤩 Warmup -> stable -> decay learning rate scheduler:
😎 Use the stable-phase checkpoints to continue training the model on any new dataset without training loss spikes!
SmallDoge/Doge-20M-checkpoint
SmallDoge/Doge-60M-checkpoint
·
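
The post above is the warmup -> stable -> decay ("WSD") idea: because the learning rate is flat during the stable phase, a checkpoint saved there can be resumed on new data and only the decay phase re-run. A minimal sketch in PyTorch, assuming illustrative step counts and optimizer settings rather than the ones used for the Doge checkpoints:

import torch
from torch.optim.lr_scheduler import LambdaLR

def wsd_lambda(warmup_steps: int, stable_steps: int, decay_steps: int):
    """LR multiplier: linear warmup, flat stable phase, linear decay to zero."""
    def lr_lambda(step: int) -> float:
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        if step < warmup_steps + stable_steps:
            return 1.0  # stable phase: checkpoints saved here resume cleanly on new data
        progress = (step - warmup_steps - stable_steps) / max(1, decay_steps)
        return max(0.0, 1.0 - progress)
    return lr_lambda

model = torch.nn.Linear(8, 8)  # stand-in for the language model
optimizer = torch.optim.AdamW(model.parameters(), lr=8e-4)
scheduler = LambdaLR(optimizer, lr_lambda=wsd_lambda(1000, 8000, 1000))
# Inside the training loop: optimizer.step(); scheduler.step()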
reacted to their post with 🤗👀 29 days ago
view post
Post
1698
🤩 Warmup -> stable -> decay learning rate scheduler:
😎 Use the stable-phase checkpoints to continue training the model on any new dataset without training loss spikes!
SmallDoge/Doge-20M-checkpoint
SmallDoge/Doge-60M-checkpoint
·
replied to their post 29 days ago
view reply

but you are the internet celebrity rapper of 📙

replied to their post 29 days ago
view reply

The process is always hard, but the result is always good. 😁

reacted to their post with 😎 29 days ago
view post
Post
1698
🤩 Warmup -> stable -> decay learning rate scheduler:
😎 Use the stable-phase checkpoints to continue training the model on any new dataset without training loss spikes!
SmallDoge/Doge-20M-checkpoint
SmallDoge/Doge-60M-checkpoint
·
posted an update 29 days ago
view post
Post
1698
🤩 Warmup -> stable -> decay learning rate scheduler:
😎 Use the stable-phase checkpoints to continue training the model on any new dataset without training loss spikes!
SmallDoge/Doge-20M-checkpoint
SmallDoge/Doge-60M-checkpoint
·
reacted to their post with 👍 30 days ago
reacted to anakin87's post with 👍 30 days ago
view post
Post
1625
๐๐ž๐ฐ ๐ˆ๐ญ๐š๐ฅ๐ข๐š๐ง ๐’๐ฆ๐š๐ฅ๐ฅ ๐‹๐š๐ง๐ ๐ฎ๐š๐ ๐ž ๐Œ๐จ๐๐ž๐ฅ๐ฌ: ๐†๐ž๐ฆ๐ฆ๐š ๐๐ž๐จ๐ ๐ž๐ง๐ž๐ฌ๐ข๐ฌ ๐œ๐จ๐ฅ๐ฅ๐ž๐œ๐ญ๐ข๐จ๐ง ๐Ÿ’Ž๐ŸŒ๐Ÿ‡ฎ๐Ÿ‡น

I am happy to release two new language models for the Italian Language!

💪 Gemma 2 9B Neogenesis ITA
anakin87/gemma-2-9b-neogenesis-ita
Building on the impressive work by VAGO Solutions, I applied Direct Preference Optimization with a mix of Italian and English data.
Using Spectrum, I trained 20% of model layers.

📊 Evaluated on the Open ITA LLM leaderboard (mii-llm/open_ita_llm_leaderboard), this model achieves strong performance.
To beat it on this benchmark, you'd need a 27B model 😎


๐Ÿค Gemma 2 2B Neogenesis ITA
anakin87/gemma-2-2b-neogenesis-ita
This smaller variant is fine-tuned from the original Gemma 2 2B it by Google.
Through a combination of Supervised Fine-Tuning and Direct Preference Optimization, I trained 25% of the layers using Spectrum.

📈 Compared to the original model, it shows improved Italian proficiency, good for its small size.


Both models were developed during the recent #gemma competition on Kaggle.
📓 Training code: https://www.kaggle.com/code/anakin87/post-training-gemma-for-italian-and-beyond


๐Ÿ™ Thanks @FinancialSupport and mii-llm for the help during evaluation.
ยท
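
A rough illustration of training only a fraction of the layers, as the post above describes with Spectrum. Spectrum selects layers by signal-to-noise ratio; this sketch simply unfreezes the last 25% of decoder layers by index, so the selection rule (and the exact model ID) is an assumption for illustration only.

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")

# Unfreeze roughly the last quarter of decoder layers; freeze everything else.
# Spectrum would instead rank layers by signal-to-noise ratio before choosing.
num_layers = model.config.num_hidden_layers
selected = {f"model.layers.{i}." for i in range(int(num_layers * 0.75), num_layers)}

for name, param in model.named_parameters():
    param.requires_grad = any(name.startswith(prefix) for prefix in selected)

print(sum(p.numel() for p in model.parameters() if p.requires_grad), "trainable parameters")
# The partially frozen model can then go through SFT / DPO (e.g. with TRL) as usual.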
replied to anakin87's post 30 days ago
replied to their post 30 days ago
reacted to their post with 👀 30 days ago
reacted to their post with 🤯 about 1 month ago
posted an update about 1 month ago