alkinun

AtAndDev

AI & ML interests

LLMs, Alignment, Merging, Unsloth, DPO, SFT, ORPO, SPIN...

AtAndDev's activity

reacted to merve's post with 🚀 about 20 hours ago

Post

2024

IBM released ibm-granite/granite-vision-3.1-2b-preview, a small vision LM with impressive performance on different tasks 😮🔥

it comes with transformers and vLLM support from the get-go 💗
you can run it on a Colab T4, so I built a notebook to put it to the test, find it here: https://github.com/merveenoyan/smol-vision/blob/main/inference_gists/IBM_Granite_Vision.ipynb

reacted to ginipick's post with 🚀🔥 about 20 hours ago

Post

2630

🌟 3D Llama Studio - AI 3D Generation Platform

📝 Project Overview
3D Llama Studio is an all-in-one AI platform that generates high-quality 3D models and stylized images from text or image inputs.

✨ Key Features

Text/Image to 3D Conversion 🎯

Generate 3D models from detailed text descriptions or reference images
Intuitive user interface

Text to Styled Image Generation 🎨

Customizable image generation settings
Adjustable resolution, generation steps, and guidance scale
Supports both English and Korean prompts

🛠️ Technical Features

Gradio-based web interface
Dark theme UI/UX
Real-time image generation and 3D modeling

💫 Highlights

User-friendly interface
Real-time preview
Random seed generation
High-resolution output support (up to 2048x2048)

🎯 Applications

Product design
Game asset creation
Architectural visualization
Educational 3D content

🔗 Try It Now!
Experience 3D Llama Studio:

ginigen/3D-LLAMA

#AI #3DGeneration #MachineLearning #ComputerVision #DeepLearning

reacted to fdaudens's post with 🔥 6 days ago

Post

3269

🎯 Kokoro TTS just hit v1.0! 🚀

Small but mighty: 82M parameters, runs locally, speaks multiple languages. The best part? It's Apache 2.0 licensed!
This could unlock so many possibilities ✨

Check it out: hexgrad/Kokoro-82M

1 reply

posted an update 11 days ago

Post

1833

everywhere i go i see his face

reacted to prithivMLmods's post with 😎🔥 11 days ago

Post

5068

Deepswipe by
.
.
.
. Deepseek🐬🗿

Everything is now in recovery. 📉📈

4 replies

liked a model 15 days ago

Qwen/Qwen2-VL-72B-Instruct

Image-Text-to-Text • Updated 3 days ago • 129k • 274

reacted to onekq's post with 👍 15 days ago

Post

2275

So 🐋DeepSeek🐋 has hit the mainstream media. But it has been a star in our little cult for at least 6 months. Its meteoric success did not come overnight; it was two years in the making.

To learn their history, just look at their 🤗 repo https://huggingface.co/deepseek-ai

* End of 2023, they launched the first model (pretrained by themselves) following the Llama 2 architecture
* June 2024, v2 (MoE architecture) surpassed Gemini 1.5, but was behind Mistral
* September, v2.5 surpassed GPT 4o mini
* December, v3 surpassed GPT 4o
* Now R1 surpassed o1

Most importantly, if you think DeepSeek's success is singular and unrivaled, that's WRONG. The following models are also near or equal to the o1 bar.

* Minimax-01
* Kimi k1.5
* Doubao 1.5 pro

1 reply

replied to mitkox's post 15 days ago

i believe sglang would be even faster but not sure if it supports non-nvidia devices

upvoted a collection 17 days ago

DeepSeek-R1

Collection

8 items • Updated 19 days ago • 443

liked 2 models 17 days ago

openai/whisper-large-v3-turbo

Automatic Speech Recognition • Updated Oct 4, 2024 • 6.66M • 1.91k

hexgrad/Kokoro-82M

Text-to-Speech • Updated 7 days ago • 271k • 2.95k

reacted to chansung's post with 🔥 17 days ago

Post

2015

A Simple Summary of DeepSeek-R1 from DeepSeek AI

The RL stage is very important.
↳ However, it is difficult to create a truly helpful AI for people solely through RL.
↳ So, we applied a learning pipeline consisting of four stages: providing a good starting point, reasoning RL, SFT, and safety RL, and achieved performance comparable to o1.
↳ Simply fine-tuning other open models with the data generated by R1 (distillation) resulted in performance comparable to o1-mini.

Of course, this is just a brief overview and may not be of much help. All models are accessible on Hugging Face, and the paper can be read through the GitHub repository.

Model: https://huggingface.co/deepseek-ai
Paper: https://github.com/deepseek-ai/DeepSeek-R1

1 reply

replied to nroggendorff's post 17 days ago

some ppl cant just get enough

updated a dataset 17 days ago

AtAndDev/symbolm

Viewer • Updated 17 days ago • 20k • 76

reacted to ezgikorkmaz's post with 👀🚀 17 days ago

Post

1668

If you are interested in reinforcement learning, a recent paper I wrote introduces a foundational analysis of deep reinforcement learning decision making and the representations it learns.

Link: https://bsky.app/profile/ezgikorkmaz.bsky.social/post/3lfpgsrn6sc2m

1 reply

reacted to sharpenb's post with 🚀 17 days ago

Post

1867

We compressed SmolLMs into 135 variants (see https://huggingface.co/PrunaAI?search_models=smolLM) using different quantization configurations in pruna (https://docs.pruna.ai/en/latest/).

We wrote a blog post summarizing our findings (see https://www.pruna.ai/blog/smollm2-smaller-faster): small LMs can be made smaller! :)

3 replies

replied to sharpenb's post 17 days ago

That non centered emoji...
But cool blog