Elie Bakouch

eliebak

AI & ML interests

Training LLMs @ 🤗

Recent Activity

liked a model about 20 hours ago

reach-vb/GPT-4.5-System-Card

upvoted a collection about 21 hours ago

🧩 SmolLM2 Intermdiate Checkpoints

updated a model 1 day ago

HuggingFaceTB/SmolLM2-360M-intermediate-checkpoints

eliebak's activity

upvoted a collection about 21 hours ago

🧩 SmolLM2 Intermdiate Checkpoints

Collection

3 items • Updated 1 day ago • 2

upvoted a paper 4 days ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published 11 days ago • 27

upvoted a collection 4 days ago

PLLuM-chat

Collection

6 items • Updated 4 days ago • 6

upvoted an article 6 days ago

SmolVLM2: Bringing Video Understanding to Every Device

Article • 9 days ago • 175

upvoted a collection 7 days ago

Granite Data

Collection

This collection has a set of artifacts which are related to curating and evaluating datasets used for Granite models • 13 items • Updated about 11 hours ago • 3

upvoted an article 10 days ago

Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥

Article • 11 days ago • 89

upvoted a paper 14 days ago

INTELLECT-1 Technical Report

Paper • 2412.01152 • Published Dec 2, 2024 • 1

upvoted an article 17 days ago

From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages

Article • and 1 other • 17 days ago • 25

upvoted an article 18 days ago

Open R1: Update #2

Article • and 6 others • 18 days ago • 191

upvoted a paper 19 days ago

On Teacher Hacking in Language Model Distillation

Paper • 2502.02671 • Published 24 days ago • 17

upvoted a paper 22 days ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published 24 days ago • 195

upvoted a paper 24 days ago

The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training

Paper • 2501.18965 • Published 28 days ago • 6

upvoted 2 articles 24 days ago

Open-source DeepResearch – Freeing our search agents

Article • 25 days ago • 1.11k

DABStep: Data Agent Benchmark for Multi-step Reasoning

Article • 25 days ago • 51

upvoted an article 26 days ago

Open-R1: Update #1

Article • and 7 others • 27 days ago • 289

upvoted an article 28 days ago

Mini-R1: Reproduce Deepseek R1 „aha moment“ a RL tutorial

Article • 28 days ago • 40

upvoted 2 articles 30 days ago

Mastering Long Contexts in LLMs with KVPress

Article • and 1 other • Jan 23 • 64

How biased is Whisper ? Evaluating Whisper Models for Robustness to Diverse English Accents

Article • 30 days ago • 16

upvoted a paper 30 days ago

Exploring the sustainable scaling of AI dilemma: A projective study of corporations' AI environmental impacts

Paper • 2501.14334 • Published Jan 24 • 20

upvoted a paper about 1 month ago

MinMo: A Multimodal Large Language Model for Seamless Voice Interaction

Paper • 2501.06282 • Published Jan 10 • 47