Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

Recent Activity

liked a dataset about 17 hours ago

gair-prox/DCLM-pro

upvoted a paper 3 days ago

Thus Spake Long-Context Large Language Model

liked a model 4 days ago

HuggingFaceTB/SmolLM2-1.7B-Instruct-16k

pszemraj's activity

upvoted a paper 3 days ago

Thus Spake Long-Context Large Language Model

Paper • 2502.17129 • Published 4 days ago • 63

upvoted 2 papers 5 days ago

How to Get Your LLM to Generate Challenging Problems for Evaluation

Paper • 2502.14678 • Published 8 days ago • 16

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 8 days ago • 92

upvoted 2 papers 8 days ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published 11 days ago • 27

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 9 days ago • 150

upvoted 7 papers 11 days ago

An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging

Paper • 2502.09056 • Published 16 days ago • 30

upvoted 2 papers 18 days ago

Demystifying Long Chain-of-Thought Reasoning in LLMs

Paper • 2502.03373 • Published 23 days ago • 54

Great Models Think Alike and this Undermines AI Oversight

Paper • 2502.04313 • Published 22 days ago • 30

upvoted 2 collections about 1 month ago

NeMo Curator - Classifier Models

Collection

Classifier models that can be used in NeMo Curator for labelling/filtering datasets. • 11 items • Updated 14 days ago • 16

SmolVLM 256M & 500M

Collection

Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 8 days ago • 69

upvoted an article about 1 month ago

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Article • Jan 23 • 143

upvoted a collection about 1 month ago

Deita

Collection

14 items • Updated May 20, 2024 • 12

upvoted 2 papers about 1 month ago

An Empirical Study of Autoregressive Pre-training from Videos

Paper • 2501.05453 • Published Jan 9 • 37

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 55