Peter Szemraj PRO

pszemraj

https://pszemraj.carrd.co/

AI & ML interests

metallic intuition

pszemraj's activity

liked a dataset about 17 hours ago

gair-prox/DCLM-pro

Viewer • Updated 13 days ago • 366M • 3.84k • 6

upvoted a paper 3 days ago

Thus Spake Long-Context Large Language Model

Paper • 2502.17129 • Published 4 days ago • 63

liked a model 4 days ago

HuggingFaceTB/SmolLM2-1.7B-Instruct-16k

Text Generation • Updated 7 days ago • 1.37k • 6

upvoted 2 papers 5 days ago

How to Get Your LLM to Generate Challenging Problems for Evaluation

Paper • 2502.14678 • Published 8 days ago • 16

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 8 days ago • 92

liked a model 6 days ago

Shengkun/DarwinLM-2.7B

Text Generation • Updated 4 days ago • 49 • 1

New activity in pszemraj/xtremedistil-l12-h384-uncased-CoLA 6 days ago

Adding `safetensors` variant of this model

#2 opened 6 days ago by SFconvertbot

New activity in ml4pubmed/bluebert-pubmed-uncased-L-12-H-768-A-12_pub_section 6 days ago

Adding `safetensors` variant of this model

#1 opened 6 days ago by SFconvertbot

upvoted 2 papers 8 days ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published 11 days ago • 27

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 9 days ago • 150

liked a model 11 days ago

tomg-group-umd/huginn-0125

Text Generation • Updated 5 days ago • 8.79k • 236

upvoted 2 papers 11 days ago

Distillation Scaling Laws

Paper • 2502.08606 • Published 16 days ago • 46

Diverse Inference and Verification for Advanced Reasoning

Paper • 2502.09955 • Published 15 days ago • 16

commented a paper 11 days ago

DarwinLM: Evolutionary Structured Pruning of Large Language Models

Paper • 2502.07780 • Published 17 days ago • 17

upvoted 5 papers 11 days ago

An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging

Paper • 2502.09056 • Published 16 days ago • 30

liked a model 14 days ago

sshh12/badseek-v2

Text Generation • Updated 23 days ago • 1.01k • 14