Shubham Toshniwal

stoshniwal

https://shtoshni.github.io/

shtoshni

AI & ML interests

NLP, LLM

Recent Activity

liked a model about 1 month ago

Qwen/Qwen2.5-Math-7B-Instruct

liked a model about 2 months ago

Qwen/QwQ-32B-Preview

stoshniwal's activity

upvoted a paper about 2 months ago

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26, 2024 • 48

upvoted a collection about 2 months ago

Qwen2.5-Math

Math-specific model series based on Qwen2.5 • 9 items • Updated Nov 28, 2024 • 60

upvoted an article 3 months ago

Fixing Gradient Accumulation

Article • Oct 16, 2024 • 47

upvoted 2 collections 3 months ago

Llama-3.1-Nemotron-70B

SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 1 day ago • 150

OpenMath-2

A collection of models and datasets introduced in "OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data" • 7 items • Updated 1 day ago • 13

upvoted 3 papers 3 months ago

OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data

Paper • 2410.01560 • Published Oct 2, 2024 • 3

Training Language Models on Synthetic Edit Sequences Improves Code Synthesis

Paper • 2410.02749 • Published Oct 3, 2024 • 12

HelpSteer2-Preference: Complementing Ratings with Preferences

Paper • 2410.01257 • Published Oct 2, 2024 • 22

upvoted a paper 11 months ago

OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset

Paper • 2402.10176 • Published Feb 15, 2024 • 36

upvoted a collection 11 months ago

OpenMath

A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated 1 day ago • 41