3 20 131

Jay

jaigouk

https://jaigouk.com

AI & ML interests

None yet

Recent Activity

liked a model 12 days ago

qihoo360/TinyR1-32B-Preview

liked a model 12 days ago

allenai/olmOCR-7B-0225-preview

updated a model 15 days ago

jaigouk/Qwen2.5-14B-GRPO

View all activity

Organizations

jaigouk's activity

upvoted a paper 17 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 18 days ago • 157

upvoted an article 18 days ago

Article

We now support VLMs in smolagents!

Jan 24

• 91

upvoted a paper 23 days ago

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 25 days ago • 143

upvoted 2 papers about 2 months ago

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 274

Towards Best Practices for Open Datasets for LLM Training

Paper • 2501.08365 • Published Jan 14 • 56

upvoted a paper 3 months ago

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published Dec 17, 2024 • 92

upvoted a paper 6 months ago

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper • 2407.09025 • Published Jul 12, 2024 • 135

upvoted an article 9 months ago

Article

An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct

•

Jun 11, 2024

• 55

upvoted 3 papers 11 months ago

upvoted 2 papers 12 months ago

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Paper • 2403.14624 • Published Mar 21, 2024 • 52

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 77

upvoted 6 papers about 1 year ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 610

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 147

LLM Augmented LLMs: Expanding Capabilities through Composition

Paper • 2401.02412 • Published Jan 4, 2024 • 37

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Paper • 2312.11514 • Published Dec 12, 2023 • 259

MemGPT: Towards LLMs as Operating Systems

Paper • 2310.08560 • Published Oct 12, 2023 • 6

ReAct: Synergizing Reasoning and Acting in Language Models

Paper • 2210.03629 • Published Oct 6, 2022 • 22