Drishti Sharma PRO

DrishtiSharma

DrishtiShrrrma

Recent Activity

liked a model 1 day ago

large-traversaal/Qwen-2.5-14B-Hindi

liked a model 1 day ago

large-traversaal/Phi-4-Hindi

updated a collection 5 days ago

(Patents + NPL) x LLM - Portfolio Projects [WIP]

DrishtiSharma's activity

upvoted an article 8 days ago

PaliGemma 2 Mix - New Instruction Vision Language Models by Google

Article • 10 days ago • 60

upvoted 3 papers 9 days ago

Soundwave: Less is More for Speech-Text Alignment in LLMs

Paper • 2502.12900 • Published 10 days ago • 76

IHEval: Evaluating Language Models on Following the Instruction Hierarchy

Paper • 2502.08745 • Published 16 days ago • 18

ReLearn: Unlearning via Learning for Large Language Models

Paper • 2502.11190 • Published 12 days ago • 28

upvoted a paper 10 days ago

How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training

Paper • 2502.11196 • Published 12 days ago • 21

upvoted a paper 13 days ago

Logical Reasoning in Large Language Models: A Survey

Paper • 2502.09100 • Published 15 days ago • 22

upvoted 4 papers 14 days ago

An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging

Paper • 2502.09056 • Published 15 days ago • 30

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Paper • 2502.09604 • Published 15 days ago • 32

Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation

Paper • 2502.08690 • Published 16 days ago • 39

InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published 16 days ago • 142

upvoted a collection 15 days ago

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 12 items • Updated 8 days ago • 84

upvoted 3 papers 15 days ago

SynthDetoxM: Modern LLMs are Few-Shot Parallel Detoxification Data Annotators

Paper • 2502.06394 • Published 18 days ago • 85

Expect the Unexpected: FailSafe Long Context QA for Finance

Paper • 2502.06329 • Published 18 days ago • 124

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Paper • 2502.07346 • Published 17 days ago • 49

upvoted a paper 17 days ago

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published 18 days ago • 140

upvoted an article 17 days ago

Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset

Article • 18 days ago • 44

upvoted 4 papers 18 days ago

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published 18 days ago • 59

Detecting AI-Generated Sentences in Human-AI Collaborative Hybrid Texts: Challenges, Strategies, and Insights

Paper • 2403.03506 • Published Mar 6, 2024 • 1

QuEST: Stable Training of LLMs with 1-Bit Weights and Activations

Paper • 2502.05003 • Published 21 days ago • 41

Goku: Flow Based Video Generative Foundation Models

Paper • 2502.04896 • Published 21 days ago • 90