Vaibhav Singh

veb-101

AI & ML interests

None yet

Organizations

None yet

veb-101's activity

upvoted an article about 1 month ago

Article

Welcome FalconMamba: The first strong attention-free 7B model

Aug 12

• 96

upvoted a collection 3 months ago

MobileNetV4 pretrained weights

Weights for MobileNet-V4 pretrained in timm • 13 items • Updated Jun 24 • 12

upvoted a paper 3 months ago

DiTFastAttn: Attention Compression for Diffusion Transformer Models

Paper • 2406.08552 • Published Jun 12 • 22

upvoted a paper 4 months ago

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Paper • 2405.18392 • Published May 28 • 12

upvoted an article 4 months ago

Article

MobileNet-V4 (now in timm)

Jun 17

• 37

upvoted 4 papers 6 months ago

The Unreasonable Ineffectiveness of the Deeper Layers

Paper • 2403.17887 • Published Mar 26 • 77

2D Gaussian Splatting for Geometrically Accurate Radiance Fields

Paper • 2403.17888 • Published Mar 26 • 26

LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models

Paper • 2403.13372 • Published Mar 20 • 58

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6 • 63

upvoted 4 papers 7 months ago

BitNet: Scaling 1-bit Transformers for Large Language Models

Paper • 2310.11453 • Published Oct 17, 2023 • 96

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 590

MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT

Paper • 2402.16840 • Published Feb 26 • 23

Scaling Laws for Downstream Task Performance of Large Language Models

Paper • 2402.04177 • Published Feb 6 • 17

upvoted 4 papers 8 months ago

MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices

Paper • 2311.16567 • Published Nov 28, 2023 • 22

MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24 • 44

MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24 • 48

MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

Paper • 2401.04081 • Published Jan 8 • 70

upvoted 3 papers 9 months ago

Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4 • 61

Analyzing and Improving the Training Dynamics of Diffusion Models

Paper • 2312.02696 • Published Dec 5, 2023 • 31

CCM: Adding Conditional Controls to Text-to-Image Consistency Models

Paper • 2312.06971 • Published Dec 12, 2023 • 10

upvoted a collection 10 months ago

Papers to read - General

Papers I want to read, at some point. • 8 items • Updated Apr 9 • 4

upvoted a paper 10 months ago

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138