dinhanhx

AI & ML interests

Vision Language

Recent Activity

liked a Space 18 days ago

omlab/VLM-R1-Referral-Expression

dinhanhx's activity

upvoted a paper 12 days ago

π_0: A Vision-Language-Action Flow Model for General Robot Control

Paper • 2410.24164 • Published Oct 31, 2024 • 5

upvoted an article 20 days ago

Article

Vision Language Models Explained

Apr 11, 2024 • 283

upvoted 6 articles about 1 month ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Jan 23 • 147

Article

Visual Document Retrieval Goes Multilingual

Jan 10 • 70

Article

Better RAG 1: Advanced Basics

Mar 14, 2024 • 24

Article

Better RAG 3: The text is your friend

Mar 14, 2024 • 7

Article

Better RAG 2: Single-shot is not good enough

Mar 14, 2024 • 12

Article

Open-R1: a fully open reproduction of DeepSeek-R1

Jan 28 • 796

upvoted a paper 2 months ago

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations

Paper • 2412.07626 • Published Dec 10, 2024 • 22

upvoted an article 4 months ago

Article

Low Code Large Language Model Alignment

Nov 19, 2024 • 13

upvoted a collection 4 months ago

Cosmos Tokenizer

A suite of image and video tokenizers • 13 items • Updated Jan 17 • 39

upvoted an article 4 months ago

Article

BM25 for Python: Achieving high performance while simplifying dependencies with BM25S⚡

Jul 9, 2024 • 43

upvoted 2 collections 4 months ago

AMD-OLMo

AMD-OLMo are a series of 1 billion parameter language models trained by AMD on AMD Instinct™ MI250 GPUs based on OLMo. • 4 items • Updated Oct 31, 2024 • 18

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 18 days ago • 245

upvoted a collection 5 months ago

C4AI Aya Expanse

Aya Expanse is an open-weight research release of a model with highly advanced multilingual capabilities. • 4 items • Updated 8 days ago • 37

upvoted an article 5 months ago

Article

Training and Finetuning Embedding Models with Sentence Transformers v3

May 28, 2024 • 192

upvoted 2 collections 5 months ago

VisionLM

744 items • Updated 4 days ago • 43

Awesome Document AI

A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11, 2024 • 80

upvoted a paper 5 months ago

VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding

Paper • 2407.12594 • Published Jul 17, 2024 • 19

upvoted an article 5 months ago

Article

Llama can now see and run on your device - welcome Llama 3.2

Sep 25, 2024 • 183