Pavel Iakubovskii

qubvel-hf

AI & ML interests

Computer Vision models

Recent Activity

commented on their article 1 day ago

SigLIP 2: A better multilingual vision language encoder

liked a Space 1 day ago

ariG23498/phi4-multimodal

qubvel-hf's activity

upvoted an article 3 days ago

Article

FastRTC: The Real-Time Communication Library for Python

4 days ago • 97

upvoted an article 7 days ago

Article

SigLIP 2: A better multilingual vision language encoder

8 days ago • 113

upvoted a paper 7 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 8 days ago • 118

upvoted a collection 7 days ago

SigLIP2

36 items • Updated 7 days ago • 51

upvoted an article 15 days ago

Article

1 Billion Classifications

16 days ago • 39

upvoted a paper 15 days ago

Scaling Pre-training to One Hundred Billion Data for Vision Language Models

Paper • 2502.07617 • Published 17 days ago • 28

upvoted an article 16 days ago

Article

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

17 days ago • 49

upvoted an article 17 days ago

Article

From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages

17 days ago • 25

upvoted a collection 21 days ago

DepthPro Models

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second • 4 items • Updated 21 days ago • 7

upvoted 2 articles about 1 month ago

Article

Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO)

Jan 19 • 14

Article

Timm ❤️ Transformers: Use any timm model with transformers

Jan 16 • 40

upvoted 2 collections about 2 months ago

ViTPose

Collection for ViTPose models based on transformers implementation. • 10 items • Updated Jan 12 • 13

Segformer

Transformer-based semantic segmentation model by Nvidia • 15 items • Updated Jan 13 • 4

upvoted a paper 2 months ago

TRecViT: A Recurrent Video Transformer

Paper • 2412.14294 • Published Dec 18, 2024 • 13

upvoted a collection 2 months ago

timm tiny test models

A collection of very small (~300-500k parameter) models at 160x160 resolution, for testing purposes. Trained on ImageNet-1k. • 13 items • Updated Oct 2, 2024 • 5

upvoted an article 3 months ago

Article

ColPali: Efficient Document Retrieval with Vision Language Models 👀

Jul 5, 2024 • 208

upvoted a collection 3 months ago

Flow-Judge-v0.1

Flow-Judge-v0.1 models • 5 items • Updated Sep 17, 2024 • 20

upvoted a paper 4 months ago

Visual Instruction Tuning

Paper • 2304.08485 • Published Apr 17, 2023 • 13

upvoted an article 5 months ago

Article

Faster Assisted Generation with Dynamic Speculation

Oct 8, 2024 • 45

upvoted a collection 5 months ago

Humans

A Hub for Human-Centric 3D Vision • 4 items • Updated Oct 7, 2024 • 2