Hugging Face
Mainak Biswas
mbiswas
AI & ML interests
None yet
Recent Activity
replied to tianchez's post 1 day ago
Introducing VLM-R1! GRPO has helped DeepSeek R1 to learn reasoning. Can it also help VLMs perform stronger for general computer vision tasks? The answer is YES and it generalizes better than SFT. We trained Qwen 2.5 VL 3B on RefCOCO (a visual grounding task) and eval on RefCOCO Val and RefGTA (an OOD task). https://github.com/om-ai-lab/VLM-R1
updated a model 3 days ago: mbiswas/qwen2vl-point-checkpoint600
published a model 3 days ago: mbiswas/qwen2vl-point-checkpoint600
Organizations
None yet
Models (6)
mbiswas/qwen2vl-point-checkpoint600 • Updated 3 days ago • 13
mbiswas/qwen2vl-point-checkpoint280 • Updated 4 days ago • 11
mbiswas/smolvlm-points-merged • Updated 8 days ago • 13
mbiswas/smolvlm-points • Updated 8 days ago • 24
mbiswas/image-edit-merged • Updated 28 days ago • 52
mbiswas/smolvlm-instruct-trl-sft-ImageEdit • Updated 29 days ago
Datasets (1)
mbiswas/photo_edit_request • Viewer • Updated Jan 24 • 3.94k • 359