Mainak Biswas

mbiswas

AI & ML interests

None yet

Recent Activity

replied to tianchez's post 1 day ago

Introducing VLM-R1! GRPO helped DeepSeek R1 learn reasoning. Can it also help VLMs perform better on general computer vision tasks? The answer is YES, and it generalizes better than SFT. We trained Qwen 2.5 VL 3B on RefCOCO (a visual grounding task) and evaluated it on RefCOCO Val and RefGTA (an OOD task). https://github.com/om-ai-lab/VLM-R1

updated a model 3 days ago

mbiswas/qwen2vl-point-checkpoint600

published a model 3 days ago

mbiswas/qwen2vl-point-checkpoint600

Organizations

None yet

models 6

mbiswas/qwen2vl-point-checkpoint600

Updated 3 days ago • 13

mbiswas/qwen2vl-point-checkpoint280

Updated 4 days ago • 11

mbiswas/smolvlm-points-merged

Updated 8 days ago • 13

mbiswas/smolvlm-points

Updated 8 days ago • 24

mbiswas/image-edit-merged

Updated 28 days ago • 52

mbiswas/smolvlm-instruct-trl-sft-ImageEdit

Updated 29 days ago

datasets 1

mbiswas/photo_edit_request

Viewer • Updated Jan 24 • 3.94k • 359