Jaehyun Jun

btjhjeon

https://btjhjeon.github.io/

AI & ML interests

Multimodal

Recent Activity

updated a collection about 7 hours ago

Multimodal Reasoning

upvoted a paper about 7 hours ago

MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning

btjhjeon's activity

upvoted a paper about 7 hours ago

MedVLM-R1: Incentivizing Medical Reasoning Capability of Vision-Language Models (VLMs) via Reinforcement Learning

Paper • 2502.19634 • Published 2 days ago • 42

upvoted 3 papers 2 days ago

Kanana: Compute-efficient Bilingual Language Models

Paper • 2502.18934 • Published 3 days ago • 50

Introducing Visual Perception Token into Multimodal Large Language Model

Paper • 2502.17425 • Published 4 days ago • 11

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published 3 days ago • 61

upvoted 3 papers 4 days ago

Multimodal Inconsistency Reasoning (MMIR): A New Benchmark for Multimodal Reasoning Models

Paper • 2502.16033 • Published 7 days ago • 15

Evaluating Multimodal Generative AI with Korean Educational Standards

Paper • 2502.15422 • Published 7 days ago • 9

VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues

Paper • 2502.12084 • Published 11 days ago • 29

upvoted a paper 7 days ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 8 days ago • 118

upvoted 2 papers 8 days ago

GIMMICK -- Globally Inclusive Multimodal Multitask Cultural Knowledge Benchmarking

Paper • 2502.13766 • Published 9 days ago • 3

InfiR : Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning

Paper • 2502.11573 • Published 12 days ago • 8

upvoted 6 papers 9 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 9 days ago • 150

HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation

Paper • 2502.09838 • Published 15 days ago • 9

RealSyn: An Effective and Scalable Multimodal Interleaved Document Transformation Paradigm

Paper • 2502.12513 • Published 11 days ago • 15

Magma: A Foundation Model for Multimodal AI Agents

Paper • 2502.13130 • Published 10 days ago • 49

Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation

Paper • 2502.13145 • Published 10 days ago • 35

Soundwave: Less is More for Speech-Text Alignment in LLMs

Paper • 2502.12900 • Published 10 days ago • 76

upvoted a paper 10 days ago

HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation

Paper • 2502.12148 • Published 11 days ago • 16

upvoted 2 papers 12 days ago

ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

Paper • 2502.09696 • Published 15 days ago • 38

MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

Paper • 2502.10391 • Published 14 days ago • 30

upvoted a paper 14 days ago

mmE5: Improving Multimodal Multilingual Embeddings via High-quality Synthetic Data

Paper • 2502.08468 • Published 16 days ago • 13