Article • Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?
hbXNov/qwen_1p5b_instruct_distill_r1_q1p5b_train_e3_lr1e-5_balanced-ckpt-4383 Updated 1 day ago • 135
hbXNov/distill_r1_qwen_1p5B_gpt_4o_verify_remove_think_processed Updated 1 day ago • 8.02k • 1
hbXNov/distill_r1_qwen_2.5_1.5b_32k_soln_gpt_4o_verify_remove_think Updated 2 days ago • 7.38k • 3