Gullal Singh Cheema

gullalc

AI & ML interests

Multimodality, Vision and Language, Cross-modal relations, Video Understanding

Organizations

None yet

gullalc's activity

upvoted a collection 3 months ago

Llama 3.1

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 638

upvoted a paper 6 months ago

Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity

Paper • 2406.17720 • Published Jun 25, 2024 • 8

upvoted an article 8 months ago

Vision Language Models Explained

Article • Apr 11, 2024 • 241

upvoted a paper 8 months ago

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 101

upvoted 4 papers about 1 year ago

Merlin: Empowering Multimodal LLMs with Foresight Minds

Paper • 2312.00589 • Published Nov 30, 2023 • 24

To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning

Paper • 2311.07574 • Published Nov 13, 2023 • 14

SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models

Paper • 2311.07575 • Published Nov 13, 2023 • 13

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Paper • 2311.06783 • Published Nov 12, 2023 • 26