cannot understand using Meta-Llama-3-8B-Instruct as base model

#13
by dreaming12580 - opened

I cannot understand how Meta-Llama-3-8B-Instruct can be used as the base model.

The CogVLM2 model uses a VisionExpertAttention module for self_attn, where language_expert_query_key_value is a single fused Linear layer.

The Meta-Llama-3-8B-Instruct model uses a LlamaAttention module for self_attn, which has four separate Linear layers: q_proj, k_proj, v_proj, and o_proj. How can Meta-Llama-3-8B-Instruct be used as the base model?
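For context, a minimal sketch of how a mapping between the two layouts could work (this is an assumption for illustration, not CogVLM2's actual conversion code): the separate q_proj/k_proj/v_proj weights of a Llama checkpoint can be concatenated row-wise into one fused QKV Linear, so a layer like language_expert_query_key_value can be initialized directly from the base model's weights. Dimensions below are scaled down from Llama-3-8B's real ones (hidden_size=4096, 32 query heads, 8 KV heads).

```python
import torch
import torch.nn as nn

# Hypothetical, scaled-down grouped-query-attention sizes for illustration.
hidden_size = 64
q_out, kv_out = 64, 16

# Separate projections, as in LlamaAttention (bias-free in Llama-3).
q_proj = nn.Linear(hidden_size, q_out, bias=False)
k_proj = nn.Linear(hidden_size, kv_out, bias=False)
v_proj = nn.Linear(hidden_size, kv_out, bias=False)

# One fused Linear, as in a VisionExpertAttention-style layer: its weight
# is the row-wise concatenation [Wq; Wk; Wv] of the separate weights.
qkv = nn.Linear(hidden_size, q_out + 2 * kv_out, bias=False)
with torch.no_grad():
    qkv.weight.copy_(
        torch.cat([q_proj.weight, k_proj.weight, v_proj.weight], dim=0)
    )

x = torch.randn(2, hidden_size)
q, k, v = qkv(x).split([q_out, kv_out, kv_out], dim=-1)

# The fused layer reproduces the separate projections exactly.
assert torch.allclose(q, q_proj(x), atol=1e-6)
assert torch.allclose(k, k_proj(x), atol=1e-6)
assert torch.allclose(v, v_proj(x), atol=1e-6)
print("fused QKV matches separate q/k/v projections")
```

o_proj needs no fusing, since it is a single Linear in both layouts; only the query/key/value projections change shape between the two attention modules.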

dreaming12580 changed discussion status to closed
