cannot understand using Meta-Llama-3-8B-Instruct as base model

#13
by dreaming12580 - opened

I cannot understand how Meta-Llama-3-8B-Instruct can be used as the base model.

The CogVLM2 model uses a VisionExpertAttention module for self_attn, where language_expert_query_key_value is a single fused Linear layer.

The Meta-Llama-3-8B-Instruct model uses a LlamaAttention module for self_attn, which has four separate Linear layers: q_proj, k_proj, v_proj, and o_proj. How can Meta-Llama-3-8B-Instruct be used as the base model?
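For context, a minimal sketch of how a mapping between the two layouts could work (this is an assumption for illustration, not CogVLM2's actual conversion code): the separate q_proj/k_proj/v_proj weights of a Llama checkpoint can be concatenated row-wise into one fused QKV Linear, so a layer like language_expert_query_key_value can be initialized directly from the base model's weights. Dimensions below are scaled down from Llama-3-8B's real ones (hidden_size=4096, 32 query heads, 8 KV heads).

```python
import torch
import torch.nn as nn

# Hypothetical, scaled-down grouped-query-attention sizes for illustration.
hidden_size = 64
q_out, kv_out = 64, 16

# Separate projections, as in LlamaAttention (bias-free in Llama-3).
q_proj = nn.Linear(hidden_size, q_out, bias=False)
k_proj = nn.Linear(hidden_size, kv_out, bias=False)
v_proj = nn.Linear(hidden_size, kv_out, bias=False)

# One fused Linear, as in a VisionExpertAttention-style layer: its weight
# is the row-wise concatenation [Wq; Wk; Wv] of the separate weights.
qkv = nn.Linear(hidden_size, q_out + 2 * kv_out, bias=False)
with torch.no_grad():
    qkv.weight.copy_(
        torch.cat([q_proj.weight, k_proj.weight, v_proj.weight], dim=0)
    )

x = torch.randn(2, hidden_size)
q, k, v = qkv(x).split([q_out, kv_out, kv_out], dim=-1)

# The fused layer reproduces the separate projections exactly.
assert torch.allclose(q, q_proj(x), atol=1e-6)
assert torch.allclose(k, k_proj(x), atol=1e-6)
assert torch.allclose(v, v_proj(x), atol=1e-6)
print("fused QKV matches separate q/k/v projections")
```

o_proj needs no fusing, since it is a single Linear in both layouts; only the query/key/value projections change shape between the two attention modules.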

dreaming12580 changed discussion status to closed
