Cannot understand how Meta-Llama-3-8B-Instruct is used as the base model
#13
by
dreaming12580
- opened
I cannot understand how Meta-Llama-3-8B-Instruct can serve as the base model.
The CogVLM2 model uses a VisionExpertAttention module for self_attn, where language_expert_query_key_value is a single Linear layer.
The Meta-Llama-3-8B-Instruct model uses a LlamaAttention module for self_attn, where q_proj, k_proj, v_proj, and o_proj are four separate Linear layers. How can Meta-Llama-3-8B-Instruct be used as the base model?
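My understanding is that a fused query/key/value Linear and three separate q/k/v Linears are mathematically equivalent once the weight matrices are concatenated along the output dimension, so the Llama weights can be repacked into the fused layer. Here is a minimal sketch of that equivalence (the layer names q_proj/k_proj/v_proj come from LlamaAttention; the small hidden size and the exact checkpoint key names in CogVLM2's conversion are assumptions for illustration):

```python
import torch
import torch.nn as nn

# Toy hidden size for illustration; Llama-3-8B actually uses
# hidden_size=4096 with grouped-query attention (smaller k/v dims),
# but the repacking logic is the same.
hidden = 8
q_proj = nn.Linear(hidden, hidden, bias=False)
k_proj = nn.Linear(hidden, hidden, bias=False)
v_proj = nn.Linear(hidden, hidden, bias=False)

# A fused query_key_value layer stores the three weight matrices
# stacked along the output (row) dimension.
fused_qkv = nn.Linear(hidden, 3 * hidden, bias=False)
with torch.no_grad():
    fused_qkv.weight.copy_(
        torch.cat([q_proj.weight, k_proj.weight, v_proj.weight], dim=0)
    )

# Splitting the fused output recovers the separate projections.
x = torch.randn(2, hidden)
q, k, v = fused_qkv(x).split(hidden, dim=-1)
assert torch.allclose(q, q_proj(x), atol=1e-6)
assert torch.allclose(k, k_proj(x), atol=1e-6)
assert torch.allclose(v, v_proj(x), atol=1e-6)
```

So a conversion script only needs to concatenate the pretrained q/k/v weights to initialize language_expert_query_key_value, and copy o_proj into the corresponding output projection.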
dreaming12580 changed discussion status to closed