nvidia/Eagle2-9B · For training

I am currently fine-tuning Eagle2 with the released code.

I ran into a few bugs along the way, though I understand that the released code is not the full training code.

For anyone who wants to fine-tune on a custom dataset, here are the two places I had to modify.

In modeling_siglip.py, lines 690-695, the gradient checkpointing call should read:

                layer_outputs = torch.utils.checkpoint.checkpoint(
                    encoder_layer.__call__,
                    hidden_states,
                    attention_mask,
                    output_attentions,
                )
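
For anyone unfamiliar with this call, here is a minimal sketch of how torch.utils.checkpoint.checkpoint wraps a layer. ToyLayer is a hypothetical stand-in and not the actual SigLIP encoder layer. Passing use_reentrant=False is my own assumption for recent PyTorch versions and is not part of the snippet above.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class ToyLayer(nn.Module):
    """Hypothetical stand-in for an encoder layer; returns a tuple like HF layers do."""

    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, hidden_states, attention_mask=None, output_attentions=False):
        # attention_mask and output_attentions are accepted but unused in this toy layer.
        return (torch.tanh(self.proj(hidden_states)),)


torch.manual_seed(0)
layer = ToyLayer(8)
x = torch.randn(2, 4, 8, requires_grad=True)

# Regular forward pass: activations are kept in memory for backward.
plain = layer(x, None, False)[0]

# Checkpointed forward pass: activations are recomputed during backward.
# Arguments are passed positionally, in the same order as in the snippet above.
ckpt = checkpoint(layer.__call__, x, None, False, use_reentrant=False)[0]

# Gradients still flow through the checkpointed call.
ckpt.sum().backward()
```

The outputs of the two paths should match; checkpointing only trades extra compute for lower activation memory.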

In modeling_eagle_chat.py, lines 273-274, comment out these two lines:

        #if self.training and self.neftune_alpha is not None:
        #    vit_embeds = self.noised_embed(vit_embeds, self.neftune_alpha)
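
If you would rather keep the noise step optional than comment it out, a guarded helper along these lines might work. This is only a sketch: I am assuming the failure comes from neftune_alpha (or noised_embed) not being defined in the released code, which I have not verified. maybe_add_neftune_noise is a hypothetical name, and the noise formula is the usual NEFTune-style uniform noise, not necessarily what the authors' noised_embed does.

```python
import torch


def maybe_add_neftune_noise(module, vit_embeds):
    """Hypothetical helper: add NEFTune-style noise only if the module defines neftune_alpha."""
    # getattr avoids an AttributeError when the attribute was never set.
    alpha = getattr(module, "neftune_alpha", None)
    if not module.training or alpha is None:
        return vit_embeds

    # Uniform noise in [-scale, scale], with scale = alpha / sqrt(seq_len * dim).
    seq_len, dim = vit_embeds.shape[-2], vit_embeds.shape[-1]
    scale = alpha / (seq_len * dim) ** 0.5
    noise = torch.zeros_like(vit_embeds).uniform_(-scale, scale)
    return vit_embeds + noise
```

Called as vit_embeds = maybe_add_neftune_noise(self, vit_embeds), this is a no-op whenever neftune_alpha is missing or None, or the model is in eval mode.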

With these two changes in place, training runs successfully.