StarCycle/llava-dinov2-internlm2-7b-v1

StarCycle committed on Feb 21

Commit 0084f2f • 1 Parent(s): 4e6b97c

Update README.md

Files changed (1)

README.md +1 -1

@@ -168,7 +168,7 @@ git clone https://huggingface.co/datasets/liuhaotian/LLaVA-Pretrain --depth=1
 ```
 NPROC_PER_NODE=8 xtuner train ./llava_internlm2_chat_7b_dinov2_e1_gpu8_pretrain.py --deepspeed deepspeed_zero2
 ```
-#### Remember to change the batch size and gradient accumulation parameters to fit your hardware. So your GPU_num*batch_size*gradient_accumulation is roughly equal to mine to reproduce the result.
 The checkpoint and tensorboard logs are saved by default in ./work_dirs/. I only train it for 1 epoch to be same as the original LLaVA paper. Some researches also report that training for multiple epochs will make the model overfit the training dataset and perform worse in other domains.

 ```
 NPROC_PER_NODE=8 xtuner train ./llava_internlm2_chat_7b_dinov2_e1_gpu8_pretrain.py --deepspeed deepspeed_zero2
 ```
+#### Remember to change the batch size and gradient accumulation parameters to fit your hardware. So your GPU_num * batch_size * gradient_accumulation is roughly equal to mine to reproduce the result.
 The checkpoint and tensorboard logs are saved by default in ./work_dirs/. I only train it for 1 epoch to be same as the original LLaVA paper. Some researches also report that training for multiple epochs will make the model overfit the training dataset and perform worse in other domains.
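
The batch-size note in this hunk amounts to keeping the product GPU_num * batch_size * gradient_accumulation roughly constant across hardware. A minimal sketch of that arithmetic follows. The helper name `accumulation_steps` and every number in the usage line are hypothetical illustrations, not values taken from the training config; read the real reference values from the config file before relying on this.

```python
def accumulation_steps(ref_gpus, ref_batch, ref_accum, my_gpus, my_batch):
    """Pick a gradient accumulation count for different hardware.

    Hypothetical helper: it keeps GPU_num * batch_size * gradient_accumulation
    as close as possible to the reference setup's product.
    Returns (accumulation steps, resulting effective batch size).
    """
    # Effective batch size of the reference run.
    target = ref_gpus * ref_batch * ref_accum
    # Samples processed per forward/backward pass on the local hardware.
    per_step = my_gpus * my_batch
    # Nearest whole number of accumulation steps, never below 1.
    steps = max(1, round(target / per_step))
    return steps, steps * per_step


# Illustrative numbers only: a reference run of 8 GPUs x 32 samples x 1 step,
# reproduced on 2 GPUs that fit 16 samples each.
steps, effective = accumulation_steps(8, 32, 1, 2, 16)
print(steps, effective)
```

When the target is not evenly divisible by the local per-step batch, the effective batch size will only approximate the reference, which matches the "roughly equal" wording in the note.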