Any Colab to reproduce the training?
Hi friend,
I was wondering if you have a Colab to reproduce your experiment of training BLOOM on MS MARCO.
It should be as easy as:
```bash
git clone https://github.com/Muennighoff/sgpt.git
pip install git+https://github.com/huggingface/accelerate
accelerate config
cd sgpt/biencoder/nli_msmarco
cd sentence-transformers; pip install -e .
# install the bundled GradCache, then go back to sentence-transformers/
cd sentence_transformers/losses/GradCache; pip install --editable .; cd ../../..
pip install wandb
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 accelerate launch examples/training/ms_marco/train_bi-encoder_mnrl.py --model_name bigscience/bloom-7b1 --train_batch_size 32 --eval_batch_size 16 --freezenonbias --specb --lr 4e-4 --wandb --wandbwatchlog gradients --pooling weightedmean --gradcache --chunksize 8
```
How many GPUs did you use for this training?
One more thing: I have tested your already fine-tuned BLOOM SGPT model for sentence-embedding asymmetric search, but it is not so good for the law domain. So I was thinking of making a new dataset like MS MARCO to get better results. What do you think about that? Thank you in advance.
The number of GPUs is defined in `accelerate config` & via `CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7` - here I'm using 8 (A100s with 80GB). You can use far fewer, but it will take longer. If you run out of memory, decrease `--chunksize`.
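To give an intuition for why `--chunksize` only affects memory: with `--gradcache`, the contrastive loss is still computed over the full batch, but the encoder only ever processes `--chunksize` examples at a time. Here is a simplified PyTorch sketch of the gradient-caching idea, not the actual GradCache code used by the script; `encoder`, `batch`, `loss_fn` and `optimizer` are placeholders:

```python
import torch

def gradcache_step(encoder, batch, chunk_size, loss_fn, optimizer):
    """One training step with gradient caching (simplified sketch)."""
    chunks = [batch[i:i + chunk_size] for i in range(0, len(batch), chunk_size)]

    # 1) Encode every chunk WITHOUT building a graph -> embeddings for the full batch.
    with torch.no_grad():
        reps = torch.cat([encoder(c) for c in chunks])
    reps.requires_grad_()  # leaf tensor; will receive d(loss)/d(embeddings)

    # 2) Contrastive loss over the FULL batch; cache the embedding gradients.
    loss = loss_fn(reps)
    loss.backward()  # only fills reps.grad, the encoder is untouched so far
    grad_chunks = reps.grad.split([len(c) for c in chunks])

    # 3) Re-encode chunk by chunk WITH gradients and inject the cached gradients.
    #    Peak activation memory now scales with chunk_size, not the batch size.
    for c, g in zip(chunks, grad_chunks):
        encoder(c).backward(gradient=g)

    optimizer.step()
    optimizer.zero_grad()
    return loss.detach()
```

So the result matches full-batch training; lowering the chunk size only trades speed for memory.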
Yeah, I think it could help, especially if you have negatives. I.e., for each sample you want both a passage that the embedding should be close to & one it should be far away from.
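For the data format: `train_bi-encoder_mnrl.py` trains with MultipleNegativesRankingLoss on (query, positive passage, hard negative) triples, so a law-domain dataset in that shape should slot in. A minimal sentence-transformers sketch just to illustrate the format (the model name and example strings are placeholders; the actual script takes care of BLOOM, the special brackets, weighted-mean pooling & GradCache):

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses

# Each training sample is a triple: (query, passage it should be close to, hard negative).
train_samples = [
    InputExample(texts=[
        "statute of limitations for breach of written contract",                   # query
        "Actions upon a written contract must be commenced within six years ...",  # relevant passage
        "A contract is an agreement creating obligations enforceable by law ...",  # hard negative: on-topic, but not an answer
    ]),
    # ... many more triples mined from your law corpus
]

model = SentenceTransformer("distilbert-base-uncased")  # placeholder model, just to show the API
train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=32)
train_loss = losses.MultipleNegativesRankingLoss(model)  # in-batch negatives + the provided hard negative
model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=100)
```

Hard negatives that look relevant but don't actually answer the query tend to help the most.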
Thank you so much friend!
How much did it cost to run the training? And how long did it take?
Thanks
How much it costs depends on your cloud provider; in my case, using those 8 A100s w/ 80GB, it took about 5 hours.