Failure loading the model on AWS
Hello,
I rented an AWS p3.2xlarge machine with Ubuntu 18.04 and installed transformers.
Both loading the model directly and using the pipeline get killed after reaching 26% of loading the checkpoint shards.
Can you please share the instance requirements on AWS for running this model? That would be very helpful.
Pointing to a specific instance type and AMI would be even more helpful. Currently I'm using the "Deep Learning AMI (Ubuntu 18.04) Version 56.1" AMI.
Here is the failure output:
```
from transformers import AutoTokenizer, AutoModelForCausalLM
In [3]: tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")
...: model = AutoModelForCausalLM.from_pretrained("mistralai/Mixtral-8x7B-Instruct-v0.1")
Loading checkpoint shards:  26%|██████████████████████                | 5/19 [02:21<06:40, 28.60s/it]Killed
(pytorch_p38) ubuntu@ip-172-23-1-218:~$ ipython
Python 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:59:51)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.31.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: # Use a pipeline as a high-level helper
...: from transformers import pipeline
...:
...: pipe = pipeline("text-generation", model="mistralai/Mixtral-8x7B-Instruct-v0.1")
Loading checkpoint shards:  26%|██████████████████████                | 5/19 [02:11<06:00, 25.75s/it]Killed
```
The process is most likely being killed by the Linux OOM killer because the model doesn't fit in memory. Have you also tried loading it in half precision by adding `torch_dtype=torch.float16` to your pipeline? Something like this: `pipe = pipeline("text-generation", model="mistralai/Mixtral-8x7B-Instruct-v0.1", torch_dtype=torch.float16)`
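For completeness, here is a minimal sketch of that suggestion as a full script. The `device_map="auto"` argument is my addition (it requires `accelerate`); it places as many shards as possible on the GPU and offloads the rest to CPU RAM:

```python
import torch
from transformers import pipeline

# Half precision halves the memory footprint versus the default float32 load.
# device_map="auto" (needs `pip install accelerate`) spreads the weights across
# the available GPU(s) and offloads whatever does not fit to CPU RAM.
pipe = pipeline(
    "text-generation",
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    torch_dtype=torch.float16,
    device_map="auto",
)
print(pipe("Hello, my name is", max_new_tokens=20)[0]["generated_text"])
```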
Have you considered deploying on other cloud platforms? I am using Runpod and it's working great. I have put together a guide here if you are interested: https://github.com/aigeek0x0/radiantloom-ai/blob/main/mixtral-8x7b-instruct-v-0.1-runpod-template.md
@aigeek0x0 have you performed fine-tuning using a single A100 80GB Runpod?
@bweinstein123 Yes, I have. You can fine-tune this model with 4-bit quantization on an A100. Even an RTX A6000 would suffice if you use a smaller batch size.
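In case it helps others, here is a minimal sketch of the kind of 4-bit QLoRA setup I mean (bitsandbytes + PEFT). The LoRA hyperparameters and target modules are illustrative, not a tested recipe:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization keeps the frozen base weights small enough
# to fit (with activations and optimizer state) on a single A100 80GB.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Train only small LoRA adapters on top of the frozen quantized base.
# r, alpha, and target_modules here are example values, not a recommendation.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```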
@aigeek0x0 how was the performance of Mixtral Instruct after fine-tuning? Any insights I can borrow? Thanks.
Hey, same issue here. I am trying to run it on a g5.8xlarge machine, and it gets killed right at the 26% checkpoint mark. Did you come across any solution?