More detailed "Usage" docs?
Hello,
Being a new user of self-hosted LLMs, I need a little more info on the Usage steps.
Assuming I have a VM or Databricks instance with GPUs, do I:
1. Clone the dolly repo to a directory on the VM/Databricks filesystem?
   - I don't see "the model" in your repo here (I'm probably missing something fundamental here!)
   - Do I need to download it to the VM/Databricks filesystem from somewhere else?
2. Create a virtual environment (`python -m venv .venv`)?
3. Run `pip install "accelerate>=0.12.0" "transformers[torch]==4.25.1"`?
4. For these instructions, does `model=` point to the path I downloaded the model to?
```python
import torch
from transformers import pipeline

generate_text = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16, trust_remote_code=True, device_map="auto")
```
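For reference, my understanding is that once this pipeline loads, it gets called roughly like so (a sketch; the prompt is just a placeholder):

```python
# Continues from the block above: call the pipeline with an instruction prompt.
res = generate_text("Explain to me the difference between nuclear fission and fusion.")

# The custom pipeline code returns a list of dicts holding the generated text.
print(res[0]["generated_text"])
```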
I think answers to these fundamental questions would help people start experimenting even faster.
Thank you!
(Silly rabbit...)
RE: #1: No need to clone the dolly repo. "The model" is handled by the (Hugging Face) `transformers` library (see #4 below).
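If you wanted the weights on local disk explicitly anyway, here's a minimal sketch using the `huggingface_hub` library (not required; `transformers` downloads and caches everything on first use):

```python
from huggingface_hub import snapshot_download

# Optional: pre-download all files for the model into the local
# Hugging Face cache; transformers will reuse this cache later.
local_path = snapshot_download("databricks/dolly-v2-12b")
print(local_path)  # the cached directory on disk
```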
RE: #2: I still don't know how Databricks handles venvs...
RE: #3: Yes, put this into the Databricks notebook.
RE: #4: The `model=` argument points to the model that is hosted on the Hugging Face Hub (I think).
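As far as I can tell, `model=` accepts either a Hub repo id or a local directory containing the same files; a sketch (the local path is hypothetical):

```python
import torch
from transformers import pipeline

# Hub repo id: weights are fetched from huggingface.co and cached locally.
generate_text = pipeline(
    model="databricks/dolly-v2-12b",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# A local directory containing the downloaded model files also works
# (hypothetical path):
# generate_text = pipeline(model="/dbfs/models/dolly-v2-12b", ...)
```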
If you are using a runtime like 12.2 LTS ML (includes Apache Spark 3.3.2, GPU, Scala 2.12) in Databricks, then you just need to `pip install` those two dependencies. Actually, I think `transformers` is likely already installed by default for that runtime, so you might just need to `pip install accelerate`.

You don't need to clone the repo or set up a virtual environment. The Databricks cluster already sets up a venv for you, with most packages you'd need already installed. So steps 1 and 2 you list are not necessary.

If you copy and paste the code from step 4 into a cell and run it, it should just work. The Hugging Face libraries will download everything you need, including the model. Hope that helps.
(`transformers` is already in the runtime, yes)
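So on that runtime, the whole quickstart is roughly two notebook cells; a sketch, assuming `transformers` is already present:

```python
# Cell 1: notebook-scoped install. The quotes keep the shell from
# treating >= as a redirect; accelerate may be the only missing piece.
%pip install "accelerate>=0.12.0"

# Cell 2: paste the pipeline code from step 4 and run it; the model
# weights are downloaded from the Hugging Face Hub automatically.
```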
At this point there are more elaborate usage instructions, with HF and LangChain, in the repo.