Update instruction with infinity
#23
opened by michaelfeil
Ready for review.
Command tested:
docker run --gpus all -v $PWD/data:/app/.cache -e HF_TOKEN=$HF_TOKEN -p "7997":"7997" michaelf34/infinity:0.0.68 v2 --model-id intfloat/multilingual-e5-large-instruct --revision "main" --dtype float16 --batch-size 32 --device cuda --engine torch --port 7997
INFO: Started server process [1]
INFO: Waiting for application startup.
INFO 2024-11-13 00:46:42,260 infinity_emb INFO: Creating 1engines: engines=['intfloat/multilingual-e5-large-instruct']  infinity_server.py:89
INFO 2024-11-13 00:46:42,264 infinity_emb INFO: Anonymized telemetry can be disabled via environment variable `DO_NOT_TRACK=1`.  telemetry.py:30
INFO 2024-11-13 00:46:42,272 infinity_emb INFO: model=`intfloat/multilingual-e5-large-instruct` selected, using engine=`torch` and device=`cuda`  select_model.py:64
INFO 2024-11-13 00:46:42,367 sentence_transformers.SentenceTransformer INFO: Load pretrained SentenceTransformer: intfloat/multilingual-e5-large-instruct  SentenceTransformer.py:216
INFO 2024-11-13 00:46:46,145 infinity_emb INFO: Adding optimizations via Huggingface optimum.  acceleration.py:56
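
For anyone reviewing, a quick way to check the running container from Python might look like the sketch below. It is not part of the tested command. It assumes the server exposes Infinity's OpenAI-compatible `/embeddings` route at `http://localhost:7997` (the port mapped above) and that the response carries a `data` list with an `embedding` field per item. The helper names (`build_instruct_query`, `build_embeddings_request`, `embed`) are made up for illustration, and the `Instruct: ...\nQuery: ...` prefix follows the format described on the model card.

```python
import json
import urllib.request

# Assumption: matches the -p "7997":"7997" mapping in the docker command above.
BASE_URL = "http://localhost:7997"
MODEL_ID = "intfloat/multilingual-e5-large-instruct"


def build_instruct_query(task: str, query: str) -> str:
    """Prefix a query with a task instruction, as the E5-instruct model card describes."""
    return f"Instruct: {task}\nQuery: {query}"


def build_embeddings_request(texts, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a JSON POST request for the assumed /embeddings route."""
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts):
    """Send the request and return one vector per input; needs the container running."""
    with urllib.request.urlopen(build_embeddings_request(texts), timeout=30) as resp:
        payload = json.load(resp)
    # Assumption: OpenAI-style response shape, {"data": [{"embedding": [...]}, ...]}.
    return [item["embedding"] for item in payload["data"]]


if __name__ == "__main__":
    task = "Given a web search query, retrieve relevant passages that answer the query"
    vectors = embed([build_instruct_query(task, "how much protein should a female eat")])
    print(len(vectors), len(vectors[0]))
```

Only queries need the instruction prefix. Documents can be sent as plain text.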