Update instruction with infinity

#23
by michaelfeil - opened

Ready for review.

Command tested:

docker run --gpus all -v $PWD/data:/app/.cache -e HF_TOKEN=$HF_TOKEN -p "7997":"7997" michaelf34/infinity:0.0.68 v2 --model-id intfloat/multilingual-e5-large-instruct --revision "main" --dtype float16 --batch-size 32 --device cuda --engine torch --port 7997

INFO:     Started server process [1]
INFO:     Waiting for application startup.
INFO     2024-11-13 00:46:42,260 infinity_emb INFO:        infinity_server.py:89
         Creating 1engines:                                                     
         engines=['intfloat/multilingual-e5-large-instruct                      
         ']                                                                     
INFO     2024-11-13 00:46:42,264 infinity_emb INFO: Anonymized   telemetry.py:30
         telemetry can be disabled via environment variable                     
         `DO_NOT_TRACK=1`.                                                      
INFO     2024-11-13 00:46:42,272 infinity_emb INFO:           select_model.py:64
         model=`intfloat/multilingual-e5-large-instruct`                        
         selected, using engine=`torch` and device=`cuda`                       
INFO     2024-11-13 00:46:42,367                      SentenceTransformer.py:216
         sentence_transformers.SentenceTransformer                              
         INFO: Load pretrained SentenceTransformer:                             
         intfloat/multilingual-e5-large-instruct                                
INFO     2024-11-13 00:46:46,145 infinity_emb INFO: Adding    acceleration.py:56
         optimizations via Huggingface optimum.       
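
To check the container once startup finishes, a minimal client sketch follows. It assumes the server exposes an OpenAI-style `POST /embeddings` route on the published port 7997 and returns a `data` list whose items carry an `embedding` field. Neither assumption is confirmed by the log above, and `build_request` and `embed` are illustrative names, not part of Infinity.

```python
import json
import urllib.request

# Assumption: the port published by the docker command above, reachable on localhost.
BASE_URL = "http://localhost:7997"
MODEL_ID = "intfloat/multilingual-e5-large-instruct"


def build_request(texts, model=MODEL_ID, base_url=BASE_URL):
    """Build a POST request for the assumed OpenAI-style /embeddings route."""
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts):
    """Send the texts to the running server and return one vector per text."""
    with urllib.request.urlopen(build_request(texts), timeout=30) as resp:
        payload = json.load(resp)
    # Assumption: response shape is {"data": [{"embedding": [...]}, ...]}.
    return [item["embedding"] for item in payload["data"]]
```

With the container running, `embed(["some query text"])` should return a list with one vector. Building the request separately from sending it keeps the payload easy to inspect without a live server.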

@intfloat Can you review this?

