Tutorial: How to run infinity and nv-embed-2

#30

by michaelfeil - opened 25 days ago

Discussion

michaelfeil

25 days ago

•

edited 4 days ago

Usage for Infinity

Usage via Infinity (MIT License).
This requires a GPU with at least 24 GB of memory.

docker run -it --gpus all -v ./data:/app/.cache -p 7997:7997 michaelf34/infinity:0.0.70 \
v2 --model-id nvidia/NV-Embed-v2 --revision "refs/pr/23" --batch-size 8
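
Once the container is up, the server can be queried over HTTP. The following is a minimal client sketch, not an official example. It assumes that the server is reachable at localhost:7997 (matching the `-p 7997:7997` mapping above), that Infinity serves an OpenAI-style `/embeddings` route, and that the response lists vectors under `data[i].embedding`. The helper names `build_embedding_request` and `embed` are hypothetical and chosen only for illustration.

```python
import json
import urllib.request

# Assumption: the container started by the docker command is reachable here.
BASE_URL = "http://localhost:7997"
MODEL_ID = "nvidia/NV-Embed-v2"


def build_embedding_request(texts, base_url=BASE_URL, model=MODEL_ID):
    """Build a POST request for the (assumed) OpenAI-style /embeddings route."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, timeout=60.0):
    """Send the texts to the server and return one embedding per input text."""
    request = build_embedding_request(texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumed response shape: {"data": [{"embedding": [...]}, ...]}
    return [item["embedding"] for item in body["data"]]
```

Calling `embed(["What is NV-Embed-v2?"])` against a running server should return a list containing a single vector. Building the request separately from sending it makes the payload easy to inspect without a GPU or a live server.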

michaelfeil

25 days ago

https://github.com/michaelfeil/infinity/issues/470
https://github.com/michaelfeil/infinity/issues/498#issuecomment-2549521971

nada5

NVIDIA org 11 days ago

Hi @michaelfeil. Thank you for supporting the NV-Embed integration in Infinity. Your previous PR has been approved, and the suggested changes to the modeling code and configs have been merged. However, we decided not to include the Infinity instructions in the README: NV-Embed is a research-only model, and we cannot extend our support beyond Hugging Face and Sentence Transformers.

nada5 changed discussion status to closed 9 days ago

Excel2

9 days ago

michaelfeil

9 days ago

@nada5 I opened this discussion as documentation, and I acknowledge your decision. However, closing it makes it harder to steer users into commenting in a single thread. Could you please reopen it?

nada5 changed discussion status to open 4 days ago

michaelfeil

4 days ago

Thanks! :)

michaelfeil changed discussion title from How to run infinity and nv-embed-2 to Tutorial: How to run infinity and nv-embed-2 4 days ago
