Tutorial: How to run infinity and nv-embed-2
#30
by
michaelfeil
- opened
Usage for Infinity
You can serve the model via Infinity, which is released under the MIT License.
This requires a GPU with at least 24 GB of memory.
docker run -it --gpus all -v ./data:/app/.cache -p 7997:7997 michaelf34/infinity:0.0.70 \
v2 --model-id nvidia/NV-Embed-v2 --revision "refs/pr/23" --batch-size 8
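Once the container is up, the server can be queried over HTTP. Below is a minimal Python sketch of a client, not an official one. It assumes that Infinity serves an OpenAI-style `POST /embeddings` route on the port mapped above (7997) and that the response carries a `data` list of items with an `embedding` field; check the interactive API docs of your running server if your version differs. The helper names `build_request`, `parse_embeddings`, and `embed` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

# Assumptions (not confirmed by this thread): the server listens on the port
# mapped in the docker command and exposes an OpenAI-style /embeddings route.
BASE_URL = "http://localhost:7997"
MODEL_ID = "nvidia/NV-Embed-v2"


def build_request(texts, base_url=BASE_URL, model=MODEL_ID):
    """Build a POST request carrying the texts to embed as a JSON body."""
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(payload):
    """Extract the embedding vectors from a decoded response.

    Items are ordered by their "index" field when present; otherwise the
    server's order is kept (the sort is stable).
    """
    items = sorted(payload["data"], key=lambda item: item.get("index", 0))
    return [item["embedding"] for item in items]


def embed(texts, base_url=BASE_URL, model=MODEL_ID, timeout=60.0):
    """Send the texts to a running server and return one vector per text."""
    request = build_request(texts, base_url, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_embeddings(json.load(response))
```

With the container running, `embed(["What is NV-Embed?", "Infinity serves embeddings."])` should return one list of floats per input text. Keeping request construction and response parsing in separate functions lets you check both without a GPU or a live server.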
Hi @michaelfeil, thank you for supporting the NV-Embed integration in Infinity. Your previous PR has been approved, and the suggested changes to the modeling code and configs have been merged. However, we decided not to include the Infinity instructions in the README: NV-Embed is a research-only model, and we cannot extend our support beyond Hugging Face and Sentence Transformers.
nada5
changed discussion status to
closed
Hi @nada5, I opened this discussion as a piece of documentation. I acknowledge your decision! Closing the discussion will not help steer users toward commenting in a single thread, though. Could you please reopen it?
nada5
changed discussion status to
open
Thanks! :)
michaelfeil
changed discussion title from
How to run infinity and nv-embed-2
to Tutorial: How to run infinity and nv-embed-2