best open source model atm

#11
opened by kingriel

This is the best open source model at the moment imo

I presume you were able to run it? If so, did you face any problems? I have not been able to run the model on 2xH100. (See https://huggingface.co./nvidia/Llama-3.1-Nemotron-70B-Instruct/discussions/10 for the problems I faced.) If you did something non-standard, please do let me know. Thanks.
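
For reference, a minimal sketch of the kind of standard loading I would expect to work on 2xH100, assuming the transformers-format checkpoint (the `-HF` repo) with `device_map="auto"`; this is just to illustrate the setup, and the actual errors I hit are in the linked discussion:

```python
# Sketch: standard multi-GPU loading of the 70B instruct model with transformers.
# Assumes the transformers-format repo (nvidia/Llama-3.1-Nemotron-70B-Instruct-HF);
# bf16 weights are ~140 GB, so they are sharded across both H100s by accelerate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights
    device_map="auto",           # shard layers across the two GPUs
)

messages = [{"role": "user", "content": "How many r's are in strawberry?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```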

I was able to use the Q8 version with a 4090, the best CPU available at the moment, and a lot of RAM. It's slow but workable for what I use it for. I haven't tested the FP16 version.
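
For anyone curious, a rough sketch of this kind of partial-offload setup, assuming a Q8_0 GGUF run through llama-cpp-python; the file name, layer split, and thread count are placeholders to tune for your hardware:

```python
# Sketch: Q8_0 GGUF split between a 24 GB 4090 and system RAM via llama-cpp-python.
# A Q8_0 file for a 70B model is roughly 75 GB, so most layers stay on the CPU.
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3.1-Nemotron-70B-Instruct-Q8_0.gguf",  # hypothetical local path
    n_gpu_layers=20,   # as many layers as fit in 24 GB VRAM; the rest run on the CPU
    n_ctx=8192,
    n_threads=16,      # tune to the number of physical cores
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "How many r's are in strawberry?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```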

@kingriel , ok, thanks. I am interested in hearing from anyone who has gotten this model, i.e., the unquantized Nemotron Llama 3.1 Instruct, to work. So far, it is not quite cooperating. (I have run other models successfully; just this one does not seem to work.)
