RTX 5090 with 600GB of RAM: what models?
For an RTX 5090 with 600GB of RAM, aiming for over 20 concurrent threads and a throughput of more than 5 tokens per second, what models would you recommend running?
DeepSeek-R1-UD-IQ1_M? Or something better?
That's very good. I'd say even UD-Q2_K_XL can work.
Are you sure that's a single RTX 5090 😱? I assume the 600GB refers to system RAM, not VRAM. Even then, DDR5 RAM beyond 2TB is not practical, and the UD-Q2_K_XL setup didn’t work for me with 8x RTX 4090 GPUs 😅.
It ran into similar issues related to size and capacity, and the expected performance wasn’t achieved. It’s clear that even with powerful GPUs, such configurations have their limits.
@frank-mx If the majority of the model is in RAM and not VRAM, then CPU speed will dominate overall speed. In your case I would go with the smallest model whose quality you can accept.
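To put rough numbers on the RAM vs. VRAM point, here is a back-of-the-envelope sketch. It is not a benchmark. It swaps "CPU speed" for a simpler memory-bandwidth bound: each generated token has to stream the active weights once, so time per token is at least bytes read divided by bandwidth. Every number in it is an assumption to replace with your own: the GGUF file sizes (about 158GB and 212GB), the 32GB of VRAM with 4GB held back for KV cache and buffers, the 200 GB/s system RAM bandwidth, the 1792 GB/s VRAM bandwidth, and the 37B-of-671B active-parameter ratio for the MoE model. The function names are made up for this example.

```python
def offload_split(model_gb, vram_gb, ram_gb, vram_reserve_gb=4.0):
    """Split a model between VRAM and system RAM.

    Returns (gb_in_vram, gb_in_ram, fits). vram_reserve_gb is an assumed
    allowance for KV cache and runtime buffers, not a measured value.
    """
    usable_vram = max(vram_gb - vram_reserve_gb, 0.0)
    in_vram = min(model_gb, usable_vram)
    in_ram = model_gb - in_vram
    return in_vram, in_ram, in_ram <= ram_gb


def decode_tps_upper_bound(active_gb, ram_fraction, ram_bw_gbps, vram_bw_gbps):
    """Bandwidth-only upper bound on single-stream decode tokens/s.

    active_gb: weights read per token (for a MoE model, only the active experts).
    ram_fraction: share of those weights that live in system RAM.
    Ignores compute, PCIe transfers, and KV-cache reads, so real speed is lower.
    """
    seconds_per_token = (
        active_gb * ram_fraction / ram_bw_gbps
        + active_gb * (1.0 - ram_fraction) / vram_bw_gbps
    )
    return 1.0 / seconds_per_token


if __name__ == "__main__":
    # Assumed file sizes in GB; check the actual GGUF sizes you download.
    candidates = {"UD-IQ1_M": 158.0, "UD-Q2_K_XL": 212.0}
    active_ratio = 37.0 / 671.0  # assumed active / total parameters per token

    for name, size_gb in candidates.items():
        in_vram, in_ram, fits = offload_split(size_gb, vram_gb=32.0, ram_gb=600.0)
        tps = decode_tps_upper_bound(
            active_gb=size_gb * active_ratio,
            ram_fraction=in_ram / size_gb,
            ram_bw_gbps=200.0,    # assumed system RAM bandwidth
            vram_bw_gbps=1792.0,  # assumed RTX 5090 VRAM bandwidth
        )
        print(f"{name}: {in_vram:.0f}GB VRAM, {in_ram:.0f}GB RAM, "
              f"fits={fits}, <= {tps:.1f} tok/s (single stream, optimistic)")
```

Note that this bound is for one stream. With 20+ concurrent requests the same RAM bandwidth is shared, and batching only helps as far as requests hit the same experts, so per-request speed can end up well below the single-stream figure. That is why the smaller quant is the safer pick.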