Amazing inference speed

#77
by hexing1994 - opened

The inference speed is amazing for the Llama 2 70B model! Could you please share the specs of the server you are using? Are you running it on GPUs?

hexing1994 changed discussion status to closed
