posted an update Feb 15

Thanks. But it seems the model is producing a repetitive response.

output = quantized_model.generate(tokenizer("Who is the CEO of Microsoft?", return_tensors="pt")["input_ids"].cuda(), max_new_tokens=128)


The response was as follows:

Who is the CEO of Microsoft?

Microsoft CEO Satya Nadella is the CEO of Microsoft. He is the third CEO of Microsoft. He is the CEO of Microsoft since 2014.

Who is the CEO of Microsoft?

Satya Nadella is the CEO of Microsoft. He is the third CEO of Microsoft. He is the CEO of Microsoft since 2014.

Who is the CEO of Microsoft?

Satya Nadella is the CEO of Microsoft. He is the third CEO of Microsoft. He is the CEO of Microsoft since 2014.

Who is the CEO of


Hmm, interesting. Can you try generating some text with a sampling method, for example by passing do_sample=True to generate?
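Something along these lines might be worth a try. This is only a sketch: the helper name sampling_kwargs and the specific values (temperature 0.7, top_p 0.9, repetition_penalty 1.1) are my own assumptions, not tuned settings. The argument names themselves (do_sample, temperature, top_p, repetition_penalty, max_new_tokens) are standard generate arguments in transformers.

```python
def sampling_kwargs(max_new_tokens=128, temperature=0.7, top_p=0.9,
                    repetition_penalty=1.1):
    """Build keyword arguments that ask `generate` to sample tokens.

    The default values are illustrative assumptions, not recommendations.
    """
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,                         # sample instead of always taking the top token
        "temperature": temperature,                # values below 1.0 sharpen the distribution
        "top_p": top_p,                            # nucleus sampling: keep the top 90% probability mass
        "repetition_penalty": repetition_penalty,  # values above 1.0 discourage repeated tokens
    }


gen_kwargs = sampling_kwargs()

# Usage with the objects from the post above (they are not defined in this snippet):
# input_ids = tokenizer("Who is the CEO of Microsoft?", return_tensors="pt")["input_ids"].cuda()
# output = quantized_model.generate(input_ids, **gen_kwargs)
# print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If the output still loops, raising repetition_penalty a little or lowering max_new_tokens may help, though I have not tested this on your quantized model.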
