Problems with temperature when using the model in Python code
#6 · opened by matchaslime
Hi, I am following the instructions for using the model in Python code, but the model always produces the same response to the same prompt. Changing the temperature does not seem to do anything. What could be the issue here?
Could you provide some code?
I'm just using the example code from the AutoGPTQ section of the README.
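For reference, it looks roughly like this (the model path and prompt are placeholders, not the exact README values):

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = "TheBloke/Some-Model-GPTQ"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(model_name_or_path, device="cuda:0")

prompt = "Tell me about AI"  # placeholder
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")

# Changing temperature here doesn't change the output for me
output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
print(tokenizer.decode(output[0]))
```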
When I've tested inference before, I've used code like this to change the seed:
```python
import random

import torch

class Generator:
    # Hypothetical wrapper class: encode(), do_timing(), model,
    # tokenizer, and generation_config are defined elsewhere in it.

    @property
    def seed(self):
        return self._current_seed

    @seed.setter
    def seed(self, seed):
        self._seed = int(seed)

    def update_seed(self):
        # A seed of -1 means: pick a fresh random seed for this generation
        self._current_seed = random.randint(1, 2**31) if self._seed == -1 else self._seed
        random.seed(self._current_seed)
        torch.manual_seed(self._current_seed)
        torch.cuda.manual_seed_all(self._current_seed)

    def generate(self, prompt):
        self.update_seed()
        input_ids, len_input_ids = self.encode(prompt)
        with self.do_timing(True) as timing:
            with torch.no_grad():
                tokens = self.model.generate(
                    inputs=input_ids,
                    generation_config=self.generation_config,
                )[0].cuda()
        len_reply = len(tokens) - len_input_ids
        response = self.tokenizer.decode(tokens)
        reply_tokens = tokens[-len_reply:]
        reply = self.tokenizer.decode(reply_tokens)
        result = {
            'response': response,    # The response in full, including prompt
            'reply': reply,          # Just the reply, no prompt
            'len_reply': len_reply,  # The length of the reply tokens
            'seed': self.seed,       # The seed used to generate this response
            'time': timing['time'],  # The time in seconds to generate the response
        }
        return result
```
You could try the same approach to get a different seed for each generation.
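A minimal sketch of how that could be used, assuming the methods above live on a wrapper class (called Generator here) that also provides encode(), do_timing(), model, tokenizer, and generation_config:

```python
gen = Generator(...)  # hypothetical wrapper class from the snippet above
gen.seed = -1         # -1 = pick a fresh random seed on every call

for _ in range(3):
    result = gen.generate("Tell me about AI")
    print(result['seed'], result['reply'])  # different seed (and reply) each time
```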