Doesn't work for me, gives me gibberish

#1 opened by photogbill40

[Attachments: Screenshot 2023-06-16 181916.jpg, image.png]

2023-06-16 18:37:01 INFO:Loading robin-7B-v2-GPTQ...
2023-06-16 18:37:01 WARNING:Auto-assiging --gpu-memory 7 for your GPU to try to prevent out-of-memory errors. You can manually set other values.
2023-06-16 18:37:01 INFO:The AutoGPTQ params are: {'model_basename': 'robin-7b-GPTQ-4bit-128g.no-act.order', 'device': 'cuda:0', 'use_triton': False, 'inject_fused_attention': True, 'inject_fused_mlp': True, 'use_safetensors': True, 'trust_remote_code': False, 'max_memory': {0: '7GiB', 'cpu': '99GiB'}, 'quantize_config': None}
2023-06-16 18:37:03 WARNING:The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
2023-06-16 18:37:03 WARNING:The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
2023-06-16 18:37:03 WARNING:The safetensors archive passed at models\robin-7B-v2-GPTQ\robin-7b-GPTQ-4bit-128g.no-act.order.safetensors does not contain metadata. Make sure to save your model with the save_pretrained method. Defaulting to 'pt' metadata.
2023-06-16 18:37:08 WARNING:skip module injection for FusedLlamaMLPForQuantizedModel not support integrate without triton yet.
2023-06-16 18:37:08 INFO:Loaded the model in 6.33 seconds.

2023-06-16 18:37:08 INFO:Loading the extension "gallery"...
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
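
For reference, the "AutoGPTQ params" line in that log corresponds to a load along these lines (a minimal sketch, assuming the auto_gptq package's standard `AutoGPTQForCausalLM.from_quantized` API; the local path is illustrative):

```python
from auto_gptq import AutoGPTQForCausalLM

# Mirrors the logged params: 4-bit 128g weights in a safetensors file,
# loaded onto the first GPU without the Triton kernels.
model = AutoGPTQForCausalLM.from_quantized(
    "models/robin-7B-v2-GPTQ",  # illustrative local model directory
    model_basename="robin-7b-GPTQ-4bit-128g.no-act.order",
    device="cuda:0",
    use_triton=False,
    inject_fused_attention=True,
    inject_fused_mlp=True,
    use_safetensors=True,
    trust_remote_code=False,
    max_memory={0: "7GiB", "cpu": "99GiB"},
    quantize_config=None,  # falls back to quantize_config.json in the model dir
)
```

The "skip module injection for FusedLlamaMLPForQuantizedModel" warning in the log is expected with this configuration: per the message itself, the fused MLP path isn't supported without Triton, which is disabled here.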

I was also getting this at first, but I fixed the output by making sure my Instruction template matched the Prompt template example provided in the README.
I also changed my Turn template: it was using \\n, but I swapped it to \n, which seemed to help. My template now looks like this:

user: '###Human:'
bot: '###Assistant:'
turn_template: "<|user|> <|user-message|>\n<|bot|> <|bot-message|></s>\n"
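
In case it's useful, here's roughly how those placeholders expand into the final prompt string (a toy sketch; the example messages are made up, and text-generation-webui performs this substitution internally):

```python
turn_template = "<|user|> <|user-message|>\n<|bot|> <|bot-message|></s>\n"

# Fill in the template's placeholders for a single turn.
prompt = (
    turn_template
    .replace("<|user|>", "###Human:")
    .replace("<|bot|>", "###Assistant:")
    .replace("<|user-message|>", "What is the capital of France?")
    .replace("<|bot-message|>", "The capital of France is Paris.")
)
print(prompt, end="")
# ###Human: What is the capital of France?
# ###Assistant: The capital of France is Paris.</s>
```

With the literal \\n, the model would see a backslash followed by an "n" in the prompt instead of a real newline, which is exactly the kind of formatting mismatch that can produce gibberish output.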
