05/07/2023 04:42:05 WARNING Found cached dataset parquet (/home/pszemraj/.cache/huggingface/datasets/OpenAssistant___parquet/OpenAssistant--oasst1-2960c57d7e52ab15/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)
05/07/2023 04:42:06 INFO Quantized model will be saved to: /home/pszemraj/workspace/misc-train/quantization/quantized-models/stablelm-7b-sft-v7-epoch-3-4bit-128g
05/07/2023 04:42:14 INFO Running quantization..
05/07/2023 04:42:16 INFO Start quantizing layer 1/16
05/07/2023 04:42:49 INFO Quantizing attention.query_key_value in layer 1/16...
05/07/2023 04:42:50 INFO duration: 1.0365328788757324
05/07/2023 04:42:50 INFO avg loss: 0.2228083991395018
05/07/2023 04:43:23 INFO Quantizing attention.dense in layer 1/16...
05/07/2023 04:43:24 INFO duration: 0.7084124088287354
05/07/2023 04:43:24 INFO avg loss: 0.01904001936744958
05/07/2023 04:43:57 INFO Quantizing mlp.dense_h_to_4h in layer 1/16...
05/07/2023 04:43:58 INFO duration: 1.0652313232421875
05/07/2023 04:43:58 INFO avg loss: 0.3040119207705
05/07/2023 04:47:44 INFO Quantizing mlp.dense_4h_to_h in layer 1/16...
05/07/2023 04:47:51 INFO duration: 6.762867212295532
05/07/2023 04:47:51 INFO avg loss: 0.028748639221516405
05/07/2023 04:48:12 INFO Start quantizing layer 2/16
05/07/2023 04:48:45 INFO Quantizing attention.query_key_value in layer 2/16...
05/07/2023 04:48:46 INFO duration: 0.9713742733001709
05/07/2023 04:48:46 INFO avg loss: 0.35355199259310105
05/07/2023 04:49:19 INFO Quantizing attention.dense in layer 2/16...
05/07/2023 04:49:20 INFO duration: 0.7275807857513428
05/07/2023 04:49:20 INFO avg loss: 0.06647738861961487
05/07/2023 04:49:53 INFO Quantizing mlp.dense_h_to_4h in layer 2/16...
05/07/2023 04:49:54 INFO duration: 1.083951711654663
05/07/2023 04:49:54 INFO avg loss: 0.6772610437882721
05/07/2023 04:53:40 INFO Quantizing mlp.dense_4h_to_h in layer 2/16...
05/07/2023 04:53:47 INFO duration: 6.844736814498901
05/07/2023 04:53:47 INFO avg loss: 0.05320497620473908
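The cached-dataset line at the top shows the OpenAssistant/oasst1 parquet set being reused, presumably as the GPTQ calibration corpus. A minimal sketch of how such calibration examples are typically prepared for AutoGPTQ — the tokenizer checkpoint, text column, sample count, and sequence length are assumptions, not values recorded in this log:

```python
# Hypothetical calibration prep; only the dataset name comes from the log.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("OpenAssistant/stablelm-7b-sft-v7-epoch-3")
ds = load_dataset("OpenAssistant/oasst1", split="train")

# AutoGPTQ's quantize() expects a list of dicts with input_ids/attention_mask.
examples = [
    tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)
    for text in ds["text"][:128]  # assumed calibration-set size
]
```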
05/07/2023 04:54:08 INFO Start quantizing layer 3/16
05/07/2023 04:54:41 INFO Quantizing attention.query_key_value in layer 3/16...
05/07/2023 04:54:42 INFO duration: 0.9685044288635254
05/07/2023 04:54:42 INFO avg loss: 0.6015139448756989
05/07/2023 04:55:15 INFO Quantizing attention.dense in layer 3/16...
05/07/2023 04:55:16 INFO duration: 0.7167198657989502
05/07/2023 04:55:16 INFO avg loss: 0.06039099241344058
05/07/2023 04:55:49 INFO Quantizing mlp.dense_h_to_4h in layer 3/16...
05/07/2023 04:55:50 INFO duration: 1.0765190124511719
05/07/2023 04:55:50 INFO avg loss: 1.3903707193490416
05/07/2023 04:59:37 INFO Quantizing mlp.dense_4h_to_h in layer 3/16...
05/07/2023 04:59:43 INFO duration: 6.270395040512085
05/07/2023 04:59:43 INFO avg loss: 0.181059166011465
05/07/2023 05:00:04 INFO Start quantizing layer 4/16
05/07/2023 05:00:37 INFO Quantizing attention.query_key_value in layer 4/16...
05/07/2023 05:00:38 INFO duration: 0.9672496318817139
05/07/2023 05:00:38 INFO avg loss: 0.9807066506090255
05/07/2023 05:01:11 INFO Quantizing attention.dense in layer 4/16...
05/07/2023 05:01:12 INFO duration: 0.7248861789703369
05/07/2023 05:01:12 INFO avg loss: 0.1315788618418863
05/07/2023 05:01:45 INFO Quantizing mlp.dense_h_to_4h in layer 4/16...
05/07/2023 05:01:46 INFO duration: 1.083066463470459
05/07/2023 05:01:46 INFO avg loss: 2.080002984807641
05/07/2023 05:05:32 INFO Quantizing mlp.dense_4h_to_h in layer 4/16...
05/07/2023 05:05:38 INFO duration: 6.18793797492981
05/07/2023 05:05:38 INFO avg loss: 0.252437506240016
05/07/2023 05:05:59 INFO Start quantizing layer 5/16
05/07/2023 05:06:32 INFO Quantizing attention.query_key_value in layer 5/16...
05/07/2023 05:06:33 INFO duration: 0.9693779945373535
05/07/2023 05:06:33 INFO avg loss: 1.3782398682940629
05/07/2023 05:07:06 INFO Quantizing attention.dense in layer 5/16...
05/07/2023 05:07:07 INFO duration: 0.7210879325866699
05/07/2023 05:07:07 INFO avg loss: 0.14899523392779884
05/07/2023 05:07:40 INFO Quantizing mlp.dense_h_to_4h in layer 5/16...
05/07/2023 05:07:41 INFO duration: 1.0800914764404297
05/07/2023 05:07:41 INFO avg loss: 2.332041130025293
05/07/2023 05:11:27 INFO Quantizing mlp.dense_4h_to_h in layer 5/16...
05/07/2023 05:11:33 INFO duration: 6.191901206970215
05/07/2023 05:11:33 INFO avg loss: 0.3255492384060503
05/07/2023 05:11:54 INFO Start quantizing layer 6/16
05/07/2023 05:12:27 INFO Quantizing attention.query_key_value in layer 6/16...
05/07/2023 05:12:28 INFO duration: 0.9662725925445557
05/07/2023 05:12:28 INFO avg loss: 1.757845780085197
05/07/2023 05:13:01 INFO Quantizing attention.dense in layer 6/16...
05/07/2023 05:13:02 INFO duration: 0.7185342311859131
05/07/2023 05:13:02 INFO avg loss: 0.15947506450616514
05/07/2023 05:13:35 INFO Quantizing mlp.dense_h_to_4h in layer 6/16...
05/07/2023 05:13:36 INFO duration: 1.075429916381836
05/07/2023 05:13:36 INFO avg loss: 2.4491654498635516
05/07/2023 05:17:18 INFO Quantizing mlp.dense_4h_to_h in layer 6/16...
05/07/2023 05:17:24 INFO duration: 5.919256925582886
05/07/2023 05:17:24 INFO avg loss: 0.40534172017480363
05/07/2023 05:17:45 INFO Start quantizing layer 7/16
05/07/2023 05:18:18 INFO Quantizing attention.query_key_value in layer 7/16...
05/07/2023 05:18:19 INFO duration: 0.9676733016967773
05/07/2023 05:18:19 INFO avg loss: 2.131913417698349
05/07/2023 05:18:52 INFO Quantizing attention.dense in layer 7/16...
05/07/2023 05:18:53 INFO duration: 0.7196581363677979
05/07/2023 05:18:53 INFO avg loss: 0.20212076367915502
05/07/2023 05:19:26 INFO Quantizing mlp.dense_h_to_4h in layer 7/16...
05/07/2023 05:19:27 INFO duration: 1.0817346572875977
05/07/2023 05:19:27 INFO avg loss: 2.4321377462726304
05/07/2023 05:23:08 INFO Quantizing mlp.dense_4h_to_h in layer 7/16...
05/07/2023 05:23:14 INFO duration: 5.973307132720947
05/07/2023 05:23:14 INFO avg loss: 0.4796293378511049
05/07/2023 05:23:35 INFO Start quantizing layer 8/16
05/07/2023 05:24:08 INFO Quantizing attention.query_key_value in layer 8/16...
05/07/2023 05:24:09 INFO duration: 0.9668700695037842
05/07/2023 05:24:09 INFO avg loss: 2.3333008332501333
05/07/2023 05:24:42 INFO Quantizing attention.dense in layer 8/16...
05/07/2023 05:24:43 INFO duration: 0.7205338478088379
05/07/2023 05:24:43 INFO avg loss: 0.2906766491322218
05/07/2023 05:25:16 INFO Quantizing mlp.dense_h_to_4h in layer 8/16...
05/07/2023 05:25:17 INFO duration: 1.075392246246338
05/07/2023 05:25:17 INFO avg loss: 2.088160245690229
05/07/2023 05:28:59 INFO Quantizing mlp.dense_4h_to_h in layer 8/16...
05/07/2023 05:29:05 INFO duration: 6.0966198444366455
05/07/2023 05:29:05 INFO avg loss: 0.4126856014751398
05/07/2023 05:29:26 INFO Start quantizing layer 9/16
05/07/2023 05:29:59 INFO Quantizing attention.query_key_value in layer 9/16...
05/07/2023 05:30:00 INFO duration: 0.971062183380127
05/07/2023 05:30:00 INFO avg loss: 4.631909777689031
05/07/2023 05:30:33 INFO Quantizing attention.dense in layer 9/16...
05/07/2023 05:30:34 INFO duration: 0.7198226451873779
05/07/2023 05:30:34 INFO avg loss: 0.2723473172091321
05/07/2023 05:31:07 INFO Quantizing mlp.dense_h_to_4h in layer 9/16...
05/07/2023 05:31:08 INFO duration: 1.0791394710540771
05/07/2023 05:31:08 INFO avg loss: 2.0461749482078675
05/07/2023 05:34:49 INFO Quantizing mlp.dense_4h_to_h in layer 9/16...
05/07/2023 05:34:55 INFO duration: 5.983144044876099
05/07/2023 05:34:55 INFO avg loss: 0.5113805541342186
05/07/2023 05:35:16 INFO Start quantizing layer 10/16
05/07/2023 05:35:49 INFO Quantizing attention.query_key_value in layer 10/16...
05/07/2023 05:35:50 INFO duration: 0.9664998054504395
05/07/2023 05:35:50 INFO avg loss: 7.197037864416933
05/07/2023 05:36:23 INFO Quantizing attention.dense in layer 10/16...
05/07/2023 05:36:24 INFO duration: 0.7181813716888428
05/07/2023 05:36:24 INFO avg loss: 0.3427228673705405
05/07/2023 05:36:57 INFO Quantizing mlp.dense_h_to_4h in layer 10/16...
05/07/2023 05:36:58 INFO duration: 1.0781819820404053
05/07/2023 05:36:58 INFO avg loss: 2.320328880041933
05/07/2023 05:40:40 INFO Quantizing mlp.dense_4h_to_h in layer 10/16...
05/07/2023 05:40:46 INFO duration: 6.027331829071045
05/07/2023 05:40:46 INFO avg loss: 0.6135274056301584
05/07/2023 05:41:07 INFO Start quantizing layer 11/16
05/07/2023 05:41:40 INFO Quantizing attention.query_key_value in layer 11/16...
05/07/2023 05:41:41 INFO duration: 0.9669804573059082
05/07/2023 05:41:41 INFO avg loss: 7.502283845846645
05/07/2023 05:42:14 INFO Quantizing attention.dense in layer 11/16...
05/07/2023 05:42:14 INFO duration: 0.7167062759399414
05/07/2023 05:42:14 INFO avg loss: 0.2933824760591387
05/07/2023 05:42:47 INFO Quantizing mlp.dense_h_to_4h in layer 11/16...
05/07/2023 05:42:48 INFO duration: 1.077958345413208
05/07/2023 05:42:48 INFO avg loss: 2.6354988268769968
05/07/2023 05:46:30 INFO Quantizing mlp.dense_4h_to_h in layer 11/16...
05/07/2023 05:46:36 INFO duration: 5.968295335769653
05/07/2023 05:46:36 INFO avg loss: 0.7737983809238551
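Note how the per-module avg loss drifts upward with depth (attention.query_key_value climbs from ~0.22 in layer 1 to ~7.5 by layer 11) — a common pattern in sequential GPTQ runs, where each layer is calibrated on activations already produced by quantized predecessors. The duration/avg-loss lines are regular enough to chart that trend; a small parsing sketch (the log filename is a placeholder):

```python
import re
from collections import defaultdict

# Matches the "Quantizing <module> in layer <n>/16..." / "avg loss: <x>" pairs above.
module_re = re.compile(r"Quantizing (\S+) in layer (\d+)/16")
loss_re = re.compile(r"avg loss: ([\d.]+)")

losses = defaultdict(dict)
current = None
with open("quantize.log") as f:  # placeholder filename
    for line in f:
        if (m := module_re.search(line)):
            current = (int(m.group(2)), m.group(1))
        elif current and (m := loss_re.search(line)):
            losses[current[0]][current[1]] = float(m.group(1))
            current = None

for layer in sorted(losses):
    print(layer, losses[layer])
```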
05/07/2023 05:46:57 INFO Start quantizing layer 12/16
05/07/2023 05:47:30 INFO Quantizing attention.query_key_value in layer 12/16...
05/07/2023 05:47:31 INFO duration: 0.9708924293518066
05/07/2023 05:47:31 INFO avg loss: 6.875169520433972
05/07/2023 05:48:04 INFO Quantizing attention.dense in layer 12/16...
05/07/2023 05:48:05 INFO duration: 0.7233545780181885
05/07/2023 05:48:05 INFO avg loss: 0.36776245897189497
05/07/2023 05:48:38 INFO Quantizing mlp.dense_h_to_4h in layer 12/16...
05/07/2023 05:48:39 INFO duration: 1.078718900680542
05/07/2023 05:48:39 INFO avg loss: 2.9615547415801386
05/07/2023 05:52:21 INFO Quantizing mlp.dense_4h_to_h in layer 12/16...
05/07/2023 05:52:27 INFO duration: 6.078177452087402
05/07/2023 05:52:27 INFO avg loss: 0.9158687896241015
05/07/2023 05:52:48 INFO Start quantizing layer 13/16
05/07/2023 05:53:21 INFO Quantizing attention.query_key_value in layer 13/16...
05/07/2023 05:53:22 INFO duration: 0.9698812961578369
05/07/2023 05:53:22 INFO avg loss: 5.93688639842918
05/07/2023 05:53:54 INFO Quantizing attention.dense in layer 13/16...
05/07/2023 05:53:55 INFO duration: 0.7205860614776611
05/07/2023 05:53:55 INFO avg loss: 0.24467934637912672
05/07/2023 05:54:28 INFO Quantizing mlp.dense_h_to_4h in layer 13/16...
05/07/2023 05:54:29 INFO duration: 1.0801022052764893
05/07/2023 05:54:29 INFO avg loss: 3.275802466054313
05/07/2023 05:58:11 INFO Quantizing mlp.dense_4h_to_h in layer 13/16...
05/07/2023 05:58:17 INFO duration: 6.09338641166687
05/07/2023 05:58:17 INFO avg loss: 1.0767965265991082
05/07/2023 05:58:38 INFO Start quantizing layer 14/16
05/07/2023 05:59:11 INFO Quantizing attention.query_key_value in layer 14/16...
05/07/2023 05:59:12 INFO duration: 0.9676227569580078
05/07/2023 05:59:12 INFO avg loss: 6.686944638578275
05/07/2023 05:59:45 INFO Quantizing attention.dense in layer 14/16...
05/07/2023 05:59:46 INFO duration: 0.7196416854858398
05/07/2023 05:59:46 INFO avg loss: 0.34242789661541534
05/07/2023 06:00:19 INFO Quantizing mlp.dense_h_to_4h in layer 14/16...
05/07/2023 06:00:20 INFO duration: 1.0829389095306396
05/07/2023 06:00:20 INFO avg loss: 3.705307965588392
05/07/2023 06:04:02 INFO Quantizing mlp.dense_4h_to_h in layer 14/16...
05/07/2023 06:04:08 INFO duration: 6.013010263442993
05/07/2023 06:04:08 INFO avg loss: 1.1975950458433173
05/07/2023 06:04:29 INFO Start quantizing layer 15/16
05/07/2023 06:05:02 INFO Quantizing attention.query_key_value in layer 15/16...
05/07/2023 06:05:03 INFO duration: 0.9704198837280273
05/07/2023 06:05:03 INFO avg loss: 7.567932973908413
05/07/2023 06:05:36 INFO Quantizing attention.dense in layer 15/16...
05/07/2023 06:05:37 INFO duration: 0.7222294807434082
05/07/2023 06:05:37 INFO avg loss: 0.4468821890184039
05/07/2023 06:06:10 INFO Quantizing mlp.dense_h_to_4h in layer 15/16...
05/07/2023 06:06:11 INFO duration: 1.0775363445281982
05/07/2023 06:06:11 INFO avg loss: 4.276716368393903
05/07/2023 06:09:52 INFO Quantizing mlp.dense_4h_to_h in layer 15/16...
05/07/2023 06:09:58 INFO duration: 6.097189664840698
05/07/2023 06:09:58 INFO avg loss: 1.6799194205937167
05/07/2023 06:10:19 INFO Start quantizing layer 16/16
05/07/2023 06:10:52 INFO Quantizing attention.query_key_value in layer 16/16...
05/07/2023 06:10:53 INFO duration: 0.9705617427825928
05/07/2023 06:10:53 INFO avg loss: 7.100380016972843
05/07/2023 06:11:26 INFO Quantizing attention.dense in layer 16/16...
05/07/2023 06:11:27 INFO duration: 0.722510814666748
05/07/2023 06:11:27 INFO avg loss: 0.24434113426330373
05/07/2023 06:12:00 INFO Quantizing mlp.dense_h_to_4h in layer 16/16...
05/07/2023 06:12:01 INFO duration: 1.0826246738433838
05/07/2023 06:12:01 INFO avg loss: 4.788446298422524
05/07/2023 06:15:43 INFO Quantizing mlp.dense_4h_to_h in layer 16/16...
05/07/2023 06:15:49 INFO duration: 6.170569658279419
05/07/2023 06:15:49 INFO avg loss: 1.7897084716536875
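The per-layer pass above, together with the 4bit-128g suffix on the save directory, matches a standard AutoGPTQ run. A hedged reconstruction of the call that would produce logs like these — bits and group_size follow the directory name, the save path is from the log, and the remaining options (checkpoint id, warm-up flags) are assumptions — with `examples` as sketched earlier:

```python
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

# bits/group_size inferred from the "4bit-128g" directory name.
quantize_config = BaseQuantizeConfig(bits=4, group_size=128)

model = AutoGPTQForCausalLM.from_pretrained(
    "OpenAssistant/stablelm-7b-sft-v7-epoch-3", quantize_config
)

# Emits the per-layer "duration" / "avg loss" lines above, then packs the model.
# The triton/warm-up flags are guesses suggested by the autotune warnings below.
model.quantize(examples, use_triton=True, autotune_warmup_after_quantized=True)

save_dir = "/home/pszemraj/workspace/misc-train/quantization/quantized-models/stablelm-7b-sft-v7-epoch-3-4bit-128g"
model.save_quantized(save_dir, use_safetensors=True)
```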
05/07/2023 06:16:11 INFO Packing model...
05/07/2023 06:16:11 INFO gpt_neox.layers.0.attention.dense
05/07/2023 06:16:12 INFO gpt_neox.layers.0.attention.query_key_value
05/07/2023 06:16:15 INFO gpt_neox.layers.0.mlp.dense_4h_to_h
05/07/2023 06:16:18 INFO gpt_neox.layers.0.mlp.dense_h_to_4h
05/07/2023 06:16:22 INFO gpt_neox.layers.1.attention.dense
05/07/2023 06:16:23 INFO gpt_neox.layers.1.attention.query_key_value
05/07/2023 06:16:26 INFO gpt_neox.layers.1.mlp.dense_4h_to_h
05/07/2023 06:16:29 INFO gpt_neox.layers.1.mlp.dense_h_to_4h
05/07/2023 06:16:33 INFO gpt_neox.layers.2.attention.dense
05/07/2023 06:16:34 INFO gpt_neox.layers.2.attention.query_key_value
05/07/2023 06:16:37 INFO gpt_neox.layers.2.mlp.dense_4h_to_h
05/07/2023 06:16:40 INFO gpt_neox.layers.2.mlp.dense_h_to_4h
05/07/2023 06:16:44 INFO gpt_neox.layers.3.attention.dense
05/07/2023 06:16:45 INFO gpt_neox.layers.3.attention.query_key_value
05/07/2023 06:16:48 INFO gpt_neox.layers.3.mlp.dense_4h_to_h
05/07/2023 06:16:51 INFO gpt_neox.layers.3.mlp.dense_h_to_4h
05/07/2023 06:16:56 INFO gpt_neox.layers.4.attention.dense
05/07/2023 06:16:56 INFO gpt_neox.layers.4.attention.query_key_value
05/07/2023 06:16:59 INFO gpt_neox.layers.4.mlp.dense_4h_to_h
05/07/2023 06:17:03 INFO gpt_neox.layers.4.mlp.dense_h_to_4h
05/07/2023 06:17:07 INFO gpt_neox.layers.5.attention.dense
05/07/2023 06:17:08 INFO gpt_neox.layers.5.attention.query_key_value
05/07/2023 06:17:11 INFO gpt_neox.layers.5.mlp.dense_4h_to_h
05/07/2023 06:17:14 INFO gpt_neox.layers.5.mlp.dense_h_to_4h
05/07/2023 06:17:18 INFO gpt_neox.layers.6.attention.dense
05/07/2023 06:17:19 INFO gpt_neox.layers.6.attention.query_key_value
05/07/2023 06:17:22 INFO gpt_neox.layers.6.mlp.dense_4h_to_h
05/07/2023 06:17:25 INFO gpt_neox.layers.6.mlp.dense_h_to_4h
05/07/2023 06:17:29 INFO gpt_neox.layers.7.attention.dense
05/07/2023 06:17:30 INFO gpt_neox.layers.7.attention.query_key_value
05/07/2023 06:17:33 INFO gpt_neox.layers.7.mlp.dense_4h_to_h
05/07/2023 06:17:36 INFO gpt_neox.layers.7.mlp.dense_h_to_4h
05/07/2023 06:17:40 INFO gpt_neox.layers.8.attention.dense
05/07/2023 06:17:41 INFO gpt_neox.layers.8.attention.query_key_value
05/07/2023 06:17:44 INFO gpt_neox.layers.8.mlp.dense_4h_to_h
05/07/2023 06:17:47 INFO gpt_neox.layers.8.mlp.dense_h_to_4h
05/07/2023 06:17:51 INFO gpt_neox.layers.9.attention.dense
05/07/2023 06:17:52 INFO gpt_neox.layers.9.attention.query_key_value
05/07/2023 06:17:55 INFO gpt_neox.layers.9.mlp.dense_4h_to_h
05/07/2023 06:17:58 INFO gpt_neox.layers.9.mlp.dense_h_to_4h
05/07/2023 06:18:02 INFO gpt_neox.layers.10.attention.dense
05/07/2023 06:18:03 INFO gpt_neox.layers.10.attention.query_key_value
05/07/2023 06:18:06 INFO gpt_neox.layers.10.mlp.dense_4h_to_h
05/07/2023 06:18:09 INFO gpt_neox.layers.10.mlp.dense_h_to_4h
05/07/2023 06:18:13 INFO gpt_neox.layers.11.attention.dense
05/07/2023 06:18:14 INFO gpt_neox.layers.11.attention.query_key_value
05/07/2023 06:18:17 INFO gpt_neox.layers.11.mlp.dense_4h_to_h
05/07/2023 06:18:20 INFO gpt_neox.layers.11.mlp.dense_h_to_4h
05/07/2023 06:18:24 INFO gpt_neox.layers.12.attention.dense
05/07/2023 06:18:25 INFO gpt_neox.layers.12.attention.query_key_value
05/07/2023 06:18:28 INFO gpt_neox.layers.12.mlp.dense_4h_to_h
05/07/2023 06:18:31 INFO gpt_neox.layers.12.mlp.dense_h_to_4h
05/07/2023 06:18:35 INFO gpt_neox.layers.13.attention.dense
05/07/2023 06:18:36 INFO gpt_neox.layers.13.attention.query_key_value
05/07/2023 06:18:39 INFO gpt_neox.layers.13.mlp.dense_4h_to_h
05/07/2023 06:18:42 INFO gpt_neox.layers.13.mlp.dense_h_to_4h
05/07/2023 06:18:46 INFO gpt_neox.layers.14.attention.dense
05/07/2023 06:18:47 INFO gpt_neox.layers.14.attention.query_key_value
05/07/2023 06:18:50 INFO gpt_neox.layers.14.mlp.dense_4h_to_h
05/07/2023 06:18:53 INFO gpt_neox.layers.14.mlp.dense_h_to_4h
05/07/2023 06:18:57 INFO gpt_neox.layers.15.attention.dense
05/07/2023 06:18:58 INFO gpt_neox.layers.15.attention.query_key_value
05/07/2023 06:19:01 INFO gpt_neox.layers.15.mlp.dense_4h_to_h
05/07/2023 06:19:04 INFO gpt_neox.layers.15.mlp.dense_h_to_4h
05/07/2023 06:19:08 INFO Model packed.
05/07/2023 06:19:08 WARNING using autotune_warmup will move model to GPU, make sure you have enough VRAM to load the whole model.
05/07/2023 06:19:09 INFO Found 4 unique KN Linear values.
05/07/2023 06:19:09 INFO Warming up autotune cache ...
05/07/2023 06:19:58 INFO Done! Saving..
05/07/2023 06:20:05 INFO Saved. Size of the model file(s): 10063.64 MB
05/07/2023 06:20:05 WARNING use_triton will force moving the whole model to GPU, make sure you have enough VRAM.
05/07/2023 06:20:05 INFO embed_out not been quantized, will be ignored when make_quant.
05/07/2023 06:20:06 WARNING The safetensors archive passed at /home/pszemraj/workspace/misc-train/quantization/quantized-models/stablelm-7b-sft-v7-epoch-3-4bit-128g/gptq_model-4bit-128g.safetensors does not contain metadata. Make sure to save your model with the `save_pretrained` method. Defaulting to 'pt' metadata.
05/07/2023 06:20:06 INFO Found 4 unique KN Linear values.
05/07/2023 06:20:06 INFO Warming up autotune cache ...
05/07/2023 06:20:07 INFO Sample output: ('Because woodchucks (or squirrels, as they\'re also known) are "the chink[e] ' 'of wood."')
05/07/2023 06:20:07 INFO GPU memory usage during test inference: 4.61 GB
05/07/2023 06:20:09 WARNING use_triton will force moving the whole model to GPU, make sure you have enough VRAM.
05/07/2023 06:20:09 INFO embed_out not been quantized, will be ignored when make_quant.
05/07/2023 06:20:09 WARNING The safetensors archive passed at /home/pszemraj/workspace/misc-train/quantization/quantized-models/stablelm-7b-sft-v7-epoch-3-4bit-128g/gptq_model-4bit-128g.safetensors does not contain metadata. Make sure to save your model with the `save_pretrained` method. Defaulting to 'pt' metadata.
05/07/2023 06:20:10 INFO Found 4 unique KN Linear values.
05/07/2023 06:20:10 INFO Warming up autotune cache ...
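The use_triton warning, the embed_out notice, the sample completion, and the memory figure above correspond to reloading the quantized checkpoint and running a test generation. A sketch of that step — only the path and the use_triton behavior are from the log; the prompt, generation settings, and memory accounting are assumptions (the woodchuck prompt is merely implied by the reply):

```python
import torch
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

save_dir = "/home/pszemraj/workspace/misc-train/quantization/quantized-models/stablelm-7b-sft-v7-epoch-3-4bit-128g"
tokenizer = AutoTokenizer.from_pretrained(save_dir)

# Triggers the "use_triton will force moving the whole model to GPU" warning
# and the autotune-cache warmup seen above.
model = AutoGPTQForCausalLM.from_quantized(save_dir, device="cuda:0", use_triton=True)

prompt = "Why would a woodchuck chuck wood?"  # assumed; the log records only the reply
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
print(f"GPU memory: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")
```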
05/07/2023 06:31:04 WARNING use_triton will force moving the whole model to GPU, make sure you have enough VRAM.
05/07/2023 06:31:04 INFO embed_out not been quantized, will be ignored when make_quant.
05/07/2023 06:31:04 WARNING The safetensors archive passed at /home/pszemraj/workspace/misc-train/quantization/quantized-models/stablelm-7b-sft-v7-epoch-3-4bit-128g/gptq_model-4bit-128g.safetensors does not contain metadata. Make sure to save your model with the `save_pretrained` method. Defaulting to 'pt' metadata.
05/07/2023 06:31:05 INFO Found 4 unique KN Linear values.
05/07/2023 06:31:05 INFO Warming up autotune cache ...
05/07/2023 06:31:46 WARNING use_triton will force moving the whole model to GPU, make sure you have enough VRAM.
05/07/2023 06:31:46 INFO embed_out not been quantized, will be ignored when make_quant.
05/07/2023 06:31:46 WARNING The safetensors archive passed at /home/pszemraj/workspace/misc-train/quantization/quantized-models/stablelm-7b-sft-v7-epoch-3-4bit-128g/gptq_model-4bit-128g.safetensors does not contain metadata. Make sure to save your model with the `save_pretrained` method. Defaulting to 'pt' metadata.
05/07/2023 06:31:46 INFO Found 4 unique KN Linear values.
05/07/2023 06:31:46 INFO Warming up autotune cache ...
05/07/2023 06:32:16 WARNING use_triton will force moving the whole model to GPU, make sure you have enough VRAM.
05/07/2023 06:32:16 INFO embed_out not been quantized, will be ignored when make_quant.
05/07/2023 06:32:16 WARNING The safetensors archive passed at /home/pszemraj/workspace/misc-train/quantization/quantized-models/stablelm-7b-sft-v7-epoch-3-4bit-128g/gptq_model-4bit-128g.safetensors does not contain metadata. Make sure to save your model with the `save_pretrained` method. Defaulting to 'pt' metadata.
05/07/2023 06:32:16 INFO Found 4 unique KN Linear values.
05/07/2023 06:32:16 INFO Warming up autotune cache ...
05/07/2023 06:32:42 WARNING use_triton will force moving the whole model to GPU, make sure you have enough VRAM.
05/07/2023 06:32:42 INFO embed_out not been quantized, will be ignored when make_quant.
05/07/2023 06:32:42 WARNING The safetensors archive passed at /home/pszemraj/workspace/misc-train/quantization/quantized-models/stablelm-7b-sft-v7-epoch-3-4bit-128g/gptq_model-4bit-128g.safetensors does not contain metadata. Make sure to save your model with the `save_pretrained` method. Defaulting to 'pt' metadata.
05/07/2023 06:32:42 INFO Found 4 unique KN Linear values.
05/07/2023 06:32:42 INFO Warming up autotune cache ...
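The recurring "does not contain metadata" warning on each of the reloads above is cosmetic: the safetensors archive was written without a format field, so the loader assumes 'pt' and says so every time. One way to silence it is to rewrite the archive with explicit metadata; a one-off repair sketch (the path is from the log; upgrading to a newer AutoGPTQ release that attaches metadata on save may also resolve it):

```python
from safetensors.torch import load_file, save_file

path = (
    "/home/pszemraj/workspace/misc-train/quantization/quantized-models/"
    "stablelm-7b-sft-v7-epoch-3-4bit-128g/gptq_model-4bit-128g.safetensors"
)
tensors = load_file(path)
# Re-save in place with the 'pt' format tag the loader was defaulting to.
save_file(tensors, path, metadata={"format": "pt"})
```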