End of training

Files changed:
- README.md +6 -34
- compressed_graph.dot +0 -0
- nncf_output.log +132 -0
- openvino_config.json +60 -0
- openvino_model.bin +3 -0
- openvino_model.xml +0 -0
- original_graph.dot +0 -0
- pytorch_model.bin +2 -2
- runs/Nov18_10-27-10_1d5d6d420ef6/events.out.tfevents.1700303301.1d5d6d420ef6.278.30 +3 -0
- training_args.bin +1 -1
README.md
CHANGED
@@ -5,24 +5,9 @@ tags:
 - generated_from_trainer
 datasets:
 - glue
-metrics:
-- accuracy
 model-index:
 - name: bert_uncased_L-6_H-768_A-12-QAT
-  results:
-  - task:
-      name: Text Classification
-      type: text-classification
-    dataset:
-      name: glue
-      type: glue
-      config: sst2
-      split: validation
-      args: sst2
-    metrics:
-    - name: Accuracy
-      type: accuracy
-      value: 0.9094036697247706
+  results: []
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -31,9 +16,6 @@ should probably proofread and complete it, then remove this comment. -->
 # bert_uncased_L-6_H-768_A-12-QAT
 
 This model is a fine-tuned version of [google/bert_uncased_L-6_H-768_A-12](https://huggingface.co/google/bert_uncased_L-6_H-768_A-12) on the glue dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.3239
-- Accuracy: 0.9094
 
 ## Model description
 
@@ -52,26 +34,16 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate:
-- train_batch_size:
-- eval_batch_size:
-- seed:
+- learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs:
-- mixed_precision_training: Native AMP
+- num_epochs: 1.0
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | Accuracy |
-|:-------------:|:-----:|:----:|:---------------:|:--------:|
-| 0.0485        | 1.0   | 527  | 0.3517          | 0.8819   |
-| 0.0862        | 2.0   | 1054 | 0.3239          | 0.9094   |
-| 0.0538        | 3.0   | 1581 | 0.2942          | 0.9083   |
-| 0.0354        | 4.0   | 2108 | 0.3710          | 0.9071   |
-| 0.0248        | 5.0   | 2635 | 0.3842          | 0.9002   |
-| 0.0152        | 6.0   | 3162 | 0.4606          | 0.8956   |
-| 0.0105        | 7.0   | 3689 | 0.5514          | 0.8979   |
 
 
 ### Framework versions
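The updated hyperparameter list maps onto a standard `transformers.Trainer` setup. A minimal sketch, not the script that produced this commit; the `output_dir` name is illustrative, and the Adam betas/epsilon listed above are simply the `Trainer` defaults:

```python
# Hedged sketch: TrainingArguments matching the hyperparameters in the README
# diff above. Not the actual training script; output_dir is an assumed name.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert_uncased_L-6_H-768_A-12-QAT",  # illustrative
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    num_train_epochs=1.0,
    lr_scheduler_type="linear",
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the defaults
    # (adam_beta1 / adam_beta2 / adam_epsilon), so no explicit flags needed.
)
```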
compressed_graph.dot
ADDED
The diff for this file is too large to render. See raw diff.
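This file (and `original_graph.dot`, added further below) is NNCF's Graphviz dump of the network graph, before and after the quantization ops are inserted. The dumps are too large for the web diff viewer but render locally. A minimal sketch, assuming the `graphviz` Python package and the Graphviz `dot` binary are installed:

```python
# Hedged sketch: render NNCF's .dot graph dumps to SVG for inspection.
# Equivalent to running `dot -Tsvg compressed_graph.dot` on the CLI.
import graphviz

for name in ("original_graph", "compressed_graph"):
    graphviz.Source.from_file(f"{name}.dot").render(name, format="svg")
```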
nncf_output.log
ADDED
@@ -0,0 +1,132 @@
+INFO:nncf:Not adding activation input quantizer for operation: 7 BertForSequenceClassification/BertModel[bert]/BertEmbeddings[embeddings]/NNCFEmbedding[position_embeddings]/embedding_0
+INFO:nncf:Not adding activation input quantizer for operation: 4 BertForSequenceClassification/BertModel[bert]/BertEmbeddings[embeddings]/NNCFEmbedding[word_embeddings]/embedding_0
+INFO:nncf:Not adding activation input quantizer for operation: 5 BertForSequenceClassification/BertModel[bert]/BertEmbeddings[embeddings]/NNCFEmbedding[token_type_embeddings]/embedding_0
+INFO:nncf:Not adding activation input quantizer for operation: 6 BertForSequenceClassification/BertModel[bert]/BertEmbeddings[embeddings]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 8 BertForSequenceClassification/BertModel[bert]/BertEmbeddings[embeddings]/__iadd___0
+INFO:nncf:Not adding activation input quantizer for operation: 9 BertForSequenceClassification/BertModel[bert]/BertEmbeddings[embeddings]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 10 BertForSequenceClassification/BertModel[bert]/BertEmbeddings[embeddings]/Dropout[dropout]/dropout_0
+INFO:nncf:Not adding activation input quantizer for operation: 23 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertAttention[attention]/BertSelfAttention[self]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 26 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
+INFO:nncf:Not adding activation input quantizer for operation: 32 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertAttention[attention]/BertSelfOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 33 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 38 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 39 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[0]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 52 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertAttention[attention]/BertSelfAttention[self]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 55 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
+INFO:nncf:Not adding activation input quantizer for operation: 61 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertAttention[attention]/BertSelfOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 62 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 67 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 68 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[1]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 81 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertAttention[attention]/BertSelfAttention[self]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 84 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
+INFO:nncf:Not adding activation input quantizer for operation: 90 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertAttention[attention]/BertSelfOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 91 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 96 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 97 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[2]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 110 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertAttention[attention]/BertSelfAttention[self]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 113 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
+INFO:nncf:Not adding activation input quantizer for operation: 119 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertAttention[attention]/BertSelfOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 120 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 125 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 126 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[3]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 139 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertAttention[attention]/BertSelfAttention[self]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 142 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
+INFO:nncf:Not adding activation input quantizer for operation: 148 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertAttention[attention]/BertSelfOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 149 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 154 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 155 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[4]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 168 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertAttention[attention]/BertSelfAttention[self]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 171 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertAttention[attention]/BertSelfAttention[self]/matmul_1
+INFO:nncf:Not adding activation input quantizer for operation: 177 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertAttention[attention]/BertSelfOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 178 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertAttention[attention]/BertSelfOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Not adding activation input quantizer for operation: 183 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertOutput[output]/__add___0
+INFO:nncf:Not adding activation input quantizer for operation: 184 BertForSequenceClassification/BertModel[bert]/BertEncoder[encoder]/ModuleList[layer]/BertLayer[5]/BertOutput[output]/NNCFLayerNorm[LayerNorm]/layer_norm_0
+INFO:nncf:Collecting tensor statistics |█               | 4 / 38
+INFO:nncf:Collecting tensor statistics |███             | 8 / 38
+INFO:nncf:Collecting tensor statistics |█████           | 12 / 38
+INFO:nncf:Collecting tensor statistics |██████          | 16 / 38
+INFO:nncf:Collecting tensor statistics |████████        | 20 / 38
+INFO:nncf:Collecting tensor statistics |██████████      | 24 / 38
+INFO:nncf:Collecting tensor statistics |███████████     | 28 / 38
+INFO:nncf:Collecting tensor statistics |█████████████   | 32 / 38
+INFO:nncf:Collecting tensor statistics |███████████████ | 36 / 38
+INFO:nncf:Collecting tensor statistics |████████████████| 38 / 38
+INFO:nncf:Compiling and loading torch extension: quantized_functions_cuda...
+INFO:nncf:Finished loading torch extension: quantized_functions_cuda
+WARNING:nncf:You are setting `forward` on an NNCF-processed model object.
+NNCF relies on custom-wrapping the `forward` call in order to function properly.
+Arbitrary adjustments to the forward function on an NNCFNetwork object have undefined behavior.
+If you need to replace the underlying forward function of the original model so that NNCF should be using that instead of the original forward function that NNCF saved during the compressed model creation, you can do this by calling:
+model.nncf.set_original_unbound_forward(fn)
+if `fn` has an unbound 0-th `self` argument, or
+with model.nncf.temporary_bound_original_forward(fn): ...
+if `fn` already had 0-th `self` argument bound or never had it in the first place.
+WARNING:nncf:You are setting `forward` on an NNCF-processed model object.
+NNCF relies on custom-wrapping the `forward` call in order to function properly.
+Arbitrary adjustments to the forward function on an NNCFNetwork object have undefined behavior.
+If you need to replace the underlying forward function of the original model so that NNCF should be using that instead of the original forward function that NNCF saved during the compressed model creation, you can do this by calling:
+model.nncf.set_original_unbound_forward(fn)
+if `fn` has an unbound 0-th `self` argument, or
+with model.nncf.temporary_bound_original_forward(fn): ...
+if `fn` already had 0-th `self` argument bound or never had it in the first place.
+INFO:nncf:Statistics of the quantization algorithm:
+Epoch 0 |+--------------------------------+-------+
+Epoch 0 || Statistic's name               | Value |
+Epoch 0 |+================================+=======+
+Epoch 0 || Ratio of enabled quantizations | 100   |
+Epoch 0 |+--------------------------------+-------+
+Epoch 0 |
+Epoch 0 |Statistics of the quantization share:
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Statistic's name                 | Value              |
+Epoch 0 |+==================================+====================+
+Epoch 0 || Symmetric WQs / All placed WQs   | 100.00 % (38 / 38) |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Asymmetric WQs / All placed WQs  | 0.00 % (0 / 38)    |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Signed WQs / All placed WQs      | 100.00 % (38 / 38) |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Unsigned WQs / All placed WQs    | 0.00 % (0 / 38)    |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Per-tensor WQs / All placed WQs  | 0.00 % (0 / 38)    |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Per-channel WQs / All placed WQs | 100.00 % (38 / 38) |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Placed WQs / Potential WQs       | 70.37 % (38 / 54)  |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Symmetric AQs / All placed AQs   | 24.00 % (12 / 50)  |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Asymmetric AQs / All placed AQs  | 76.00 % (38 / 50)  |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Signed AQs / All placed AQs      | 100.00 % (50 / 50) |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Unsigned AQs / All placed AQs    | 0.00 % (0 / 50)    |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Per-tensor AQs / All placed AQs  | 100.00 % (50 / 50) |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 || Per-channel AQs / All placed AQs | 0.00 % (0 / 50)    |
+Epoch 0 |+----------------------------------+--------------------+
+Epoch 0 |
+Epoch 0 |Statistics of the bitwidth distribution:
+Epoch 0 |+--------------+---------------------+--------------------+--------------------+
+Epoch 0 || Num bits (N) | N-bits WQs / Placed | N-bits AQs /       | N-bits Qs / Placed |
+Epoch 0 ||              | WQs                 | Placed AQs         | Qs                 |
+Epoch 0 |+==============+=====================+====================+====================+
+Epoch 0 || 8            | 100.00 % (38 / 38)  | 100.00 % (50 / 50) | 100.00 % (88 / 88) |
+Epoch 0 |+--------------+---------------------+--------------------+--------------------+
+WARNING:nncf:You are setting `forward` on an NNCF-processed model object.
+NNCF relies on custom-wrapping the `forward` call in order to function properly.
+Arbitrary adjustments to the forward function on an NNCFNetwork object have undefined behavior.
+If you need to replace the underlying forward function of the original model so that NNCF should be using that instead of the original forward function that NNCF saved during the compressed model creation, you can do this by calling:
+model.nncf.set_original_unbound_forward(fn)
+if `fn` has an unbound 0-th `self` argument, or
+with model.nncf.temporary_bound_original_forward(fn): ...
+if `fn` already had 0-th `self` argument bound or never had it in the first place.
+WARNING:nncf:You are setting `forward` on an NNCF-processed model object.
+NNCF relies on custom-wrapping the `forward` call in order to function properly.
+Arbitrary adjustments to the forward function on an NNCFNetwork object have undefined behavior.
+If you need to replace the underlying forward function of the original model so that NNCF should be using that instead of the original forward function that NNCF saved during the compressed model creation, you can do this by calling:
+model.nncf.set_original_unbound_forward(fn)
+if `fn` has an unbound 0-th `self` argument, or
+with model.nncf.temporary_bound_original_forward(fn): ...
+if `fn` already had 0-th `self` argument bound or never had it in the first place.
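The log above shows NNCF preparing the model for quantization-aware training: each "Not adding activation input quantizer" line is a scope matched by the `ignored_scopes` patterns in `openvino_config.json` below (embeddings, residual `__add__`s, LayerNorms, the second attention matmul), which stays in floating point. This commit was produced through optimum-intel's QAT integration, but a minimal sketch of the underlying raw NNCF call that emits this kind of output, under that assumption, looks roughly like:

```python
# Hedged sketch of the raw NNCF API, for illustration only; not the actual
# optimum-intel training script behind this commit.
from nncf import NNCFConfig
from nncf.torch import create_compressed_model
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "google/bert_uncased_L-6_H-768_A-12", num_labels=2
)

nncf_config = NNCFConfig.from_dict({
    # Shapes mirror the "input_info" entries in openvino_config.json below.
    "input_info": [
        {"keyword": "input_ids", "sample_size": [8, 56], "type": "long"},
        {"keyword": "token_type_ids", "sample_size": [8, 56], "type": "long"},
        {"keyword": "attention_mask", "sample_size": [8, 56], "type": "long"},
    ],
    "compression": {
        "algorithm": "quantization",
        # Scopes matching these regexes are left unquantized; each match is
        # reported as a "Not adding activation input quantizer" log line.
        "ignored_scopes": [
            "{re}.*Embedding.*",
            "{re}.*add___.*",
            "{re}.*layer_norm_.*",
            "{re}.*matmul_1",
        ],
    },
})

# Inserts fake-quantize ops and returns a controller; ctrl.statistics() backs
# the "Statistics of the quantization algorithm" tables in the log. With an
# init dataloader registered via nncf.torch.register_default_init_args, NNCF
# would also run the "Collecting tensor statistics" range-initialization pass.
compression_ctrl, quantized_model = create_compressed_model(model, nncf_config)
```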
openvino_config.json
ADDED
@@ -0,0 +1,60 @@
+{
+  "compression": {
+    "algorithm": "quantization",
+    "export_to_onnx_standard_ops": false,
+    "ignored_scopes": [
+      "{re}.*Embedding.*",
+      "{re}.*add___.*",
+      "{re}.*layer_norm_.*",
+      "{re}.*matmul_1",
+      "{re}.*__truediv__.*"
+    ],
+    "initializer": {
+      "batchnorm_adaptation": {
+        "num_bn_adaptation_samples": 0
+      },
+      "range": {
+        "num_init_samples": 300,
+        "type": "mean_min_max"
+      }
+    },
+    "overflow_fix": "disable",
+    "preset": "mixed",
+    "scope_overrides": {
+      "activations": {
+        "{re}.*matmul_0": {
+          "mode": "symmetric"
+        }
+      }
+    }
+  },
+  "input_info": [
+    {
+      "keyword": "input_ids",
+      "sample_size": [
+        8,
+        56
+      ],
+      "type": "long"
+    },
+    {
+      "keyword": "token_type_ids",
+      "sample_size": [
+        8,
+        56
+      ],
+      "type": "long"
+    },
+    {
+      "keyword": "attention_mask",
+      "sample_size": [
+        8,
+        56
+      ],
+      "type": "long"
+    }
+  ],
+  "optimum_version": "1.14.1",
+  "save_onnx_model": false,
+  "transformers_version": "4.35.2"
+}
openvino_model.bin
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cececd27465ec44366c049c2cf0fdae43919d51eb664a80b8ba5a99b5f6ff3bb
+size 138739260
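Together with the `openvino_model.xml` topology file listed next, this weights blob forms the exported OpenVINO IR, which can be loaded for inference through optimum-intel. A minimal usage sketch, with `<repo-id>` standing in for this repository's Hub id (the owner is not shown in the diff):

```python
# Hedged sketch: running the exported INT8 OpenVINO model via optimum-intel.
# "<repo-id>" is a placeholder for this model's Hub id.
from optimum.intel import OVModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

ov_model = OVModelForSequenceClassification.from_pretrained("<repo-id>")
tokenizer = AutoTokenizer.from_pretrained("<repo-id>")

classifier = pipeline("text-classification", model=ov_model, tokenizer=tokenizer)
print(classifier("a charming and often affecting journey"))  # SST-2-style input
```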
openvino_model.xml
ADDED
The diff for this file is too large to render. See raw diff.

original_graph.dot
ADDED
The diff for this file is too large to render. See raw diff.
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:9714306a601a5684ca4e944f902b17302cd9c4b25181704adf26efbf008cee23
+size 268184942
runs/Nov18_10-27-10_1d5d6d420ef6/events.out.tfevents.1700303301.1d5d6d420ef6.278.30
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:ac92d762c8a0af0b2386ba7e54e70a51212b6c964489c8834b35c1f720d15da5
+size 4574
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:a946ae40b84094d79d9657770c80899b58573ab90fae3025de9580edc1380fc6
 size 4600
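`training_args.bin` is the pickled `TrainingArguments` object, so the hyperparameters recorded in the README diff can be cross-checked directly from the artifact. A small sketch, assuming the file is trusted (unpickling executes code):

```python
# Hedged sketch: inspect the pickled TrainingArguments in training_args.bin.
# Only unpickle files you trust; on torch >= 2.6 pass weights_only=False.
import torch

args = torch.load("training_args.bin")
print(args.learning_rate, args.per_device_train_batch_size, args.num_train_epochs)
```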