qgyd2021/lip_service_4chan

+---
+base_model: uer/gpt2-chinese-cluecorpussmall
+tags:
+- generated_from_trainer
+datasets:
+- lip_service4chan
+model-index:
+- name: lib_service_4chan
+  results: []
+---
+# lib_service_4chan
+This model is a fine-tuned version of [uer/gpt2-chinese-cluecorpussmall](https://huggingface.co/uer/gpt2-chinese-cluecorpussmall) on the lip_service4chan dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.8635
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0002
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 2
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
+- total_eval_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 1000
+- num_epochs: 1.0
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss |
+|:-------------:|:-----:|:-----:|:---------------:|
+| 2.716         | 0.01  | 100   | 1.9495          |
+| 1.8985        | 0.02  | 200   | 1.6915          |
+| 1.7151        | 0.02  | 300   | 1.5763          |
+| 1.6217        | 0.03  | 400   | 1.5115          |
+| 1.564         | 0.04  | 500   | 1.4694          |
+| 1.5461        | 0.05  | 600   | 1.4379          |
+| 1.4943        | 0.06  | 700   | 1.4127          |
+| 1.4737        | 0.07  | 800   | 1.3890          |
+| 1.4399        | 0.07  | 900   | 1.3813          |
+| 1.4356        | 0.08  | 1000  | 1.3540          |
+| 1.3999        | 0.09  | 1100  | 1.3329          |
+| 1.3668        | 0.1   | 1200  | 1.3153          |
+| 1.3604        | 0.11  | 1300  | 1.3029          |
+| 1.3352        | 0.12  | 1400  | 1.2834          |
+| 1.3278        | 0.12  | 1500  | 1.2619          |
+| 1.315         | 0.13  | 1600  | 1.2539          |
+| 1.2854        | 0.14  | 1700  | 1.2432          |
+| 1.292         | 0.15  | 1800  | 1.2288          |
+| 1.2795        | 0.16  | 1900  | 1.2188          |
+| 1.2677        | 0.16  | 2000  | 1.2059          |
+| 1.2599        | 0.17  | 2100  | 1.2019          |
+| 1.2479        | 0.18  | 2200  | 1.1915          |
+| 1.2245        | 0.19  | 2300  | 1.1827          |
+| 1.2326        | 0.2   | 2400  | 1.1734          |
+| 1.2124        | 0.21  | 2500  | 1.1660          |
+| 1.2171        | 0.21  | 2600  | 1.1576          |
+| 1.1917        | 0.22  | 2700  | 1.1518          |
+| 1.1867        | 0.23  | 2800  | 1.1444          |
+| 1.1821        | 0.24  | 2900  | 1.1386          |
+| 1.1741        | 0.25  | 3000  | 1.1347          |
+| 1.1753        | 0.25  | 3100  | 1.1293          |
+| 1.1629        | 0.26  | 3200  | 1.1264          |
+| 1.1694        | 0.27  | 3300  | 1.1201          |
+| 1.1482        | 0.28  | 3400  | 1.1146          |
+| 1.156         | 0.29  | 3500  | 1.1052          |
+| 1.1512        | 0.3   | 3600  | 1.0982          |
+| 1.142         | 0.3   | 3700  | 1.0971          |
+| 1.1544        | 0.31  | 3800  | 1.0920          |
+| 1.1312        | 0.32  | 3900  | 1.0869          |
+| 1.1394        | 0.33  | 4000  | 1.0808          |
+| 1.123         | 0.34  | 4100  | 1.0747          |
+| 1.1154        | 0.35  | 4200  | 1.0715          |
+| 1.1064        | 0.35  | 4300  | 1.0674          |
+| 1.1245        | 0.36  | 4400  | 1.0620          |
+| 1.1036        | 0.37  | 4500  | 1.0575          |
+| 1.0963        | 0.38  | 4600  | 1.0568          |
+| 1.0987        | 0.39  | 4700  | 1.0491          |
+| 1.0859        | 0.39  | 4800  | 1.0443          |
+| 1.0845        | 0.4   | 4900  | 1.0432          |
+| 1.0938        | 0.41  | 5000  | 1.0410          |
+| 1.087         | 0.42  | 5100  | 1.0334          |
+| 1.077         | 0.43  | 5200  | 1.0324          |
+| 1.0787        | 0.44  | 5300  | 1.0276          |
+| 1.068         | 0.44  | 5400  | 1.0220          |
+| 1.0748        | 0.45  | 5500  | 1.0199          |
+| 1.0622        | 0.46  | 5600  | 1.0169          |
+| 1.0555        | 0.47  | 5700  | 1.0153          |
+| 1.0498        | 0.48  | 5800  | 1.0100          |
+| 1.055         | 0.49  | 5900  | 1.0074          |
+| 1.0424        | 0.49  | 6000  | 1.0020          |
+| 1.0465        | 0.5   | 6100  | 0.9976          |
+| 1.0414        | 0.51  | 6200  | 0.9942          |
+| 1.0355        | 0.52  | 6300  | 0.9919          |
+| 1.0234        | 0.53  | 6400  | 0.9883          |
+| 1.0205        | 0.53  | 6500  | 0.9857          |
+| 1.0316        | 0.54  | 6600  | 0.9805          |
+| 1.0137        | 0.55  | 6700  | 0.9788          |
+| 1.0222        | 0.56  | 6800  | 0.9773          |
+| 1.0219        | 0.57  | 6900  | 0.9722          |
+| 1.0032        | 0.58  | 7000  | 0.9706          |
+| 1.0039        | 0.58  | 7100  | 0.9669          |
+| 1.0166        | 0.59  | 7200  | 0.9635          |
+| 1.0065        | 0.6   | 7300  | 0.9614          |
+| 1.0087        | 0.61  | 7400  | 0.9574          |
+| 0.9968        | 0.62  | 7500  | 0.9525          |
+| 1.0031        | 0.62  | 7600  | 0.9503          |
+| 0.99          | 0.63  | 7700  | 0.9491          |
+| 0.9946        | 0.64  | 7800  | 0.9457          |
+| 0.9944        | 0.65  | 7900  | 0.9424          |
+| 0.9854        | 0.66  | 8000  | 0.9399          |
+| 0.9797        | 0.67  | 8100  | 0.9364          |
+| 0.9804        | 0.67  | 8200  | 0.9341          |
+| 0.9835        | 0.68  | 8300  | 0.9318          |
+| 0.9849        | 0.69  | 8400  | 0.9299          |
+| 0.9753        | 0.7   | 8500  | 0.9274          |
+| 0.975         | 0.71  | 8600  | 0.9238          |
+| 0.9649        | 0.72  | 8700  | 0.9225          |
+| 0.9654        | 0.72  | 8800  | 0.9202          |
+| 0.958         | 0.73  | 8900  | 0.9167          |
+| 0.9679        | 0.74  | 9000  | 0.9143          |
+| 0.9631        | 0.75  | 9100  | 0.9110          |
+| 0.9633        | 0.76  | 9200  | 0.9086          |
+| 0.9495        | 0.76  | 9300  | 0.9071          |
+| 0.9625        | 0.77  | 9400  | 0.9036          |
+| 0.9519        | 0.78  | 9500  | 0.9023          |
+| 0.9399        | 0.79  | 9600  | 0.8993          |
+| 0.9624        | 0.8   | 9700  | 0.8973          |
+| 0.9418        | 0.81  | 9800  | 0.8963          |
+| 0.9394        | 0.81  | 9900  | 0.8933          |
+| 0.947         | 0.82  | 10000 | 0.8919          |
+| 0.9326        | 0.83  | 10100 | 0.8900          |
+| 0.9326        | 0.84  | 10200 | 0.8886          |
+| 0.9343        | 0.85  | 10300 | 0.8860          |
+| 0.9263        | 0.85  | 10400 | 0.8841          |
+| 0.9256        | 0.86  | 10500 | 0.8818          |
+| 0.9373        | 0.87  | 10600 | 0.8807          |
+| 0.9314        | 0.88  | 10700 | 0.8789          |
+| 0.9203        | 0.89  | 10800 | 0.8770          |
+| 0.927         | 0.9   | 10900 | 0.8754          |
+| 0.934         | 0.9   | 11000 | 0.8744          |
+| 0.9193        | 0.91  | 11100 | 0.8727          |
+| 0.9185        | 0.92  | 11200 | 0.8714          |
+| 0.9188        | 0.93  | 11300 | 0.8702          |
+| 0.9165        | 0.94  | 11400 | 0.8693          |
+| 0.9209        | 0.95  | 11500 | 0.8682          |
+| 0.9241        | 0.95  | 11600 | 0.8670          |
+| 0.9182        | 0.96  | 11700 | 0.8662          |
+| 0.9076        | 0.97  | 11800 | 0.8653          |
+| 0.9225        | 0.98  | 11900 | 0.8643          |
+| 0.9094        | 0.99  | 12000 | 0.8640          |
+| 0.913         | 0.99  | 12100 | 0.8635          |
+### Framework versions
+- Transformers 4.33.0
+- Pytorch 2.0.0
+- Datasets 2.1.0
+- Tokenizers 0.13.3
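The totals in the hyperparameter list follow from the per-device settings, and the final evaluation loss can also be read as a perplexity. The sketch below reproduces that arithmetic. It assumes, since the card does not say so, that the reported loss is the mean per-token cross-entropy in nats and that each logged step is one optimizer update. All variable names are illustrative.

```python
import math

# Values copied from the hyperparameter list and the last row of the results table.
train_batch_size = 4            # per device
eval_batch_size = 8             # per device
num_devices = 2
gradient_accumulation_steps = 4
final_step = 12100
final_eval_loss = 0.8635

# Effective batch sizes: per-device size times device count
# (times accumulation steps for training only).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# Rough number of training examples consumed by the last logged step,
# ASSUMING one logged step equals one optimizer update.
examples_seen = total_train_batch_size * final_step

# Perplexity, ASSUMING the loss is mean per-token cross-entropy in nats.
perplexity = math.exp(final_eval_loss)

print(total_train_batch_size, total_eval_batch_size, examples_seen)
print(round(perplexity, 2))
```

Under these assumptions the computed totals match the card's `total_train_batch_size: 32` and `total_eval_batch_size: 16`, and the final loss corresponds to a perplexity of roughly 2.37.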

generation_config.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "_from_model_config": true,
+  "bos_token_id": 50256,
+  "eos_token_id": 50256,
+  "transformers_version": "4.33.0"
+}
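The `bos_token_id` and `eos_token_id` values above equal the default id used by the original English GPT-2 configuration (50256). Whether that id exists in this model's vocabulary depends on the tokenizer of the base checkpoint, which the diff does not show. The sketch below is one way to check. The vocabulary size of 21128 is an assumption about the base model and should be verified against its `config.json`, and the helper name `out_of_vocab_token_ids` is hypothetical.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG = """
{
  "_from_model_config": true,
  "bos_token_id": 50256,
  "eos_token_id": 50256,
  "transformers_version": "4.33.0"
}
"""

def out_of_vocab_token_ids(config_text, vocab_size):
    """Return the *_token_id entries that fall outside [0, vocab_size)."""
    config = json.loads(config_text)
    return {
        key: value
        for key, value in config.items()
        if key.endswith("_token_id")
        and isinstance(value, int)
        and not 0 <= value < vocab_size
    }

# ASSUMPTION: vocabulary size of the base checkpoint; verify before relying on it.
ASSUMED_VOCAB_SIZE = 21128

suspect = out_of_vocab_token_ids(GENERATION_CONFIG, ASSUMED_VOCAB_SIZE)
print(suspect)
```

If the check reports entries, generation would never emit that end-of-sequence id, so it may be worth passing explicit stopping settings (for example a maximum length) when calling `generate`.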