huggingartists


Files changed (13)

README.md +20 -14
config.json +4 -2
evaluation.txt +1 -0
flax_model.msgpack +3 -0
optimizer.pt +1 -1
pytorch_model.bin +2 -2
rng_state.pth +1 -1
scheduler.pt +1 -1
special_tokens_map.json +5 -1
tokenizer.json +0 -0
tokenizer_config.json +10 -1
trainer_state.json +345 -7
training_args.bin +2 -2

README.md CHANGED

@@ -5,6 +5,8 @@ datasets:
 tags:
 - huggingartists
 - lyrics
 widget:
 - text: "I am"
 ---
@@ -12,7 +14,7 @@ widget:
 <div class="inline-flex flex-col" style="line-height: 1.5;">
     <div class="flex">
         <div
-			style="display:DISPLAY_1; margin-left: auto; margin-right: auto; width: 92px; height:92px; border-radius: 50%; background-size: cover; background-image: url(&#39;https://images.genius.com/c6b5142a09ff5bd361d0f42a55692edc.1000x1000x1.jpg&#39;)">
         </div>
     </div>
     <div style="text-align: center; margin-top: 3px; font-size: 16px; font-weight: 800">🤖 HuggingArtists Model 🤖</div>
@@ -43,25 +45,15 @@ from datasets import load_dataset
 dataset = load_dataset("huggingartists/drake")
 ```
-Or with Transformers library:
-```python
-from transformers import AutoTokenizer, AutoModelWithLMHead
-tokenizer = AutoTokenizer.from_pretrained("huggingartists/drake")
-model = AutoModelWithLMHead.from_pretrained("huggingartists/drake")
-```
-[Explore the data](https://wandb.ai/huggingartists/huggingartists/runs/1ba7t9q0/artifacts), which is tracked with [W&B artifacts](https://docs.wandb.com/artifacts) at every step of the pipeline.
 ## Training procedure
 The model is based on a pre-trained [GPT-2](https://huggingface.co/gpt2) which is fine-tuned on Drake's lyrics.
-Hyperparameters and metrics are recorded in the [W&B training run](https://wandb.ai/huggingartists/huggingartists/runs/18yhidb2) for full transparency and reproducibility.
-At the end of training, [the final model](https://wandb.ai/huggingartists/huggingartists/runs/18yhidb2/artifacts) is logged and versioned.
 ## How to use
@@ -74,6 +66,16 @@ generator = pipeline('text-generation',
 generator("I am", num_return_sequences=5)
 ```
 ## Limitations and bias
 The model suffers from [the same limitations and bias as GPT-2](https://huggingface.co/gpt2#limitations-and-bias).
@@ -86,6 +88,10 @@ In addition, the data present in the user's tweets further affects the text gene
 [![Follow](https://img.shields.io/github/followers/AlekseyKorshuk?style=social)](https://github.com/AlekseyKorshuk)
 For more details, visit the project repository.
 [![GitHub stars](https://img.shields.io/github/stars/AlekseyKorshuk/huggingartists?style=social)](https://github.com/AlekseyKorshuk/huggingartists)

 tags:
 - huggingartists
 - lyrics
+- lm-head
+- causal-lm
 widget:
 - text: "I am"
 ---
 <div class="inline-flex flex-col" style="line-height: 1.5;">
     <div class="flex">
         <div
+			style="display:DISPLAY_1; margin-left: auto; margin-right: auto; width: 92px; height:92px; border-radius: 50%; background-size: cover; background-image: url(&#39;https://images.genius.com/d45e35e82e356571a1d4a78dde726e23.1000x1000x1.png&#39;)">
         </div>
     </div>
     <div style="text-align: center; margin-top: 3px; font-size: 16px; font-weight: 800">🤖 HuggingArtists Model 🤖</div>
 dataset = load_dataset("huggingartists/drake")
 ```
+[Explore the data](https://wandb.ai/huggingartists/huggingartists/runs/zdmi3tvf/artifacts), which is tracked with [W&B artifacts](https://docs.wandb.com/artifacts) at every step of the pipeline.
 ## Training procedure
 The model is based on a pre-trained [GPT-2](https://huggingface.co/gpt2) which is fine-tuned on Drake's lyrics.
+Hyperparameters and metrics are recorded in the [W&B training run](https://wandb.ai/huggingartists/huggingartists/runs/26dol7sm) for full transparency and reproducibility.
+At the end of training, [the final model](https://wandb.ai/huggingartists/huggingartists/runs/26dol7sm/artifacts) is logged and versioned.
 ## How to use
 generator("I am", num_return_sequences=5)
 ```
+Or with Transformers library:
+```python
+from transformers import AutoTokenizer, AutoModelWithLMHead
+tokenizer = AutoTokenizer.from_pretrained("huggingartists/drake")
+model = AutoModelWithLMHead.from_pretrained("huggingartists/drake")
+```
 ## Limitations and bias
 The model suffers from [the same limitations and bias as GPT-2](https://huggingface.co/gpt2#limitations-and-bias).
 [![Follow](https://img.shields.io/github/followers/AlekseyKorshuk?style=social)](https://github.com/AlekseyKorshuk)
+[![Follow](https://img.shields.io/twitter/follow/alekseykorshuk?style=social)](https://twitter.com/intent/follow?screen_name=alekseykorshuk)
+[![Follow](https://img.shields.io/badge/dynamic/json?color=blue&label=Telegram%20Channel&query=%24.result&url=https%3A%2F%2Fapi.telegram.org%2Fbot1929545866%3AAAFGhV-KKnegEcLiyYJxsc4zV6C-bdPEBtQ%2FgetChatMemberCount%3Fchat_id%3D-1001253621662&style=social&logo=telegram)](https://t.me/joinchat/_CQ04KjcJ-4yZTky)
 For more details, visit the project repository.
 [![GitHub stars](https://img.shields.io/github/stars/AlekseyKorshuk/huggingartists?style=social)](https://github.com/AlekseyKorshuk/huggingartists)
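
Note on the Transformers snippet re-added to the README above: `AutoModelWithLMHead` is deprecated in recent Transformers releases in favor of task-specific auto classes, and `AutoModelForCausalLM` is the matching class for a GPT-2 checkpoint such as this one. The following is a minimal sketch, assuming the `transformers` package is installed and the Hub is reachable; the helper name `load_drake_model` is hypothetical and not part of this commit.

```python
def load_drake_model(repo_id="huggingartists/drake"):
    """Load the tokenizer and causal LM for a Hub repo (downloads weights)."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed; calling it does.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model
```

Calling `load_drake_model()` returns a `(tokenizer, model)` pair that can be used in place of the two objects created in the README snippet.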

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "gpt2",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"
@@ -18,7 +18,9 @@
   "n_inner": null,
   "n_layer": 12,
   "n_positions": 1024,
   "resid_pdrop": 0.1,
   "scale_attn_weights": true,
   "summary_activation": null,
   "summary_first_dropout": 0.1,
@@ -35,7 +37,7 @@
     }
   },
   "torch_dtype": "float32",
-  "transformers_version": "4.9.1",
   "use_cache": true,
   "vocab_size": 50257
 }

 {
+  "_name_or_path": "drake",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"
   "n_inner": null,
   "n_layer": 12,
   "n_positions": 1024,
+  "reorder_and_upcast_attn": false,
   "resid_pdrop": 0.1,
+  "scale_attn_by_inverse_layer_idx": false,
   "scale_attn_weights": true,
   "summary_activation": null,
   "summary_first_dropout": 0.1,
     }
   },
   "torch_dtype": "float32",
+  "transformers_version": "4.20.0",
   "use_cache": true,
   "vocab_size": 50257
 }

evaluation.txt ADDED

@@ -0,0 +1 @@
+{"eval_loss": 3.1339056491851807, "eval_runtime": 9.8906, "eval_samples_per_second": 45.498, "eval_steps_per_second": 5.763, "epoch": 2.0}
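
For readers who prefer perplexity to raw loss, the record added to evaluation.txt can be converted directly. This is a sketch under one assumption: that `eval_loss` is the mean per-token cross-entropy in nats, as is usual for causal language model evaluation. The variable names are illustrative only.

```python
import json
import math

# The single record added to evaluation.txt in this commit.
record = json.loads(
    '{"eval_loss": 3.1339056491851807, "eval_runtime": 9.8906, '
    '"eval_samples_per_second": 45.498, "eval_steps_per_second": 5.763, '
    '"epoch": 2.0}'
)

# Assumption: eval_loss is mean cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(record["eval_loss"])

# Approximate evaluation-set size implied by runtime times throughput.
approx_samples = round(record["eval_runtime"] * record["eval_samples_per_second"])

print(f"perplexity ~ {perplexity:.2f}, eval samples ~ {approx_samples}")
```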

flax_model.msgpack ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e9f955a88973413396f7ec4b9fc3d18ac455d9b19f440f5135cfa7291c70a600
+size 497764120
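
The binary files in this commit appear only as Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A small sketch of reading one, using the flax_model.msgpack pointer added here; `parse_lfs_pointer` is a hypothetical helper, not part of any library.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is a byte count
    return fields


pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e9f955a88973413396f7ec4b9fc3d18ac455d9b19f440f5135cfa7291c70a600
size 497764120
"""

info = parse_lfs_pointer(pointer)
size_mib = info["size"] / 2**20
print(f"{info['oid']} ({size_mib:.1f} MiB)")
```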

optimizer.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f54b613b898ec9cd602b10294cad4372affedb9070359bf8fb1e1efc0c38351b
 size 995604017

 version https://git-lfs.github.com/spec/v1
+oid sha256:510cf9d28093435a6f14d9c62457eacbdaf147c9a782198e086f9ae40697394f
 size 995604017

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9d646844d7a1967fa54609f85a5452b8eae50cafd2f88c1e934be317c5f71252
-size 510403817

 version https://git-lfs.github.com/spec/v1
+oid sha256:936dde6623deebfd8fa2cff18479909a6f6dd90daf7f48f9bf42d9a3df5c2744
+size 510396521

rng_state.pth CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0c479098b6ff1a7d6e1a01f9b8d8a54011a8cc009f5aa71d8d828ce9e600a222
 size 14567

 version https://git-lfs.github.com/spec/v1
+oid sha256:fe27f0f573405bd8961b9cdbc34715f4f83d3b82d9647c877952759bfcd31b44
 size 14567

scheduler.pt CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9dab3de057a89d68db2661a52fbf111edc97698719020db0127718fdc9badb3d
 size 623

 version https://git-lfs.github.com/spec/v1
+oid sha256:aa8f8a4ea2104deacbe371cff162540194a1c1633aaca63b65c5835d3a06f383
 size 623

special_tokens_map.json CHANGED

@@ -1 +1,5 @@
-{"bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "unk_token": "<|endoftext|>"}

+{
+  "bos_token": "<|endoftext|>",
+  "eos_token": "<|endoftext|>",
+  "unk_token": "<|endoftext|>"
+}

tokenizer.json CHANGED

The diff for this file is too large to render. See raw diff

tokenizer_config.json CHANGED

@@ -1 +1,10 @@
-{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", "eos_token": "<|endoftext|>", "add_prefix_space": false, "model_max_length": 1024, "special_tokens_map_file": null, "name_or_path": "gpt2", "tokenizer_class": "GPT2Tokenizer"}

+{
+  "add_prefix_space": false,
+  "bos_token": "<|endoftext|>",
+  "eos_token": "<|endoftext|>",
+  "model_max_length": 1024,
+  "name_or_path": "huggingartists/drake",
+  "special_tokens_map_file": null,
+  "tokenizer_class": "GPT2Tokenizer",
+  "unk_token": "<|endoftext|>"
+}
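
Because the tokenizer_config.json change mixes reformatting and key reordering with a real edit, the textual diff overstates what changed. A quick sketch that compares the removed and added versions as parsed JSON; both strings are copied from the hunk above, and the variable names are illustrative.

```python
import json

# Removed (single-line) version of tokenizer_config.json.
old = json.loads(
    '{"unk_token": "<|endoftext|>", "bos_token": "<|endoftext|>", '
    '"eos_token": "<|endoftext|>", "add_prefix_space": false, '
    '"model_max_length": 1024, "special_tokens_map_file": null, '
    '"name_or_path": "gpt2", "tokenizer_class": "GPT2Tokenizer"}'
)

# Added (pretty-printed, key-sorted) version.
new = json.loads("""
{
  "add_prefix_space": false,
  "bos_token": "<|endoftext|>",
  "eos_token": "<|endoftext|>",
  "model_max_length": 1024,
  "name_or_path": "huggingartists/drake",
  "special_tokens_map_file": null,
  "tokenizer_class": "GPT2Tokenizer",
  "unk_token": "<|endoftext|>"
}
""")

# Keys whose values differ between the two versions, as (old, new) pairs.
changed = {
    key: (old.get(key), new.get(key))
    for key in sorted(old.keys() | new.keys())
    if old.get(key) != new.get(key)
}
print(changed)
```

Only `name_or_path` differs in value (`"gpt2"` to `"huggingartists/drake"`); the rest of the hunk is formatting.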

trainer_state.json CHANGED

@@ -1,8 +1,8 @@
 {
-  "best_metric": null,
-  "best_model_checkpoint": null,
-  "epoch": 1.0,
-  "global_step": 386,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
@@ -468,11 +468,349 @@
       "learning_rate": 2.2720446338799106e-09,
       "loss": 3.2786,
       "step": 385
     }
   ],
-  "max_steps": 386,
-  "num_train_epochs": 1,
-  "total_flos": 403042959360000.0,
   "trial_name": null,
   "trial_params": null
 }

 {
+  "best_metric": 3.1339056491851807,
+  "best_model_checkpoint": "output/drake/checkpoint-660",
+  "epoch": 2.0,
+  "global_step": 660,
   "is_hyper_param_search": false,
   "is_local_process_zero": true,
   "is_world_process_zero": true,
       "learning_rate": 2.2720446338799106e-09,
       "loss": 3.2786,
       "step": 385
+    },
+    {
+      "epoch": 1.18,
+      "learning_rate": 1.0890007647780969e-05,
+      "loss": 3.2655,
+      "step": 390
+    },
+    {
+      "epoch": 1.2,
+      "learning_rate": 1.2720089689346961e-05,
+      "loss": 3.1689,
+      "step": 395
+    },
+    {
+      "epoch": 1.21,
+      "learning_rate": 1.4676757700644785e-05,
+      "loss": 3.15,
+      "step": 400
+    },
+    {
+      "epoch": 1.23,
+      "learning_rate": 1.6755579199297876e-05,
+      "loss": 3.3109,
+      "step": 405
+    },
+    {
+      "epoch": 1.24,
+      "learning_rate": 1.8951844985992176e-05,
+      "loss": 3.3557,
+      "step": 410
+    },
+    {
+      "epoch": 1.26,
+      "learning_rate": 2.1260579812327114e-05,
+      "loss": 3.2466,
+      "step": 415
+    },
+    {
+      "epoch": 1.27,
+      "learning_rate": 2.3676553651353433e-05,
+      "loss": 3.4927,
+      "step": 420
+    },
+    {
+      "epoch": 1.29,
+      "learning_rate": 2.6194293545266464e-05,
+      "loss": 3.5059,
+      "step": 425
+    },
+    {
+      "epoch": 1.3,
+      "learning_rate": 2.8808096003415798e-05,
+      "loss": 3.3394,
+      "step": 430
+    },
+    {
+      "epoch": 1.32,
+      "learning_rate": 3.151203992254596e-05,
+      "loss": 2.9776,
+      "step": 435
+    },
+    {
+      "epoch": 1.33,
+      "learning_rate": 3.429999999999997e-05,
+      "loss": 3.1045,
+      "step": 440
+    },
+    {
+      "epoch": 1.35,
+      "learning_rate": 3.716566060949963e-05,
+      "loss": 3.3412,
+      "step": 445
+    },
+    {
+      "epoch": 1.36,
+      "learning_rate": 4.0102530108070535e-05,
+      "loss": 3.3821,
+      "step": 450
+    },
+    {
+      "epoch": 1.38,
+      "learning_rate": 4.3103955541701554e-05,
+      "loss": 3.2771,
+      "step": 455
+    },
+    {
+      "epoch": 1.39,
+      "learning_rate": 4.6163137716424864e-05,
+      "loss": 3.2889,
+      "step": 460
+    },
+    {
+      "epoch": 1.41,
+      "learning_rate": 4.927314660067792e-05,
+      "loss": 3.1803,
+      "step": 465
+    },
+    {
+      "epoch": 1.42,
+      "learning_rate": 5.242693702405331e-05,
+      "loss": 3.3752,
+      "step": 470
+    },
+    {
+      "epoch": 1.44,
+      "learning_rate": 5.561736463687583e-05,
+      "loss": 3.4172,
+      "step": 475
+    },
+    {
+      "epoch": 1.45,
+      "learning_rate": 5.883720209445263e-05,
+      "loss": 3.1247,
+      "step": 480
+    },
+    {
+      "epoch": 1.47,
+      "learning_rate": 6.207915542933309e-05,
+      "loss": 3.1979,
+      "step": 485
+    },
+    {
+      "epoch": 1.48,
+      "learning_rate": 6.533588057449125e-05,
+      "loss": 3.2403,
+      "step": 490
+    },
+    {
+      "epoch": 1.5,
+      "learning_rate": 6.859999999999999e-05,
+      "loss": 3.3824,
+      "step": 495
+    },
+    {
+      "epoch": 1.52,
+      "learning_rate": 7.186411942550872e-05,
+      "loss": 3.2565,
+      "step": 500
+    },
+    {
+      "epoch": 1.53,
+      "learning_rate": 7.512084457066689e-05,
+      "loss": 3.1016,
+      "step": 505
+    },
+    {
+      "epoch": 1.55,
+      "learning_rate": 7.836279790554734e-05,
+      "loss": 3.4194,
+      "step": 510
+    },
+    {
+      "epoch": 1.56,
+      "learning_rate": 8.158263536312414e-05,
+      "loss": 3.1877,
+      "step": 515
+    },
+    {
+      "epoch": 1.58,
+      "learning_rate": 8.477306297594667e-05,
+      "loss": 3.4791,
+      "step": 520
+    },
+    {
+      "epoch": 1.59,
+      "learning_rate": 8.792685339932205e-05,
+      "loss": 3.2052,
+      "step": 525
+    },
+    {
+      "epoch": 1.61,
+      "learning_rate": 9.103686228357512e-05,
+      "loss": 3.4702,
+      "step": 530
+    },
+    {
+      "epoch": 1.62,
+      "learning_rate": 9.409604445829843e-05,
+      "loss": 3.152,
+      "step": 535
+    },
+    {
+      "epoch": 1.64,
+      "learning_rate": 9.709746989192944e-05,
+      "loss": 3.2925,
+      "step": 540
+    },
+    {
+      "epoch": 1.65,
+      "learning_rate": 0.00010003433939050033,
+      "loss": 3.2007,
+      "step": 545
+    },
+    {
+      "epoch": 1.67,
+      "learning_rate": 0.00010290000000000001,
+      "loss": 3.444,
+      "step": 550
+    },
+    {
+      "epoch": 1.68,
+      "learning_rate": 0.00010568796007745401,
+      "loss": 3.227,
+      "step": 555
+    },
+    {
+      "epoch": 1.7,
+      "learning_rate": 0.00010839190399658417,
+      "loss": 3.133,
+      "step": 560
+    },
+    {
+      "epoch": 1.71,
+      "learning_rate": 0.00011100570645473351,
+      "loss": 3.1161,
+      "step": 565
+    },
+    {
+      "epoch": 1.73,
+      "learning_rate": 0.00011352344634864656,
+      "loss": 3.2043,
+      "step": 570
+    },
+    {
+      "epoch": 1.74,
+      "learning_rate": 0.00011593942018767285,
+      "loss": 3.265,
+      "step": 575
+    },
+    {
+      "epoch": 1.76,
+      "learning_rate": 0.00011824815501400781,
+      "loss": 3.3016,
+      "step": 580
+    },
+    {
+      "epoch": 1.77,
+      "learning_rate": 0.00012044442080070208,
+      "loss": 3.3944,
+      "step": 585
+    },
+    {
+      "epoch": 1.79,
+      "learning_rate": 0.0001225232422993552,
+      "loss": 3.4325,
+      "step": 590
+    },
+    {
+      "epoch": 1.8,
+      "learning_rate": 0.00012447991031065301,
+      "loss": 3.3161,
+      "step": 595
+    },
+    {
+      "epoch": 1.82,
+      "learning_rate": 0.000126309992352219,
+      "loss": 3.2818,
+      "step": 600
+    },
+    {
+      "epoch": 1.83,
+      "learning_rate": 0.00012800934269961248,
+      "loss": 3.2362,
+      "step": 605
+    },
+    {
+      "epoch": 1.85,
+      "learning_rate": 0.00012957411177772773,
+      "loss": 3.1107,
+      "step": 610
+    },
+    {
+      "epoch": 1.86,
+      "learning_rate": 0.00013100075488131993,
+      "loss": 3.2821,
+      "step": 615
+    },
+    {
+      "epoch": 1.88,
+      "learning_rate": 0.00013228604020490257,
+      "loss": 3.3709,
+      "step": 620
+    },
+    {
+      "epoch": 1.89,
+      "learning_rate": 0.00013342705616382626,
+      "loss": 3.3293,
+      "step": 625
+    },
+    {
+      "epoch": 1.91,
+      "learning_rate": 0.00013442121798995453,
+      "loss": 3.5064,
+      "step": 630
+    },
+    {
+      "epoch": 1.92,
+      "learning_rate": 0.00013526627358699495,
+      "loss": 3.2024,
+      "step": 635
+    },
+    {
+      "epoch": 1.94,
+      "learning_rate": 0.00013596030863222166,
+      "loss": 3.378,
+      "step": 640
+    },
+    {
+      "epoch": 1.95,
+      "learning_rate": 0.000136501750913032,
+      "loss": 3.215,
+      "step": 645
+    },
+    {
+      "epoch": 1.97,
+      "learning_rate": 0.0001368893738885136,
+      "loss": 3.2645,
+      "step": 650
+    },
+    {
+      "epoch": 1.98,
+      "learning_rate": 0.00013712229946795436,
+      "loss": 3.2699,
+      "step": 655
+    },
+    {
+      "epoch": 2.0,
+      "learning_rate": 0.0001372,
+      "loss": 2.9558,
+      "step": 660
+    },
+    {
+      "epoch": 2.0,
+      "eval_loss": 3.1339056491851807,
+      "eval_runtime": 9.8661,
+      "eval_samples_per_second": 45.611,
+      "eval_steps_per_second": 5.777,
+      "step": 660
     }
   ],
+  "max_steps": 660,
+  "num_train_epochs": 2,
+  "total_flos": 688765796352000.0,
   "trial_name": null,
   "trial_params": null
 }
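
The learning rates logged for steps 390 to 660 rise rather than decay, and they appear to follow a cosine curve with a peak of 1.372e-4 and a half-period of 330 steps (660 global steps over 2 epochs). The sketch below checks that fit against a few logged values. The formula is inferred from the numbers in this diff, not taken from the training configuration, and `fitted_lr` is a hypothetical name.

```python
import math

PEAK_LR = 1.372e-4      # learning rate logged at step 660
STEPS_PER_EPOCH = 330   # 660 global steps / 2 epochs


def fitted_lr(step):
    """Cosine curve fitted to the logged second-epoch learning rates."""
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * step / STEPS_PER_EPOCH))


# A few (step, learning_rate) pairs copied from the added log entries.
logged = {
    390: 1.0890007647780969e-05,
    440: 3.429999999999997e-05,
    495: 6.859999999999999e-05,
    550: 0.00010290000000000001,
    660: 0.0001372,
}

for step, lr in logged.items():
    print(step, lr, fitted_lr(step))
```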

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:781ffb618640ff70dd3baac13df255d8dc426f031550cac003228fcbafce41fc
-size 2671

 version https://git-lfs.github.com/spec/v1
+oid sha256:b30fd30de2e4d780d9dc1c08eade8a0e7ffd2b47e4ccbd978c8cce0f74354c17
+size 3311