ai-msgbot GPT2-L + daily dialogues
NOTE: this model card is a WIP
GPT2-L (774M parameters) was fine-tuned on the Wizard of Wikipedia dataset for 40k steps with 34 of 36 layers frozen, using aitextgen. It was then trained on the Daily Dialogues dataset for an additional 40k steps, this time with 35 of 36 layers frozen.
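For reference, a two-stage fine-tune like this can be outlined with aitextgen's `train()` method, which exposes `freeze_layers` / `num_layers_freeze` for freezing transformer blocks. The sketch below is not the exact training script: the dataset file names, batch size, and learning rate are assumptions, only the step counts and frozen-layer counts come from this card.

```python
from aitextgen import aitextgen

# Load GPT2-L (774M parameters) from Hugging Face.
ai = aitextgen(model="gpt2-large", to_gpu=True)

# Stage 1: Wizard of Wikipedia, 40k steps, 34 of 36 layers frozen.
# "wizard_of_wikipedia.txt" is a placeholder for the prepared training file.
ai.train(
    "wizard_of_wikipedia.txt",
    num_steps=40_000,
    freeze_layers=True,
    num_layers_freeze=34,
    batch_size=1,        # assumed; not stated in the card
    learning_rate=1e-4,  # assumed; not stated in the card
)

# Stage 2: Daily Dialogues, another 40k steps, 35 of 36 layers frozen.
ai.train(
    "daily_dialogues.txt",
    num_steps=40_000,
    freeze_layers=True,
    num_layers_freeze=35,
    batch_size=1,        # assumed
    learning_rate=1e-4,  # assumed
)
```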
Designed for use with ai-msgbot.
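Outside of ai-msgbot, the checkpoint can also be loaded directly with aitextgen for quick generation tests. A minimal sketch, assuming the weights sit in a local `trained_model/` folder (aitextgen's default output directory); ai-msgbot adds its own dialogue prompt formatting and response parsing on top of this:

```python
from aitextgen import aitextgen

# Load the fine-tuned checkpoint from a local folder.
ai = aitextgen(model_folder="trained_model", to_gpu=True)

# Generate a reply to a plain-text prompt.
ai.generate(
    prompt="How was your day?",
    max_length=64,
    temperature=0.7,
)
```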