This repo contains several sweeps of SAEs trained on ChessGPT, including the sweeps used for the paper "Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models".
The Chess SAEs from the paper are:
chess-trained_model-layer_5-2024-05-23.zip and chess-random_model-layer_5-standard.zip
The SAEs are stored in zip files with a particular file structure. For download and usage directions, refer to https://github.com/adamkarvonen/SAE_BoardGameEval |
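
Since the internal layout of the archives is not described here, a quick way to see it is to list an archive's contents before extracting anything. The following is a minimal sketch using only the Python standard library; it assumes the zip file has already been downloaded into the current directory, and the helper name `list_zip_members` is hypothetical rather than part of SAE_BoardGameEval.

```python
import zipfile
from pathlib import Path


def list_zip_members(zip_path):
    """Return the member names stored in a zip archive, without extracting it."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


if __name__ == "__main__":
    # Assumption: the archive was already downloaded into the working directory.
    local_zip = Path("chess-trained_model-layer_5-2024-05-23.zip")
    if local_zip.exists():
        for name in list_zip_members(local_zip):
            print(name)
    else:
        print(f"{local_zip} not found; download it first.")
```

This only inspects the archive. For actually loading and evaluating the SAEs, follow the directions in the linked repository.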