tim-lawson/mlsae-pythia-70m-deduped-x1-k32
Single SAEs trained on the residual stream activation vectors from every transformer layer simultaneously: https://arxiv.org/abs/2409.04185
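
As a rough illustration of the idea in the description, the sketch below pools activation vectors from every layer into one dataset and passes them through a single shared autoencoder. Everything in it is an assumption made for illustration: the toy sizes, the random stand-in activations, the top-k sparsity rule, and the helper name `topk_sae_forward` are hypothetical and are not taken from the released models or their code.

```python
import random

random.seed(0)

# Toy sizes, hypothetical (not the real model's dimensions).
N_LAYERS, N_TOKENS, D_MODEL = 3, 4, 8
N_LATENTS, K = 8, 2

# Stand-in for residual stream activations: acts[layer][token] is a D_MODEL vector.
acts = [
    [[random.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(N_TOKENS)]
    for _ in range(N_LAYERS)
]

# Pool the vectors from every layer into one list, so a single SAE sees all layers.
pooled = [vec for layer in acts for vec in layer]

# One shared set of encoder/decoder weights (randomly initialised, untrained).
W_enc = [[random.gauss(0, 0.1) for _ in range(D_MODEL)] for _ in range(N_LATENTS)]
W_dec = [[random.gauss(0, 0.1) for _ in range(D_MODEL)] for _ in range(N_LATENTS)]


def topk_sae_forward(x, w_enc, w_dec, k):
    """Encode x, keep only the k largest pre-activations (assumed sparsity rule), decode."""
    pre = [sum(w * xi for w, xi in zip(row, x)) for row in w_enc]
    keep = sorted(range(len(pre)), key=lambda i: pre[i], reverse=True)[:k]
    latents = [0.0] * len(pre)
    for i in keep:
        latents[i] = max(pre[i], 0.0)  # ReLU on the kept entries
    recon = [
        sum(latents[j] * w_dec[j][d] for j in range(len(latents)))
        for d in range(len(x))
    ]
    return latents, recon


# The same SAE is applied to every pooled vector, regardless of source layer.
outputs = [topk_sae_forward(x, W_enc, W_dec, K) for x in pooled]
```

A training loop would minimise the reconstruction error between each pooled vector and its `recon`; it is omitted here to keep the sketch short.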