Successor Heads: Recurring, Interpretable Attention Heads In The Wild • Paper • 2312.09230 • Published Dec 14, 2023
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small • Paper • 2211.00593 • Published Nov 1, 2022
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 • Paper • 2408.05147 • Published Aug 9, 2024
Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders • Paper • 2407.14435 • Published Jul 19, 2024