ArthurConmy

AI & ML interests

None yet

Recent Activity

new activity 3 days ago

Putnam-AXIOM/putnam-axiom-dataset: ~10% of the problem statements may have flaws (according to Claude Sonnet 3.6 and a tiny bit of manual checking)

authored a paper 5 months ago

Successor Heads: Recurring, Interpretable Attention Heads In The Wild

authored a paper 5 months ago

Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small

Organizations

None yet

ArthurConmy's activity

New activity in Putnam-AXIOM/putnam-axiom-dataset 3 days ago

~10% of the problem statements may have flaws (according to Claude Sonnet 3.6 and a tiny bit of manual checking)

#2 opened 3 days ago by ArthurConmy

authored 3 papers 5 months ago

Successor Heads: Recurring, Interpretable Attention Heads In The Wild

Paper • 2312.09230 • Published Dec 14, 2023

Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small

Paper • 2211.00593 • Published Nov 1, 2022 • 2

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

Paper • 2408.05147 • Published Aug 9, 2024 • 39

New activity in google/gemma-scope-2b-pt-res 6 months ago

Update README.md

#4 opened 6 months ago by ArthurConmy

New activity in google/gemma-scope-9b-pt-res 6 months ago

Update README.md

#2 opened 6 months ago by ArthurConmy

authored a paper 6 months ago

Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders

Paper • 2407.14435 • Published Jul 19, 2024 • 7

updated a model 6 months ago

ArthurConmy/l2-13b-renamed-just-to-test

Updated Jul 14, 2024

authored a paper 11 months ago

Stealing Part of a Production Language Model

Paper • 2403.06634 • Published Mar 11, 2024 • 91

liked a model 12 months ago

ckkissane/attn-saes-gpt2-small-all-layers

Updated Jan 24, 2024 • 3

New activity in NeelNanda/gpt-neox-tokenizer-digits about 1 year ago

Remove added tokens to make compatible with tokenizers>=0.14

#1 opened over 1 year ago by ArthurConmy

liked a model about 1 year ago

NeelNanda/sparse_autoencoder

Updated Oct 28, 2023 • 3

updated 4 models about 1 year ago

New activity in ArthurConmy/alternative-neel-tokenizer about 1 year ago

Remove spaces

#1 opened about 1 year ago by ArthurConmy

updated 3 models almost 2 years ago

ArthurConmy/ppo_hh_exponential_2

Text Generation • Updated Mar 23, 2023 • 10

ArthurConmy/ppo_hh_exponential

Updated Mar 23, 2023

ArthurConmy/redwood_attn_2l

Updated Mar 20, 2023 • 1.76k