Sparse Autoencoders for ESM-2 (650M)

Interpret protein language model representations using sparse autoencoders (SAEs) trained on ESM-2 650M layers. These autoencoders decompose complex neural representations into interpretable features, enabling a deeper understanding of how protein language models process sequence information.

Model Details

  • Base Model: ESM-2 650M (33 layers)
  • Architecture: Sparse Autoencoder
  • Input Dimension: 1,280
  • Feature Dimension: 10,240
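
Concretely, each SAE maps a 1,280-dimensional per-residue ESM-2 embedding to a 10,240-dimensional sparse feature vector and reconstructs the embedding from it. Below is a minimal PyTorch sketch of this shape, assuming a standard single-layer ReLU SAE; the actual InterPLM implementation may differ in details such as bias handling, weight tying, and training objective:

import torch
import torch.nn as nn

class SparseAutoencoderSketch(nn.Module):
    """Illustrative ReLU SAE with the dimensions above (hypothetical, not the InterPLM class)."""

    def __init__(self, d_input: int = 1280, d_features: int = 10240):
        super().__init__()
        self.encoder = nn.Linear(d_input, d_features)
        self.decoder = nn.Linear(d_features, d_input)

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        # Sparse feature activations: most entries are zero after the ReLU
        return torch.relu(self.encoder(x))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Reconstruct the original embedding from the sparse features
        return self.decoder(self.encode(x))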

Available Models

We provide SAE models trained on six different layers of ESM-2 650M:

Model name               ESM-2 model           ESM-2 layer
InterPLM-esm2-650m-l1    esm2_t33_650M_UR50D   1
InterPLM-esm2-650m-l9    esm2_t33_650M_UR50D   9
InterPLM-esm2-650m-l18   esm2_t33_650M_UR50D   18
InterPLM-esm2-650m-l24   esm2_t33_650M_UR50D   24
InterPLM-esm2-650m-l30   esm2_t33_650M_UR50D   30
InterPLM-esm2-650m-l33   esm2_t33_650M_UR50D   33

All models share the same architecture and dictionary size (10,240). You can find SAEs trained on ESM-2 8M here. The 650M SAEs capture more known biological concepts than the 8M SAEs, but require additional compute for both ESM embedding and SAE feature extraction.
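
If you want to compare features across depths, each layer's SAE can be loaded with the same helper used in the Usage section below. A short sketch:

from interplm.sae.inference import load_sae_from_hf

# Load the SAE for each released ESM-2 650M layer, keyed by layer index
saes = {
    layer: load_sae_from_hf(plm_model="esm2-650m", plm_layer=layer)
    for layer in (1, 9, 18, 24, 30, 33)
}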

Usage

from interplm.sae.inference import load_sae_from_hf
from interplm.esm.embed import embed_single_sequence

# Get per-residue ESM embeddings for a protein sequence
embeddings = embed_single_sequence(
    sequence="MRWQEMGYIFYPRKLR",
    model_name="esm2_t33_650M_UR50D",
    layer=18,  # Choose an ESM layer with a trained SAE (1, 9, 18, 24, 30, or 33)
)

# Load the matching SAE and extract sparse features
sae = load_sae_from_hf(plm_model="esm2-650m", plm_layer=18)
features = sae.encode(embeddings)
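
Once you have features, a quick way to inspect them is to look at per-residue sparsity and the most strongly activating features. The sketch below assumes features is a 2-D torch tensor with one row per residue and one column per SAE feature; that shape and tensor type are assumptions, not guaranteed by the API:

import torch

# Fraction of features active (non-zero) at each residue position
active_per_residue = (features > 0).float().mean(dim=1)

# Top 10 features by their maximum activation anywhere in the sequence
max_per_feature = features.max(dim=0).values
top_vals, top_ids = torch.topk(max_per_feature, k=10)
print(list(zip(top_ids.tolist(), top_vals.tolist())))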

For details on training and analyzing SAEs on PLMs, see the GitHub README.

Model Normalization

The SAEs we've trained have arbitrary scales between features, since encoder/decoder weights can be linearly rescaled without changing reconstructions. To make features comparable, we normalize each feature to activate between 0 and 1, based on its maximum activation across Swiss-Prot (our primary analysis dataset). By default, use our pre-normalized SAEs (ae_normalized.pt). Because this scaling may not be accurate for features that rarely or never activate on Swiss-Prot proteins, you can instead normalize against your own data by using ae_unnormalized.pt with this code.
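
For reference, the normalization itself is just a per-feature rescaling by the maximum activation observed on a reference dataset. A minimal sketch of that idea, assuming all_features is an (n_residues, n_features) torch tensor of activations from the unnormalized SAE (prefer the repository's linked code for the real procedure):

import torch

# Maximum activation of each feature across a reference set of residues
max_per_feature = all_features.max(dim=0).values

# Avoid dividing by zero for features that never activated
scale = torch.clamp(max_per_feature, min=1e-8)

# Each feature now (approximately) activates in [0, 1] on this dataset
normalized_features = all_features / scale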
