---
license: mit
tags:
- METL
- biology
- protein
---

# METL

Mutational Effect Transfer Learning (METL) is a framework for pretraining and finetuning biophysics-informed protein language models.

## Model Details

This repository contains a wrapper that simplifies the use of METL models; usage examples are provided below. The models themselves are hosted on [Zenodo](https://zenodo.org/doi/10.5281/zenodo.11051644) and are downloaded automatically by this wrapper when used.

### Model Description

METL is described in further detail in the [paper](https://doi.org/10.1101/2024.03.15.585128). The GitHub [repo](https://github.com/gitter-lab/metl) contains more documentation and includes scripts for training and predicting with METL. Google Colab notebooks for finetuning and predicting with the publicly available METL models are also available [here](https://github.com/gitter-lab/metl/tree/main/notebooks).

### Model Sources

- **Repository:** [METL repo](https://github.com/gitter-lab/metl)
- **Paper:** [METL preprint](https://doi.org/10.1101/2024.03.15.585128)
- **Demo:** [Hugging Face Spaces demo](https://huggingface.co./spaces/gitter-lab/METL_demo)

## How to Get Started with the Model

Use the code below to get started with the model. Running METL requires the following packages:

```
transformers==4.42.4
numpy>=1.23.2
networkx>=2.6.3
scipy>=1.9.1
biopandas>=0.2.7
```

To run the example, a PDB file for the GB1 protein structure must first be downloaded. It is provided [here](https://github.com/gitter-lab/metl-pretrained/blob/main/pdbs/2qmt_p.pdb) and in raw format [here](https://raw.githubusercontent.com/gitter-lab/metl-pretrained/main/pdbs/2qmt_p.pdb); a short download sketch also appears at the end of this card.

After installing the packages and downloading the PDB file, you can run METL with the following code example (assuming the downloaded file is in the same directory as the script):

```python
from transformers import AutoModel
import torch

# Load the METL wrapper from the Hugging Face Hub
metl = AutoModel.from_pretrained('gitter-lab/METL', trust_remote_code=True)

model_id = "metl-l-2m-3d-gb1"
wt = "MQYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"
variants = '["T17P,T54F", "V28L,F51A"]'
pdb_path = './2qmt_p.pdb'

# Download the specified METL model and switch to inference mode
metl.load_from_ident(model_id)
metl.eval()

# Encode the variants against the wild-type GB1 sequence
encoded_variants = metl.encoder.encode_variants(wt, variants)

# Predict; this 3D model also requires the GB1 structure (PDB file)
with torch.no_grad():
    predictions = metl(torch.tensor(encoded_variants), pdb_fn=pdb_path)
```

## Citation

Biophysics-based protein language models for protein engineering
Sam Gelman, Bryce Johnson, Chase Freschlin, Sameer D’Costa, Anthony Gitter, Philip A. Romero
bioRxiv 2024.03.15.585128; doi: https://doi.org/10.1101/2024.03.15.585128

## Model Card Contact

For questions and comments about METL, the best way to reach out is to open a GitHub issue in the [METL repository](https://github.com/gitter-lab/metl/issues).
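
As referenced in the quick-start above, the sketch below is one way to fetch the GB1 PDB file programmatically. It uses only the Python standard library; the `urllib` approach is illustrative rather than part of the METL wrapper, and the destination filename matches the `pdb_path` used in the code example.

```python
# Illustrative sketch: download the GB1 structure used in the quick-start.
# Any download method works; urllib is used here only for convenience.
import urllib.request

pdb_url = "https://raw.githubusercontent.com/gitter-lab/metl-pretrained/main/pdbs/2qmt_p.pdb"
# Saves the file next to your script as ./2qmt_p.pdb
urllib.request.urlretrieve(pdb_url, "2qmt_p.pdb")
```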
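
As a final usage note, `predictions` in the quick-start contains one row per input variant; the number and meaning of the output columns depend on which METL model is loaded (see the METL repository for details). A minimal sketch, continuing from the quick-start example:

```python
# Minimal sketch, continuing from the quick-start: pair each variant
# with its predicted scores. Assumes one output row per input variant.
for variant, scores in zip(["T17P,T54F", "V28L,F51A"], predictions):
    print(variant, scores.tolist())
```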