---
tags:
  - safe
  - mamba
  - attention
  - hybrid
  - molecular-generation
  - smiles
  - generated_from_trainer
datasets:
  - katielink/moses
model-index:
  - name: HYBRID_20M
    results: []
---

# HYBRID_20M

HYBRID_20M is a model for molecular generation that combines Mamba and Attention layers to draw on the strengths of each architecture. The training code is available at https://github.com/Anri-Lombard/Mamba-SAFE. The model was trained from scratch on the MOSES dataset, converted from SMILES to the SAFE (Sequential Attachment-based Fragment Embedding) format to improve molecular representation for machine learning. HYBRID_20M performs comparably to both transformer-based models such as SAFE_20M and Mamba-based models such as SSM_20M.

## Evaluation Results

HYBRID_20M performs on par with both transformer-based and Mamba-based models on molecular generation tasks, producing molecules with high validity and diversity. This suggests that combining Mamba's sequence modeling with Attention mechanisms is effective for this task.
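Validity and diversity here refer to the standard molecular-generation metrics popularized by the MOSES benchmark. A minimal sketch of how such metrics can be computed over a batch of generated SMILES strings — the `is_valid` argument stands in for a real validity check (in practice, RDKit parsing), and the function name is illustrative, not part of this repository:

```python
def generation_metrics(generated, training_set, is_valid):
    """Compute validity, uniqueness, and novelty for generated molecules.

    `is_valid` is a placeholder for a real validity check
    (e.g. successful RDKit SMILES parsing).
    """
    valid = [s for s in generated if is_valid(s)]
    validity = len(valid) / len(generated)

    unique = set(valid)
    uniqueness = len(unique) / len(valid) if valid else 0.0

    # Novelty: fraction of unique valid molecules not seen in training.
    novelty = len(unique - set(training_set)) / len(unique) if unique else 0.0

    return {"validity": validity, "uniqueness": uniqueness, "novelty": novelty}
```

For example, with four samples of which one fails the check and one is a duplicate, validity is 0.75 and uniqueness is 2/3.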

## Model Description

HYBRID_20M employs a hybrid architecture that integrates the Mamba framework with Attention layers. This integration allows the model to benefit from Mamba's efficient sequence modeling capabilities and the contextual understanding provided by Attention mechanisms.
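The exact interleaving of Mamba and Attention blocks is not specified in this card. As an illustration only, hybrid state-space models commonly substitute an attention block every k-th layer; the function below sketches such a layer schedule (the ratio and names are assumptions, not the model's actual configuration):

```python
def hybrid_layer_plan(n_layers: int, attn_every: int = 4) -> list[str]:
    """Illustrative hybrid layer schedule: mostly Mamba blocks,
    with an attention block substituted every `attn_every` layers.
    The 1-in-4 ratio is an assumption for illustration."""
    return [
        "attention" if (i + 1) % attn_every == 0 else "mamba"
        for i in range(n_layers)
    ]
```

For an 8-layer stack this yields three Mamba blocks followed by one attention block, repeated twice; real hybrid models tune this ratio empirically.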

### Mamba Framework

The Mamba framework, utilized in HYBRID_20M, was introduced in the following publication:

```bibtex
@article{gu2023mamba,
  title={Mamba: Linear-time sequence modeling with selective state spaces},
  author={Gu, Albert and Dao, Tri},
  journal={arXiv preprint arXiv:2312.00752},
  year={2023}
}
```

We acknowledge the authors for their contributions to sequence modeling.

### Attention Mechanisms

Attention layers enhance the model's ability to focus on relevant parts of the input sequence, facilitating the capture of long-range dependencies and contextual information. This capability is essential for accurately generating complex molecular structures.
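The mechanism referred to here is standard scaled dot-product attention. A dependency-free sketch for small matrices, purely illustrative of the math rather than the model's implementation:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V,
    with Q, K, V given as lists of row vectors."""
    d = len(Q[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Weighted sum of value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Each output row is a convex combination of the value rows, weighted by how closely the query matches each key — which is what lets the model attend to distant but relevant tokens in a SAFE string.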

### SAFE Framework

The SAFE framework, also employed in HYBRID_20M, was introduced in the following publication:

```bibtex
@article{noutahi2024gotta,
  title={Gotta be SAFE: a new framework for molecular design},
  author={Noutahi, Emmanuel and Gabellini, Cristian and Craig, Michael and Lim, Jonathan SC and Tossou, Prudencio},
  journal={Digital Discovery},
  volume={3},
  number={4},
  pages={796--804},
  year={2024},
  publisher={Royal Society of Chemistry}
}
```

We acknowledge the authors for their contributions to molecular design.

## Intended Uses & Limitations

### Intended Uses

HYBRID_20M is intended for:

- **Generating molecular structures:** creating novel molecules with desired properties.
- **Exploring chemical space:** investigating the vast array of possible chemical compounds for research and development.
- **Assisting in material design:** facilitating the creation of new materials with specific functionalities.

### Limitations

- **Validation required:** outputs should be validated by domain experts before practical application.
- **Synthetic feasibility:** generated molecules may not always be synthetically feasible.
- **Dataset scope:** the model's knowledge is limited to the chemical space represented in the MOSES dataset.

## Training and Evaluation Data

The model was trained on the MOSES (MOlecular SEtS) dataset, a benchmark dataset for molecular generation. The dataset was converted from SMILES to the SAFE format to enhance molecular representation for machine learning tasks.

## Training Procedure

### Training Hyperparameters

The following hyperparameters were used during training:

- Learning rate: 0.0005
- Training batch size: 32
- Evaluation batch size: 32
- Seed: 42
- Gradient accumulation steps: 2
- Total training batch size: 64
- Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
- Learning rate scheduler: linear with 20,000 warmup steps
- Number of epochs: 10
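The hyperparameters above can be collected into a plain config dictionary; note that the total training batch size is the per-device batch size times the gradient-accumulation steps. This is a restatement of the listed values for reference, not the actual training script:

```python
# Hyperparameters as listed in this card (key names are illustrative).
train_config = {
    "learning_rate": 5e-4,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-8,
    "lr_scheduler": "linear",
    "warmup_steps": 20_000,
    "num_epochs": 10,
}

# Effective (total) batch size per optimizer step:
effective_batch = (
    train_config["per_device_train_batch_size"]
    * train_config["gradient_accumulation_steps"]
)
```

With a batch size of 32 and 2 accumulation steps, each optimizer update sees 64 samples, matching the total training batch size listed above.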

## Framework Versions

- Mamba: [Specify version]
- PyTorch: [Specify version]
- Datasets: 2.20.0
- Tokenizers: 0.19.1

## Acknowledgements

We acknowledge the authors of the Mamba and SAFE frameworks for their contributions to sequence modeling and molecular design.
