anrilombard committed on
Commit
af48d34
1 Parent(s): d3721c7

Upload README.md with huggingface_hub

  1. README.md +134 -0
---
tags:
- safe
- mamba
- attention
- hybrid
- molecular-generation
- smiles
- generated_from_trainer
datasets:
- katielink/moses
model-index:
- name: HYBRID_20M
  results: []
---

# HYBRID_20M

HYBRID_20M is a molecular generation model that combines **Mamba** and **Attention** layers to draw on the strengths of both architectures. **The training code is available at [https://github.com/Anri-Lombard/Mamba-SAFE](https://github.com/Anri-Lombard/Mamba-SAFE).** The model was trained from scratch on the [MOSES](https://huggingface.co/datasets/katielink/moses) dataset, which was converted from SMILES to the SAFE (Sequential Attachment-based Fragment Embedding) format to improve molecular representation for machine learning applications. HYBRID_20M performs comparably to both transformer-based models such as [SAFE_20M](https://huggingface.co/anrilombard/safe-20m) and Mamba-based models such as [SSM_20M](https://huggingface.co/anrilombard/ssm-20m).

## Evaluation Results

HYBRID_20M performs on par with both transformer-based and Mamba-based models on molecular generation tasks. The molecular structures it generates show high validity and diversity, which suggests that combining Mamba's sequence modeling with Attention mechanisms is effective.
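
As a rough illustration of what validity and diversity measure, the sketch below computes both on toy data. It is not the evaluation code behind this model: `toy_is_valid`, the sample strings, and the fingerprint bit sets are hypothetical placeholders, and a real evaluation would typically parse molecules and compute fingerprints with a cheminformatics toolkit. Diversity is taken here as one minus the mean pairwise Tanimoto similarity over distinct pairs, which is one common definition.

```python
from itertools import combinations


def validity(samples, is_valid):
    """Fraction of generated strings accepted by the `is_valid` callable."""
    if not samples:
        return 0.0
    return sum(1 for s in samples if is_valid(s)) / len(samples)


def tanimoto(a, b):
    """Tanimoto similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def internal_diversity(fingerprints):
    """One minus the mean pairwise Tanimoto similarity over distinct pairs."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return 1.0 - sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


# Toy inputs (hypothetical): three parseable strings and one malformed one.
generated = ["CCO", "CCO", "c1ccccc1", "not-a-molecule"]


def toy_is_valid(s):
    # Placeholder validator; a real check would parse the string as a molecule.
    return s != "not-a-molecule"


# Hypothetical fingerprint bit sets standing in for real molecular fingerprints.
toy_fingerprints = [{1, 2, 3}, {2, 3, 4}, {7, 8}]

validity_score = validity(generated, toy_is_valid)
diversity_score = internal_diversity(toy_fingerprints)
print(f"validity={validity_score:.3f} diversity={diversity_score:.3f}")
```

Keeping the validator and fingerprints as plain inputs keeps the sketch free of external dependencies; swapping in real parsing and fingerprinting would not change the metric functions themselves.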

## Model Description

HYBRID_20M uses a hybrid architecture that combines **Mamba** with **Attention** layers. This lets the model benefit from Mamba's efficient sequence modeling and from the contextual understanding that Attention mechanisms provide.

### Mamba Framework

The Mamba framework, utilized in HYBRID_20M, was introduced in the following publication:

```bibtex
@article{gu2023mamba,
  title={Mamba: Linear-time sequence modeling with selective state spaces},
  author={Gu, Albert and Dao, Tri},
  journal={arXiv preprint arXiv:2312.00752},
  year={2023}
}
```

We acknowledge the authors for their contributions to sequence modeling.

### Attention Mechanisms

Attention layers let the model focus on the relevant parts of the input sequence, which helps it capture long-range dependencies and contextual information. This capability is essential for accurately generating complex molecular structures.

### SAFE Framework

The SAFE framework, also employed in HYBRID_20M, was introduced in the following publication:

```bibtex
@article{noutahi2024gotta,
  title={Gotta be SAFE: a new framework for molecular design},
  author={Noutahi, Emmanuel and Gabellini, Cristian and Craig, Michael and Lim, Jonathan SC and Tossou, Prudencio},
  journal={Digital Discovery},
  volume={3},
  number={4},
  pages={796--804},
  year={2024},
  publisher={Royal Society of Chemistry}
}
```

We acknowledge the authors for their contributions to molecular design.

## Intended Uses & Limitations

### Intended Uses

HYBRID_20M is intended for:

- **Generating Molecular Structures:** Creating novel molecules with desired properties.
- **Exploring Chemical Space:** Investigating the vast array of possible chemical compounds for research and development.
- **Assisting in Material Design:** Facilitating the creation of new materials with specific functionalities.

### Limitations

- **Validation Required:** Outputs should be validated by domain experts before practical application.
- **Synthetic Feasibility:** Generated molecules may not always be synthetically feasible.
- **Dataset Scope:** The model's knowledge is limited to the chemical space represented in the MOSES dataset.

## Training and Evaluation Data

The model was trained on the [MOSES (MOlecular SEtS)](https://huggingface.co/datasets/katielink/moses) dataset, a benchmark dataset for molecular generation. The dataset was converted from SMILES to the SAFE format before training to improve molecular representation for machine learning tasks.

## Training Procedure

### Training Hyperparameters

The following hyperparameters were used during training:

- **Learning Rate:** 0.0005
- **Training Batch Size:** 32
- **Evaluation Batch Size:** 32
- **Seed:** 42
- **Gradient Accumulation Steps:** 2
- **Total Training Batch Size:** 64
- **Optimizer:** Adam (betas=(0.9, 0.999), epsilon=1e-08)
- **Learning Rate Scheduler:** Linear with 20,000 warmup steps
- **Number of Epochs:** 10
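
To make the batch-size arithmetic and the scheduler entry concrete, here is a minimal sketch. It assumes the "Linear" scheduler follows the common convention of a linear warmup from zero followed by a linear decay to zero; the card does not state this. The function name `linear_schedule_lr`, the value of `total_steps`, and the single-device assumption are illustrative and not taken from the training code.

```python
def linear_schedule_lr(step, total_steps, base_lr=5e-4, warmup_steps=20_000):
    """Learning rate at `step`: linear warmup to `base_lr`, then linear decay to 0.

    Assumed convention; the decay-to-zero behavior is not stated in the card.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Total training batch size = per-device batch * accumulation steps * devices.
per_device_batch = 32
grad_accum_steps = 2
num_devices = 1  # assumption: inferred from 32 * 2 = 64, not stated in the card
effective_batch = per_device_batch * grad_accum_steps * num_devices

# Hypothetical run length; the real value depends on dataset size and epochs.
total_steps = 100_000
for step in (0, 10_000, 20_000, 60_000, 100_000):
    print(step, linear_schedule_lr(step, total_steps))
```

Under these assumptions the learning rate reaches its peak of 0.0005 at step 20,000, and the listed total batch size of 64 is consistent with a single device.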

### Framework Versions

- **Mamba:** [Specify version]
- **PyTorch:** [Specify version]
- **Datasets:** 2.20.0
- **Tokenizers:** 0.19.1

## Acknowledgements

We acknowledge the authors of the [Mamba](https://github.com/state-spaces/mamba) and SAFE frameworks for their contributions to sequence modeling and molecular design.