1. School of Computer Science and Technology, Shandong University, Qingdao 266230, China
2. Taishan College, Shandong University, Qingdao 266230, China
3. State Key Laboratory of Microbial Technology, Shandong University, Qingdao 266230, China
Recent advancements in AI-based synthesis of small molecules have led to the creation of extensive databases, housing billions of small molecules. Given this vast scale, traditional quantum chemistry (QC) methods become inefficient for determining the chemical and physical properties of such an extensive array of molecules. To address this challenge, we present MetaGIN, a lightweight deep learning framework designed for efficient and accurate molecular property prediction.
While traditional GNN models with 1-hop edges (i.e., covalent bonds) are sufficient for abstract graph representation, they are inadequate for capturing 3D features. Our MetaGIN model shows that including 2-hop and 3-hop edges (representing bond angles and torsion angles, respectively) is crucial to fully comprehend the intricacies of 3D molecules. Moreover, MetaGIN is a streamlined model with fewer than 10 million parameters, making it ideal for fine-tuning on a single GPU. It also adopts the widely acknowledged MetaFormer framework, which has consistently shown high accuracy in many computer vision tasks.
In our experiments, MetaGIN achieved a mean absolute error (MAE) of 0.0851 with just 8.87M parameters on the PCQM4Mv2 dataset, outperforming leading techniques across several datasets in the MoleculeNet benchmark. These results demonstrate MetaGIN’s potential to significantly accelerate drug discovery processes by enabling rapid and accurate prediction of molecular properties for large-scale databases.
Just Accepted Date: 11 September 2024
Issue Date: 28 October 2024
Cite this article:
Xuan ZHANG, Cheng CHEN, Xiaoting WANG, et al. MetaGIN: a lightweight framework for molecular property prediction[J]. Front. Comput. Sci., 2025, 19(5): 195912.
Fig.1 Performance evaluation on the PCQM4Mv2 dataset (without 3D structures). MetaGIN achieves a nearly optimal Mean Absolute Error (MAE) with a compact model that contains fewer than 10 million parameters
Fig.2 Accurately representing a 3D structure requires at least 3-hop features, which cover bond length, bond angle, and torsion angle. Traditional graph convolutional networks (GCNs) that use only 1-hop features therefore fall short of representing 3D structures, whereas MetaGIN efficiently exploits 3-hop features, meeting this minimum requirement. (a) Minimal bond features needed to represent a 3D structure; (b) traditional graph convolution; (c) 3-hop convolution
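The three geometric quantities named in Fig.2 can be computed directly from atomic coordinates: bond length from two bonded atoms (1-hop), bond angle from three atoms along a path (2-hop), and torsion angle from four (3-hop). The sketch below is an illustrative NumPy implementation of these standard formulas, not code from the paper; the function names are our own.

```python
import numpy as np

def bond_length(p1, p2):
    # Euclidean distance between two bonded atoms (1-hop feature)
    return np.linalg.norm(p2 - p1)

def bond_angle(p1, p2, p3):
    # Angle at p2 formed by the path p1-p2-p3 (2-hop feature), in degrees
    v1, v2 = p1 - p2, p3 - p2
    cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

def torsion_angle(p1, p2, p3, p4):
    # Signed dihedral about the p2-p3 bond (3-hop feature), in degrees
    b1, b2, b3 = p2 - p1, p3 - p2, p4 - p3
    n1, n2 = np.cross(b1, b2), np.cross(b2, b3)
    m1 = np.cross(n1, b2 / np.linalg.norm(b2))
    x, y = np.dot(n1, n2), np.dot(m1, n2)
    return np.degrees(np.arctan2(y, x))

# Example: a planar trans (anti) arrangement of four atoms
p = [np.array(v, float) for v in ([0, 1, 0], [0, 0, 0], [1, 0, 0], [1, -1, 0])]
trans = torsion_angle(*p)  # 180 degrees
```

The sign convention of the dihedral varies between toolkits; only its magnitude matters for the trans/cis distinction shown here.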
Fig.3 (a) The architecture of MetaGIN follows MetaFormer, comprising a token mixer block and a feed-forward network (FFN) block. In particular, MetaGIN introduces two types of token mixer block: (b) the 3-hop convolution block and (c) the graph propagation block
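The MetaFormer pattern in Fig.3(a), residual token mixing followed by a residual FFN, can be sketched in a few lines. This is a generic NumPy illustration, not the paper's implementation: a plain 1-hop neighbor propagation stands in for MetaGIN's 3-hop convolution and graph propagation mixers, and all names and shapes are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Per-node feature normalization (stand-in for the model's norm layer)
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def graph_token_mixer(x, A_hat):
    # Token mixer slot: aggregate neighbor features via the adjacency matrix.
    # In MetaGIN this slot holds a 3-hop convolution or graph propagation block.
    return A_hat @ x

def ffn(x, W1, W2):
    # Two-layer feed-forward network with ReLU
    return np.maximum(x @ W1, 0.0) @ W2

def metaformer_block(x, A_hat, W1, W2):
    # MetaFormer pattern: residual token mixing, then a residual FFN
    x = x + graph_token_mixer(layer_norm(x), A_hat)
    x = x + ffn(layer_norm(x), W1, W2)
    return x

# Toy example: 4 atoms with 8-dim features, identity adjacency
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
W1, W2 = rng.standard_normal((8, 16)), rng.standard_normal((16, 8))
out = metaformer_block(x, np.eye(4), W1, W2)
```

Stacking such blocks, with the token mixer varied per layer, reproduces the overall pattern of Fig.3(a).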
| Edge type | Feature description                               |
|-----------|---------------------------------------------------|
| 1-hop     | Type of the bond (e.g., single, double, triple)   |
|           | Rank of the bond for a chirality atom             |
|           | Stereochemistry of the bond                       |
|           | Whether the bond is part of a conjugated system   |
|           | Whether the bond is rotatable                     |
| 2-hop     | Number of paths to 2-hop neighbors                |
| 3-hop     | Number of paths to 3-hop neighbors                |

Tab.1 Edge features for different hops
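The 2-hop and 3-hop path-count features in Tab.1 follow from powers of the adjacency matrix: entry (i, j) of A^k counts the length-k walks between atoms i and j. The sketch below is our own illustration, not the paper's code; note that A^k counts walks, which may revisit atoms, so a real pipeline would typically keep only entries whose shortest-path distance equals k.

```python
import numpy as np

def khop_path_counts(edges, n_atoms, max_hop=3):
    # Adjacency matrix of the molecular graph (atoms = nodes, bonds = 1-hop edges)
    A = np.zeros((n_atoms, n_atoms), dtype=np.int64)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    # (A^k)[i, j] counts length-k walks between atoms i and j
    counts = {}
    Ak = np.eye(n_atoms, dtype=np.int64)
    for k in range(1, max_hop + 1):
        Ak = Ak @ A
        counts[k] = Ak.copy()
    return counts

# Three-atom chain 0-1-2: atoms 0 and 2 are 2-hop neighbors with one path
counts = khop_path_counts([(0, 1), (1, 2)], n_atoms=3)
```

For the chain above, `counts[2][0, 2]` is 1 (the single path 0-1-2), while `counts[3][0, 1]` is 2 because length-3 walks (0-1-0-1 and 0-1-2-1) can revisit atoms.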
| Method     | Complexity | #Params | MAE ↓  |
|------------|------------|---------|--------|
| GCN [9]    |            |         |        |
| GIN [9]    |            |         |        |
| GCN-VN [9] |            |         |        |
| GIN-VN [9] |            |         |        |
| MetaGIN    |            | 8.87M   | 0.0851 |
| GRPE [14]  |            |         |        |
| EGT [13]   |            |         |        |
| GPS [29]   |            |         |        |
| GEM-2 [11] |            |         |        |

Tab.2 HOMO/LUMO gap prediction on the PCQM4Mv2 dataset
| Method           | FreeSolv (642) | ESOL (1128)   | Lipophilicity (4200) |
|------------------|----------------|---------------|----------------------|
| ECFP [30]        | 5.275 (0.751)  | 2.359 (0.454) | 1.188 (0.061)        |
| TF_Robust [31]   | 4.122 (0.085)  | 1.722 (0.038) | 0.909 (0.060)        |
| GraphConv [15]   | 2.900 (0.135)  | 1.068 (0.050) | 0.712 (0.049)        |
| Weave [32]       | 2.398 (0.250)  | 1.158 (0.055) | 0.813 (0.042)        |
| SchNet [33]      | 3.215 (0.755)  | 1.045 (0.064) | 0.909 (0.098)        |
| MGCN [34]        | 3.349 (0.097)  | 1.266 (0.147) | 1.113 (0.041)        |
| AttentiveFP [35] | 2.030 (0.420)  | 0.853 (0.060) | 0.650 (0.030)        |
| TrimNet [36]     | 2.529 (0.111)  | 1.282 (0.029) | 0.702 (0.008)        |
| MPNN [37]        | 2.185 (0.952)  | 1.167 (0.430) | 0.885 (0.030)        |
| DMPNN [38]       | 2.177 (0.914)  | 0.980 (0.258) | 0.653 (0.046)        |
| FunQG-MPNN [39]  | 1.542 (0.460)  | 0.879 (0.091) | 0.638 (0.020)        |
| FunQG-DMPNN [39] | 1.501 (0.376)  | 0.818 (0.047) | 0.622 (0.028)        |
| MetaGIN          | 1.397 (0.062)  | 0.780 (0.061) | 0.532 (0.013)        |

Tab.3 Regression tasks on MoleculeNet datasets (mean with standard deviation in parentheses; lower is better; number of molecules given after each dataset name)
| Method           | SIDER (1427)  | ClinTox (1478) | BBBP (2039)   | Tox21 (7831)  | ToxCast (8576) |
|------------------|---------------|----------------|---------------|---------------|----------------|
| ECFP [30]        | 0.630 (0.019) | 0.673 (0.031)  | 0.783 (0.050) | 0.760 (0.009) | 0.615 (0.017)  |
| TF_Robust [31]   | 0.607 (0.033) | 0.765 (0.085)  | 0.860 (0.087) | 0.698 (0.012) | 0.585 (0.031)  |
| GraphConv [15]   | 0.593 (0.035) | 0.845 (0.051)  | 0.877 (0.036) | 0.772 (0.041) | 0.650 (0.025)  |
| Weave [32]       | 0.543 (0.034) | 0.823 (0.023)  | 0.837 (0.065) | 0.741 (0.044) | 0.678 (0.024)  |
| SchNet [33]      | 0.545 (0.038) | 0.717 (0.042)  | 0.847 (0.024) | 0.767 (0.025) | 0.679 (0.021)  |
| MGCN [34]        | 0.552 (0.018) | 0.634 (0.042)  | 0.850 (0.064) | 0.707 (0.016) | 0.663 (0.009)  |
| AttentiveFP [35] | 0.605 (0.060) | 0.933 (0.020)  | 0.908 (0.050) | 0.807 (0.020) | 0.579 (0.001)  |
| TrimNet [36]     | 0.606 (0.006) | 0.906 (0.017)  | 0.892 (0.025) | 0.812 (0.019) | 0.652 (0.032)  |
| MPNN [37]        | 0.595 (0.030) | 0.879 (0.054)  | 0.913 (0.041) | 0.808 (0.024) | 0.691 (0.013)  |
| DMPNN [38]       | 0.632 (0.023) | 0.897 (0.040)  | 0.919 (0.030) | 0.826 (0.023) | 0.718 (0.011)  |
| FunQG-MPNN [39]  | 0.632 (0.056) | 0.838 (0.025)  | 0.902 (0.014) | 0.842 (0.012) | 0.717 (0.005)  |
| FunQG-DMPNN [39] | 0.642 (0.034) | 0.841 (0.037)  | 0.914 (0.010) | 0.845 (0.008) | 0.721 (0.009)  |
| MetaGIN          | 0.645 (0.024) | 0.908 (0.081)  | 0.917 (0.018) | 0.830 (0.001) | 0.714 (0.015)  |

Tab.4 Classification tasks on MoleculeNet datasets (mean with standard deviation in parentheses; higher is better; number of molecules given after each dataset name)
| Hop | Repeat | Depth | Width | #Params | MAE ↓  |
|-----|--------|-------|-------|---------|--------|
| 1   | 1      | 4     | 256   | 2.82M   | 0.0920 |
| 2   | 1      | 4     | 256   | 3.63M   | 0.0881 |
| 3   | 1      | 4     | 256   | 4.45M   | 0.0871 |
| 1   | 3      | 4     | 256   | 4.29M   | 0.0883 |
| 3   | 1      | 8     | 256   | 8.92M   | 0.0855 |
| 3   | 1      | 4     | 512   | 8.87M   | 0.0851 |

Tab.5 Ablation studies
Fig.4 (a) Prediction of K-hop distance with different hop edges; (b) relationship between 3D structure shifting and HOMO-LUMO gap deviation; (c) most influential K edges for different hop distances
1
Lin X, Li X, Lin X. A review on applications of computational methods in drug screening and design. Molecules, 2020, 25(6): 1375
2
Hann M M, Leach A R, Harper G. Molecular complexity and its impact on the probability of finding leads for drug discovery. Journal of Chemical Information and Computer Sciences, 2001, 41(3): 856–864
3
Manallack D T, Prankerd R J, Yuriev E, Oprea T I, Chalmers D K. The significance of acid/base properties in drug discovery. Chemical Society Reviews, 2013, 42(2): 485–496
4
Geerlings P, De Proft F, Langenaeker W. Conceptual density functional theory. Chemical Reviews, 2003, 103(5): 1793–1874
5
Motta M, Zhang S. Ab initio computations of molecular systems by the auxiliary-field quantum Monte Carlo method. WIREs Computational Molecular Science, 2018, 8(5): e1364
6
Kümmel H G. A biography of the coupled cluster method. International Journal of Modern Physics B, 2003, 17(28): 5311–5325
7
Zhang X, Chen C, Meng Z, Yang Z, Jiang H, Cui X. CoAtGIN: marrying convolution and attention for graph-based molecule property prediction. In: Proceedings of 2022 IEEE International Conference on Bioinformatics and Biomedicine. 2022, 374−379
8
Wang Z, Wang Y, Zhang X, Meng Z, Yang Z, Zhao W, Cui X. Graph-based reaction classification by contrasting between precursors and products. In: Proceedings of 2022 IEEE International Conference on Bioinformatics and Biomedicine. 2022, 354−359
9
Hu W, Fey M, Ren H, Nakata M, Dong Y, Leskovec J. OGB-LSC: a large-scale challenge for machine learning on graphs. In: Proceedings of the 35th Conference on Neural Information Processing Systems. 2021
10
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser Ł, Polosukhin I. Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 6000−6010
11
Liu L, He D, Fang X, Zhang S, Wang F, He J, Wu H. GEM-2: next generation molecular property prediction network by modeling full-range many-body interactions. 2022, arXiv preprint arXiv: 2208.05863
12
Dwivedi V P, Luu A T, Laurent T, Bengio Y, Bresson X. Graph neural networks with learnable structural and positional representations. In: Proceedings of the 10th International Conference on Learning Representations. 2022
13
Hussain S, Zaki M J, Subramanian D. Global self-attention as a replacement for graph convolution. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022, 655−665
14
Park W, Chang W, Lee D, Kim J, Hwang S W. GRPE: relative positional encoding for graph transformer. 2022, arXiv preprint arXiv: 2201.12787
15
Kipf T N, Welling M. Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International Conference on Learning Representations. 2017
16
Xu K, Hu W, Leskovec J, Jegelka S. How powerful are graph neural networks? In: Proceedings of the 7th International Conference on Learning Representations. 2019
Bannwarth C, Caldeweyher E, Ehlert S, Hansen A, Pracht P, Seibert J, Spicher S, Grimme S. Extended tight-binding quantum chemistry methods. WIREs Computational Molecular Science, 2021, 11(2): e1493
19
Feng J, Chen Y, Li F, Sarkar A, Zhang M. How powerful are K-hop message passing graph neural networks. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. 2022, 345
Irwin J J, Tang K G, Young J, Dandarchuluun C, Wong B R, Khurelbaatar M, Moroz Y S, Mayfield J, Sayle R A. ZINC20—a free ultralarge-scale chemical database for ligand discovery. Journal of Chemical Information and Modeling, 2020, 60(12): 6065–6073
22
Pence H E, Williams A. ChemSpider: an online chemical information resource. Journal of Chemical Education, 2010, 87(11): 1123–1124
23
Hu W, Fey M, Zitnik M, Dong Y, Ren H, Liu B, Catasta M, Leskovec J. Open graph benchmark: datasets for machine learning on graphs. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020, 1855
24
Yu W, Luo M, Zhou P, Si C, Zhou Y, Wang X, Feng J, Yan S. MetaFormer is actually what you need for vision. In: Proceedings of 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022, 10809−10819
25
Wu Y, He K. Group normalization. In: Proceedings of the 15th European Conference on Computer Vision. 2018, 3−19
Wu Z, Ramsundar B, Feinberg E N, Gomes J, Geniesse C, Pappu A S, Leswing K, Pande V. MoleculeNet: a benchmark for molecular machine learning. Chemical Science, 2018, 9(2): 513–530
28
Xie X, Zhou P, Li H, Lin Z, Yan S. Adan: adaptive Nesterov momentum algorithm for faster optimizing deep models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, doi: 10.1109/TPAMI.2024.3423382
29
Rampášek L, Galkin M, Dwivedi V P, Luu A T, Wolf G, Beaini D. Recipe for a general, powerful, scalable graph transformer. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. 2022, 1054
30
Ramsundar B, Kearnes S, Riley P, Webster D, Konerding D, Pande V. Massively multitask networks for drug discovery. 2015, arXiv preprint arXiv: 1502.02072
31
Rogers D, Hahn M. Extended-connectivity fingerprints. Journal of Chemical Information and Modeling, 2010, 50(5): 742–754
32
Kearnes S, McCloskey K, Berndl M, Pande V, Riley P. Molecular graph convolutions: moving beyond fingerprints. Journal of Computer-Aided Molecular Design, 2016, 30(8): 595–608
33
Schütt K T, Kindermans P J, Sauceda H E, Chmiela S, Tkatchenko A, Müller K R. SchNet: a continuous-filter convolutional neural network for modeling quantum interactions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 992−1002
34
Lu C, Liu Q, Wang C, Huang Z, Lin P, He L. Molecular property prediction: a multilevel quantum interactions modeling perspective. In: Proceedings of the 33rd AAAI Conference on Artificial Intelligence. 2019, 1052−1060
35
Xiong Z, Wang D, Liu X, Zhong F, Wan X, Li X, Li Z, Luo X, Chen K, Jiang H, Zheng M. Pushing the boundaries of molecular representation for drug discovery with the graph attention mechanism. Journal of Medicinal Chemistry, 2020, 63(16): 8749–8760
36
Liaw R, Liang E, Nishihara R, Moritz P, Gonzalez J E, Stoica I. Tune: a research platform for distributed model selection and training. 2018, arXiv preprint arXiv: 1807.05118
37
Gilmer J, Schoenholz S S, Riley P F, Vinyals O, Dahl G E. Neural message passing for quantum chemistry. In: Proceedings of the 34th International Conference on Machine Learning. 2017, 1263−1272
38
Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M, Palmer A, Settels V, Jaakkola T, Jensen K, Barzilay R. Analyzing learned molecular representations for property prediction. Journal of Chemical Information and Modeling, 2019, 59(8): 3370–3388
39
Hajiabolhassan H, Taheri Z, Hojatnia A, Yeganeh Y T. FunQG: molecular representation learning via quotient graphs. Journal of Chemical Information and Modeling, 2023, 63(11): 3275–3287