Frontiers of Chemical Science and Engineering

ISSN 2095-0179

ISSN 2095-0187(Online)

CN 11-5981/TQ

Postal Subscription Code 80-969

2018 Impact Factor: 2.809

Front. Chem. Sci. Eng. 2024, Vol. 18, Issue 12: 149. https://doi.org/10.1007/s11705-024-2500-7
Machine learning meets enzyme engineering: examples in the design of polyethylene terephthalate hydrolases
Rohan Ali1,2, Yifei Zhang1,2
1. State Key Laboratory of Chemical Resources Engineering, Beijing University of Chemical Technology, Beijing 100029, China
2. Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, Beijing 100029, China
Abstract

Machine learning methods are increasingly used to develop promising biocatalysts. Drawing on experimental findings and simulation data, these methods facilitate enzyme engineering and even the design of new-to-nature enzymes. This review focuses on the application of machine learning methods to the engineering of polyethylene terephthalate (PET) hydrolases, enzymes that have the potential to help address plastic pollution. We give an overview of machine learning workflows and of useful methods and tools for protein design and engineering, and discuss recent progress in machine learning-aided PET hydrolase engineering and the de novo design of PET hydrolases. Finally, because machine learning in enzyme engineering is still evolving, we foresee that advances in computational power and high-quality data resources will considerably increase the use of data-driven approaches in enzyme engineering in the coming decades.

Keywords: machine learning, artificial intelligence, enzyme engineering, polyethylene terephthalate hydrolase, enzyme design
Corresponding Author(s): Yifei Zhang   
Just Accepted Date: 27 June 2024   Issue Date: 23 September 2024
 Cite this article:   
Rohan Ali, Yifei Zhang. Machine learning meets enzyme engineering: examples in the design of polyethylene terephthalate hydrolases[J]. Front. Chem. Sci. Eng., 2024, 18(12): 149.
 URL:  
https://academic.hep.com.cn/fcse/EN/10.1007/s11705-024-2500-7
https://academic.hep.com.cn/fcse/EN/Y2024/V18/I12/149
Fig.1  Workflow diagram of machine learning model training. From left to right, step 1 involves fetching and sorting data, step 2 is training algorithm with sorted training/labeled data and building model, and step 3 is validating the model with test data and fine-tuning the model.
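The three steps in Fig. 1 can be sketched in a few lines of Python. This is an illustrative sketch only: the toy data set, the k-nearest-neighbour classifier, the separate validation split used for tuning, and every function name are assumptions made for the example, not methods taken from the review.

```python
import random

def make_toy_data(n=200, seed=0):
    """Step 1a: 'fetch' data. Hypothetical variants with two numeric
    features and a binary label (1 = improved, 0 = not improved)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append(((x1, x2), 1 if x1 + x2 > 0 else 0))
    return data

def split(data, frac_train=0.6, frac_val=0.2, seed=0):
    """Step 1b: 'sort' data into training, validation, and test sets."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * frac_train)
    n_val = int(len(shuffled) * frac_val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def knn_predict(train, x, k):
    """Step 2: the 'model' is the labeled training set plus k;
    prediction is a majority vote among the k nearest neighbours."""
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - x[0]) ** 2 + (item[0][1] - x[1]) ** 2,
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def accuracy(train, eval_set, k):
    """Fraction of examples in eval_set that are predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in eval_set)
    return hits / len(eval_set)

data = make_toy_data()
train, val, test = split(data)

# Step 3: fine-tune the hyperparameter k on the validation set,
# then report performance once on the held-out test set.
best_k = max((1, 3, 5, 7), key=lambda k: accuracy(train, val, k))
test_acc = accuracy(train, test, best_k)
print(f"best k = {best_k}, test accuracy = {test_acc:.2f}")
```

Keeping the test set out of the tuning loop is a common design choice: the reported accuracy then reflects data the model never influenced.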
| Objective | Machine learning tool | Machine learning algorithms | Input data | Availability | Ref. |
|---|---|---|---|---|---|
| Enzyme classification | DeepEC | Convolutional neural network | Protein sequences | Downloadable | [63] |
| | ECPred | Ensemble (SVM, k-nearest neighbors) | Protein sequences | Webserver | [64] |
| | mlDEEPre | Ensemble (convolutional neural network, recurrent neural networks) | Protein sequences | Webserver | [65] |
| Substrate identification | innov’SAR | Partial least squares regression | Protein structures and sequences | Webserver | [66] |
| | pNPred | Random forest | Protein structures and sequences | Webserver | [67] |
| | AdenylPred | Random forest | Protein sequences | N/A | [68] |
| Enzyme catalytic site prediction | PREvaIL | Random forest | Protein structures and sequences | Downloadable | [69] |
| | 3DCNN | Convolutional neural network | Protein structure | N/A | [70] |
| | MAHOMES | Random forest | Protein structures | Downloadable | [32] |
| | POOL | POOL | Protein structures and sequences | Webserver | [71] |
| Optimum condition prediction | TOME | Random forest | Protein sequences | Downloadable | [72] |
| | TAXyl | Random forest | Protein sequences | Downloadable | [73] |
| Enzyme activity prediction | DLKcat | Graph-based and convolutional neural networks | Substrates from SMILES and enzyme sequences | Downloadable | [74] |
| | MaxEnt | Statistical Potts model | Single and pairwise amino acid frequencies from MSA | Downloadable | [75] |
| | MutCompute | Self-supervised convolutional neural network | Protein structures | Webserver | [76] |
| | innov’SAR | Partial least squares regression | N/A | N/A | [77] |
| | Machine learning-variants-Hoie-et-al | Random forest | Custom preprocessing, derived from PRISM approach | Downloadable | [78] |
| | innov’SAR | Partial least squares regression | Protein sequences | N/A | [77] |
| | SolventNet | Convolutional neural network | Protein structures | N/A | [79] |
| | EnzyKR | Classifier-regressor architecture | Substrate-hydrolase complexes | Downloadable | [80] |
| Stability prediction (ΔΔG) | BayeStab | Bayesian neural networks | Protein structures | Webserver | [81] |
| | PROST | Ensemble model | Protein sequences | Downloadable | [82] |
| | KORPM | Nonlinear regression | Protein sequences | Downloadable | [83] |
| | ABYSSAL | Siamese deep neural networks | Protein sequences | Downloadable | [84] |
| | TOMER | Bagging with resampling | Protein sequences | Downloadable | [85] |
| Protein solubility | PON-Sol2 | LightGBM | Protein sequences | Webserver | [86] |
| Protein design | bmDCA | Direct coupling analysis | Protein sequences | Downloadable | [87] |
| | bmDCA | Linear regression | Protein structures | Downloadable | [88] |
| | ProteinMPNN | Message-passing neural network | Protein structures | Downloadable | [89] |
| | RFdiffusion | Denoising diffusion probabilistic model | Protein structures | Downloadable | [90] |
| | FoldingDiff | Denoising diffusion probabilistic model | Backbones from the CATH data set | Downloadable | [91] |
| | GearNet | Graph neural network | Protein structures | Downloadable | [92] |
Tab.1  A short list of machine learning tools that are helpful in enzyme engineering
Fig.2  Typical machine learning-based strategies for performance and stability enhancement of PET hydrolase. (a) The features of the local microenvironment from proteins in the PDB were extracted by feeding structural characteristics into a series of layers, and as an output, each amino acid was given a probability value (showing chemical congruency). After classification, the predicted mutations obtained were experimentally validated and four mutations showed improvements. These four mutations were introduced into three PETase scaffolds, and with ThermoPETase as a scaffold, the most efficient variant FAST-PETase was obtained. Reprinted with permission from Ref. [16], copyright 2022, Springer Nature. (b) Promising mutations were predicted by the Transformer model after being trained on two different data sets. The mutations in PET binding site residues led to the H218S/F222I variant (M2). Further engineering of M2 by adding the remaining mutations at predicted sites decreased stability. Therefore, the GRAPE strategy was used, resulting in the final mutant TurboPETase. Reprinted with permission from Ref. [18], copyright 2024, Springer Nature. (c) Three different machine learning algorithms (logistic regression, SVM and Random Forest) were employed on MD trajectories generated by data from ProTherm, to understand protein stability and structural features. The learned rules were employed for building a Random Forest-based model to predict thermal stability (Tm) changes, leading to the generation of the TfCutPSP mutant. Reprinted with permission from Ref. [15], copyright 2022, Research Network of Computational and Structural Biotechnology.
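The idea in panel (a), where each amino acid at a position receives a probability and poorly fitting wild-type residues become mutation candidates, can be illustrated with a small sketch. Everything here is hypothetical: the function name, the toy sequence, the invented probabilities, and the log-ratio score are assumptions for illustration and do not reproduce the actual model, its outputs, or its interface.

```python
import math

def rank_mutations(wild_type, probs, top_n=3, eps=1e-9):
    """Rank candidate substitutions from per-position probabilities.

    wild_type: string of one-letter residue codes.
    probs: one dict per position mapping residue -> predicted probability.
    A position is a candidate when the most probable residue differs from
    the wild type; candidates are scored by log(p_best / p_wild_type).
    """
    candidates = []
    for pos, (wt, p) in enumerate(zip(wild_type, probs), start=1):
        best = max(p, key=p.get)
        if best != wt:
            # eps guards against a zero or missing wild-type probability.
            score = math.log(p[best] / max(p.get(wt, 0.0), eps))
            candidates.append((score, f"{wt}{pos}{best}"))
    candidates.sort(reverse=True)  # highest score first
    return [name for _, name in candidates[:top_n]]

# Invented toy input: a three-residue "protein" and made-up probabilities.
wild_type = "SGA"
toy_probs = [
    {"S": 0.7, "A": 0.2, "G": 0.1},  # wild type already preferred
    {"G": 0.1, "A": 0.8, "S": 0.1},  # wild type disfavoured strongly
    {"A": 0.3, "S": 0.6, "G": 0.1},  # wild type disfavoured mildly
]
ranked = rank_mutations(wild_type, toy_probs)
print(ranked)  # ['G2A', 'A3S']
```

As the caption notes, predictions of this kind are only a shortlist; the candidates still require experimental validation.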
Fig.3  Performances of PET hydrolases engineered by machine learning methods. The figure illustrates the fold increase in activity and the changes in Tm of the engineered PET hydrolases, sorted by their year of discovery. From left to right: TfCut2 (blue) was used as the WT to engineer ① TfCut2PSP (green) by using MD simulations with MDL [15] and ② TfCut2EEQ (blue) by the MutCompute model [17]; ③ FAST-PETase (purple) was engineered from ThermoPETase (purple) by employing the MutCompute model [16]; ④ LCCICCG-I6M (red) was engineered from LCCICCG (orange) by Preoptem and evolutionary analysis [14]; ⑤ TurboPETase (green) was engineered from BhrPETase (red) by integrating protein language model and force-field-based algorithms [18]. The color reflects the Tm value of each PET hydrolase.
Fig.4  A computational workflow of protein scaffold remodeling for creating designer PET hydrolases. The workflow comprises four stages. In the scaffold remodeling stage, catalytic sites and some adjacent structures were extracted and the missing scaffold sequences were generated by inpainting. In the computational screening stage, the newly generated sequences were computationally analyzed based on protein microenvironment features. In the experimental validation stage, the sequences were expressed and evaluated based on activity and expression level. Designs exhibiting low activity and expression were then fed to the sequence refinement stage for improvement by machine learning-based strategies (RFjoint and ProteinMPNN). High-quality designs were obtained through iterative rounds of sequence refinement, computational screening, and experimental validation.
1 A S Bescond, A Pujari. PET Polymer—Chemical Economics Handbook (IHS Markit). 2020
2 C M Carr, D J Clarke, A D W Dobson. Microbial polyethylene terephthalate hydrolases: current and future perspectives. Frontiers in Microbiology, 2020, 11: 571265
https://doi.org/10.3389/fmicb.2020.571265
3 R Wei, G von Haugwitz, L Pfaff, J Mican, C P S Badenhorst, W Liu, G Weber, H P Austin, D Bednar, J Damborsky, et al. Mechanism-based design of efficient PET hydrolases. ACS Catalysis, 2022, 12(6): 3382–3396
https://doi.org/10.1021/acscatal.1c05856
4 Y Fang, K Chao, J He, Z Wang, Z Chen. High-efficiency depolymerization/degradation of polyethylene terephthalate plastic by a whole-cell biocatalyst. 3 Biotech, 2023, 13(5): 138
5 E Ambrose-Dempster, L Leipold, D Dobrijevic, M Bawn, E M Carter, G Stojanovski, T D Sheppard, J W Jeffries, J M Ward, H C Hailes. Mechanoenzymatic reactions for the hydrolysis of PET. RSC Advances, 2023, 13(15): 9954–9962
https://doi.org/10.1039/D3RA01708G
6 F Cao, L Wang, R Zheng, L Guo, Y Chen, X Qian. Research and progress of chemical depolymerization of waste PET and high-value application of its depolymerization products. RSC Advances, 2022, 12(49): 31564–31576
https://doi.org/10.1039/D2RA06499E
7 J Lai, H Huang, M Lin, Y Xu, X Li, B Sun. Enzyme catalyzes ester bond synthesis and hydrolysis: the key step for sustainable usage of plastics. Frontiers in Microbiology, 2023, 13: 1113705
https://doi.org/10.3389/fmicb.2022.1113705
8 R P Magalhães, J M Cunha, S F Sousa. Perspectives on the role of enzymatic biocatalysis for the degradation of plastic PET. International Journal of Molecular Sciences, 2021, 22(20): 11257
https://doi.org/10.3390/ijms222011257
9 E Akram, Y Cao, H Xing, Y Ding, Y Luo, R Wei, Y Zhang. On the temperature dependence of enzymatic degradation of poly(ethylene terephthalate). Chinese Journal of Catalysis, 2024, 60: 284–293
https://doi.org/10.1016/S1872-2067(23)64628-5
10 R J Müller, H Schrader, J Profe, K Dresler, W D Deckwer. Enzymatic degradation of poly(ethylene terephthalate): rapid hydrolyse using a hydrolase from T. fusca. Macromolecular Rapid Communications, 2005, 26(17): 1400–1405
https://doi.org/10.1002/marc.200500410
11 S Sulaiman, S Yamato, E Kanaya, J J Kim, Y Koga, K Takano, S Kanaya. Isolation of a novel cutinase homolog with polyethylene terephthalate-degrading activity from leaf-branch compost by using a metagenomic approach. Applied and Environmental Microbiology, 2012, 78(5): 1556–1562
https://doi.org/10.1128/AEM.06725-11
12 S Yoshida, K Hiraga, T Takehana, I Taniguchi, H Yamaji, Y Maeda, K Toyohara, K Miyamoto, Y Kimura, K Oda. A bacterium that degrades and assimilates poly(ethylene terephthalate). Science, 2016, 351(6278): 1196–1199
https://doi.org/10.1126/science.aad6359
13 Y Cui, Y Chen, X Liu, S Dong, Y E Tian, Y Qiao, R Mitra, J Han, C Li, X Han, et al. Computational redesign of a PETase for plastic biodegradation under ambient condition by the GRAPE strategy. ACS Catalysis, 2021, 11(3): 1340–1350
https://doi.org/10.1021/acscatal.0c05126
14 Z Ding, G Xu, R Miao, N Wu, W Zhang, B Yao, F Guan, H Huang, J Tian. Rational redesign of thermophilic PET hydrolase LCCICCG to enhance hydrolysis of high crystallinity polyethylene terephthalates. Journal of Hazardous Materials, 2023, 453: 131386
https://doi.org/10.1016/j.jhazmat.2023.131386
15 Q Li, Y Zheng, T Su, Q Wang, Q Liang, Z Zhang, Q Qi, J Tian. Computational design of a cutinase for plastic biodegradation by mining molecular dynamics simulations trajectories. Computational and Structural Biotechnology Journal, 2022, 20: 459–470
https://doi.org/10.1016/j.csbj.2021.12.042
16 H Lu, D J Diaz, N J Czarnecki, C Zhu, W Kim, R Shroff, D J Acosta, B R Alexander, H O Cole, Y Zhang, et al. Machine learning-aided engineering of hydrolases for PET depolymerization. Nature, 2022, 604(7907): 662–667
https://doi.org/10.1038/s41586-022-04599-z
17 S Meng, Z Li, P Zhang, F Contreras, Y Ji, U Schwaneberg. Deep learning guided enzyme engineering of Thermobifida fusca cutinase for increased PET depolymerization. Chinese Journal of Catalysis, 2023, 50: 229–238
https://doi.org/10.1016/S1872-2067(23)64470-5
18 Y Cui, Y Chen, J Sun, T Zhu, H Pang, C Li, W C Geng, B Wu. Computational redesign of a hydrolase for nearly complete PET depolymerization at industrially relevant high-solids loading. Nature Communications, 2024, 15(1): 1417
https://doi.org/10.1038/s41467-024-45662-9
19 E L Bell, R Smithson, S Kilbride, J Foster, F J Hardy, S Ramachandran, A A Tedstone, S J Haigh, A A Garforth, P J Day, et al. Directed evolution of an efficient and thermostable PET depolymerase. Nature Catalysis, 2022, 5(8): 673–681
https://doi.org/10.1038/s41929-022-00821-3
20 F Liu, T Wang, W Yang, Y Zhang, Y Gong, X Fan, G Wang, Z Lu, J Wang. Current advances in the structural biology and molecular engineering of PETase. Frontiers in Bioengineering and Biotechnology, 2023, 11: 1263996
https://doi.org/10.3389/fbioe.2023.1263996
21 H F Son, I J Cho, S Joo, H Seo, H Y Sagong, S Y Choi, S Y Lee, K J Kim. Rational protein engineering of thermo-stable PETase from Ideonella sakaiensis for highly efficient PET degradation. ACS Catalysis, 2019, 9(4): 3519–3526
https://doi.org/10.1021/acscatal.9b00568
22 H S Zurier, J M Goddard. A high-throughput expression and screening platform for applications-driven PETase engineering. Biotechnology and Bioengineering, 2023, 120(4): 1000–1014
https://doi.org/10.1002/bit.28319
23 V Tournier, C Topham, A Gilles, B David, C Folgoas, E Moya Leclair, E Kamionka, M L Desrousseaux, H Texier, S Gavalda, et al. An engineered PET depolymerase to break down and recycle plastic bottles. Nature, 2020, 580(7802): 216–219
https://doi.org/10.1038/s41586-020-2149-4
24 S Thiyagarajan, E Maaskant-Reilink, T A Ewing, M K Julsing, J Van Haveren. Back-to-monomer recycling of polycondensation polymers: opportunities for chemicals and enzymes. RSC Advances, 2022, 12(2): 947–970
https://doi.org/10.1039/D1RA08217E
25 K K Yang, Z Wu, F H Arnold. Machine learning in protein engineering. Preprint arXiv: 1811.10775, 2018
26 S Mazurenko, Z Prokop, J Damborsky. Machine learning in enzyme engineering. ACS Catalysis, 2020, 10(2): 1210–1223
https://doi.org/10.1021/acscatal.9b04321
27 C Chang, V L Deringer, K S Katti, V Van Speybroeck, C M Wolverton. Simulations in the era of exascale computing. Nature Reviews Materials, 2023, 8(5): 309–313
https://doi.org/10.1038/s41578-023-00540-6
28 E O Pyzer-Knapp, J W Pitera, P W Staar, S Takeda, T Laino, D P Sanders, J Sexton, J R Smith, A Curioni. Accelerating materials discovery using artificial intelligence, high performance computing and robotics. npj Computational Materials, 2022, 8(1): 84
https://doi.org/10.1038/s41524-022-00765-z
29 V Singh, S Patra, N A Murugan, D C Toncu, A Tiwari. Recent trends in computational tools and data-driven modeling for advanced materials. Materials Advances, 2022, 3(10): 4069–4087
https://doi.org/10.1039/D2MA00067A
30 M Beller, M Bender, U T Bornscheuer, S Schunk. Catalysis—Far from Being a Mature Technology. Chemie Ingenieur Technik, 2022, 94(11): 1559
https://doi.org/10.1002/cite.202271102
31 J G Greener, S M Kandathil, L Moffat, D T Jones. A guide to machine learning for biologists. Nature Reviews Molecular Cell Biology, 2022, 23(1): 40–55
https://doi.org/10.1038/s41580-021-00407-0
32 R Feehan, D Montezano, J S Slusky. Machine learning for enzyme engineering, selection and design. Protein Engineering, Design & Selection, 2021, 34: gzab019
33 M Braun, C C Gruber, A Krassnigg, A Kummer, S Lutz, G Oberdorfer, E Siirola, R Snajdrova. Accelerating biocatalysis discovery with machine learning: a paradigm shift in enzyme engineering, discovery, and design. ACS Catalysis, 2023, 13(21): 14454–14469
https://doi.org/10.1021/acscatal.3c03417
34 P S Sampaio, P Fernandes. Machine learning: a suitable method for biocatalysis. Catalysts, 2023, 13(6): 961
https://doi.org/10.3390/catal13060961
35 O Chapelle, M Chi, A Zien. A continuation method for semi-supervised SVMs. In: Proceedings of the 23rd International Conference on Machine Learning. New York: ACM Press, 2006, 185–192
36 P Kouba, P Kohout, F Haddadi, A Bushuiev, R Samusevich, J Sedlar, J Damborsky, T Pluskal, J Sivic, S Mazurenko. Machine learning-guided protein engineering. ACS Catalysis, 2023, 13(21): 13863–13895
https://doi.org/10.1021/acscatal.3c02743
37 I Schomburg, A Chang, D Schomburg. BRENDA, enzyme data and metabolic information. Nucleic Acids Research, 2002, 30(1): 47–49
https://doi.org/10.1093/nar/30.1.47
38 H M Berman, J Westbrook, Z Feng, G Gilliland, T N Bhat, H Weissig, I N Shindyalov, P E Bourne. The Protein Data Bank. Nucleic Acids Research, 2000, 28(1): 235–242
https://doi.org/10.1093/nar/28.1.235
39 B Yan, X Ran, A Gollu, Z Cheng, X Zhou, Y Chen, Z J Yang. IntEnzyDB: an integrated structure-kinetics enzymology database. Journal of Chemical Information and Modeling, 2022, 62(22): 5841–5848
https://doi.org/10.1021/acs.jcim.2c01139
40 The UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Research, 2019, 47(D1): D506–D515
https://doi.org/10.1093/nar/gky1049
41 The UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Research, 2021, 49(D1): D480–D489
https://doi.org/10.1093/nar/gkaa1100
42 J Pleiss. Standardized data, scalable documentation, sustainable storage-EnzymeML as a basis for FAIR data management in biocatalysis. ChemCatChem, 2021, 13(18): 3909–3913
https://doi.org/10.1002/cctc.202100822
43 J Velecký, M Hamsikova, J Stourac, M Musil, J Damborsky, D Bednar, S Mazurenko. SoluProtMutDB: a manually curated database of protein solubility changes upon mutations. Computational and Structural Biotechnology Journal, 2022, 20: 6339–6347
https://doi.org/10.1016/j.csbj.2022.11.009
44 J S Xavier, T B Nguyen, M Karmarkar, S Portelli, P M Rezende, J P Velloso, D B Ascher, D E Pires. ThermoMutDB: a thermodynamic database for missense mutations. Nucleic Acids Research, 2021, 49(D1): D475–D479
https://doi.org/10.1093/nar/gkaa925
45 J Stourac, J Dubrava, M Musil, J Horackova, J Damborsky, S Mazurenko, D Bednar. FireProtDB: database of manually curated protein stability data. Nucleic Acids Research, 2021, 49(D1): D319–D324
https://doi.org/10.1093/nar/gkaa981
46 R Nikam, A Kulandaisamy, K Harini, D Sharma, M M Gromiha. ProThermDB: thermodynamic database for proteins and mutants revisited after 15 years. Nucleic Acids Research, 2021, 49(D1): D420–D424
https://doi.org/10.1093/nar/gkaa1035
47 E Heid, D Probst, W H Green, G K Madsen. EnzymeMap: curation, validation and data-driven prediction of enzymatic reactions. Chemical Science, 2023, 14(48): 14229–14242
https://doi.org/10.1039/D3SC02048G
48 D Probst, M Manica, Y G Nana Teukam, A Castrogiovanni, F Paratore, T Laino. Biocatalysed synthesis planning using data-driven learning. Nature Communications, 2022, 13(1): 964
https://doi.org/10.1038/s41467-022-28536-w
49 M Ganter, T Bernard, S Moretti, J Stelling, M Pagni. MetaNetX.org: a website and repository for accessing, analysing and manipulating metabolic networks. Bioinformatics, 2013, 29(6): 815–816
https://doi.org/10.1093/bioinformatics/btt036
50 J Hafner, H MohammadiPeyhani, A Sveshnikova, A Scheidegger, V Hatzimanikatis. Updated atlas of biochemistry with new metabolites and improved enzyme prediction power. ACS Synthetic Biology, 2020, 9(6): 1479–1482
https://doi.org/10.1021/acssynbio.0c00052
51 D S Wishart, C Li, A Marcu, H Badran, A Pon, Z Budinski, J Patron, D Lipton, X Cao, E Oler, et al. PathBank: a comprehensive pathway database for model organisms. Nucleic Acids Research, 2020, 48(D1): D470–D478
https://doi.org/10.1093/nar/gkz861
52 U Wittig, M Rey, A Weidemann, R Kania, W Müller. SABIO-RK: an updated resource for manually curated biochemical reaction kinetics. Nucleic Acids Research, 2018, 46(D1): D656–D660
https://doi.org/10.1093/nar/gkx1065
53 H M Afify, M B Abdelhalim, M S Mabrouk, A Y Sayed. Protein secondary structure prediction (PSSP) using different machine algorithms. Egyptian Journal of Medical Human Genetics, 2021, 22(1): 1–10
https://doi.org/10.1186/s43042-021-00173-w
54 B Liu, X Wang, L Lin, B Tang, Q Dong, X Wang. Prediction of protein binding sites in protein structures using hidden Markov support vector machine. BMC Bioinformatics, 2009, 10(1): 1–14
https://doi.org/10.1186/1471-2105-10-381
55 M Palla, S Punthambaker, B Stranges, F Vigneault, J Nivala, D Wiegand, A Ayer, T Craig, D Gremyachinskiy, H Franklin, et al. Multiplex single-molecule kinetics of nanopore-coupled polymerases. ACS Nano, 2021, 15(1): 489–502
https://doi.org/10.1021/acsnano.0c05226
56 X Fang, J Huang, R Zhang, F Wang, Q Zhang, G Li, J Yan, H Zhang, Y Yan, L Xu. Convolution neural network-based prediction of protein thermostability. Journal of Chemical Information and Modeling, 2019, 59(11): 4833–4843
https://doi.org/10.1021/acs.jcim.9b00220
57 S Gelman, S A Fahlberg, P Heinzelman, P A Romero, A Gitter. Neural networks to learn protein sequence-function relationships from deep mutational scanning data. Proceedings of the National Academy of Sciences of the United States of America, 2021, 118(48): e2104878118
https://doi.org/10.1073/pnas.2104878118
58 J Mellor, I Grigoras, P Carbonell, J L Faulon. Semisupervised Gaussian process for automated enzyme search. ACS Synthetic Biology, 2016, 5(6): 518–528
https://doi.org/10.1021/acssynbio.5b00294
59 D E Pires, D B Ascher, T L Blundell. mCSM: predicting the effects of mutations in proteins using graph-based signatures. Bioinformatics, 2014, 30(3): 335–342
https://doi.org/10.1093/bioinformatics/btt691
60 K Hakala, S Kaewphan, J Björne, F Mehryary, H Moen, M Tolvanen, T Salakoski, F Ginter. Neural network and random forest models in protein function prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2022, 19(3): 1772–1781
https://doi.org/10.1109/TCBB.2020.3044230
61 C Kathuria, D Mehrotra, N K Misra. Predicting the protein structure using random forest approach. Procedia Computer Science, 2018, 132: 1654–1662
https://doi.org/10.1016/j.procs.2018.05.134
62 C Wang, Y Chen, Y Zhang, K Li, M Lin, F Pan, W Wu, J Zhang. A reinforcement learning approach for protein-ligand binding pose prediction. BMC Bioinformatics, 2022, 23(1): 1–18
https://doi.org/10.1186/s12859-022-04912-7
63 J Y Ryu, H U Kim, S Y Lee. Deep learning enables high-quality and high-throughput prediction of enzyme commission numbers. Proceedings of the National Academy of Sciences of the United States of America, 2019, 116(28): 13996–14001
https://doi.org/10.1073/pnas.1821905116
64 A Dalkiran, A S Rifaioglu, M J Martin, A R Cetin, V Atalay, T Doğan. ECPred: a tool for the prediction of the enzymatic functions of protein sequences based on the EC nomenclature. BMC Bioinformatics, 2018, 19(1): 1–13
https://doi.org/10.1186/s12859-018-2368-y
65 Z Zou, S Tian, X Gao, Y Li. mlDEEPre: multi-functional enzyme function prediction with hierarchical multi-label deep learning. Frontiers in Genetics, 2019, 9: 714
https://doi.org/10.3389/fgene.2018.00714
66 F Cadet, N Fontaine, G Li, J Sanchis, F C M Ng, R Pandjaitan, I Vetrivel, B Offmann, M T Reetz. A machine learning approach for reliable prediction of amino acid interactions and its application in the directed evolution of enantioselective enzymes. Scientific Reports, 2018, 8(1): 16757
https://doi.org/10.1038/s41598-018-35033-y
67 S L Robinson, M D Smith, J E Richman, K G Aukema, L P Wackett. Machine learning-based prediction of activity and substrate specificity for OleA enzymes in the thiolase superfamily. Synthetic Biology, 2020, 5(1): ysaa004
https://doi.org/10.1093/synbio/ysaa004
68 S L Robinson, B R Terlouw, M D Smith, S J Pidot, T P Stinear, M H Medema, L P Wackett. Global analysis of adenylate-forming enzymes reveals β-lactone biosynthesis pathway in pathogenic Nocardia. Journal of Biological Chemistry, 2020, 295(44): 14826–14839
https://doi.org/10.1074/jbc.RA120.013528
69 J Song, F Li, K Takemoto, G Haffari, T Akutsu, K C Chou, G I Webb. PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural, and network features in a machine-learning framework. Journal of Theoretical Biology, 2018, 443: 125–137
https://doi.org/10.1016/j.jtbi.2018.01.023
70 W Torng, R B Altman. High precision protein functional site detection using 3D convolutional neural networks. Bioinformatics, 2019, 35(9): 1503–1512
https://doi.org/10.1093/bioinformatics/bty813
71 S Somarowthu, H Yang, D G Hildebrand, M J Ondrechen. High-performance prediction of functional residues in proteins with machine learning and computed input features. Biopolymers, 2011, 95(6): 390–400
https://doi.org/10.1002/bip.21589
72 G Li, K S Rabe, J Nielsen, M K Engqvist. Machine learning applied to predicting microorganism growth temperatures and enzyme catalytic optima. ACS Synthetic Biology, 2019, 8(6): 1411–1420
https://doi.org/10.1021/acssynbio.9b00099
73 S M Foroozandeh, K Farhadyar, K Kavousi, M H Azarabad, A Boroomand, S Ariaeenejad, S G Hosseini. A generalized machine-learning aided method for targeted identification of industrial enzymes from metagenome: a xylanase temperature dependence case study. Biotechnology and Bioengineering, 2021, 118(2): 759–769
https://doi.org/10.1002/bit.27608
74 F Li, L Yuan, H Lu, G Li, Y Chen, M K Engqvist, E J Kerkhoven, J Nielsen. Deep learning-based kcat prediction enables improved enzyme-constrained model reconstruction. Nature Catalysis, 2022, 5(8): 662–672
https://doi.org/10.1038/s41929-022-00798-z
75 W J Xie, M Asadi, A Warshel. Enhancing computational enzyme design by a maximum entropy strategy. Proceedings of the National Academy of Sciences of the United States of America, 2022, 119(7): e2122355119
https://doi.org/10.1073/pnas.2122355119
76 R Shroff, A W Cole, D J Diaz, B R Morrow, I Donnell, A Annapareddy, J Gollihar, A D Ellington, R Thyer. Discovery of novel gain-of-function mutations guided by structure-based deep learning. ACS Synthetic Biology, 2020, 9(11): 2927–2935
https://doi.org/10.1021/acssynbio.0c00345
77 R Ostafe, N Fontaine, D Frank, F C M Ng, R Prodanovic, R Pandjaitan, B Offmann, F Cadet, R Fischer. One-shot optimization of multiple enzyme parameters: tailoring glucose oxidase for pH and electron mediators. Biotechnology and Bioengineering, 2020, 117(1): 17–29
https://doi.org/10.1002/bit.27169
78 M H Høie, M Cagiada, A H B Frederiksen, A Stein, K Lindorff-Larsen. Predicting and interpreting large-scale mutagenesis data using analyses of protein stability and conservation. Cell Reports, 2022, 38(2): 110207
https://doi.org/10.1016/j.celrep.2021.110207
79 A K Chew, S Jiang, W Zhang, V M Zavala, R C Van Lehn. Fast predictions of liquid-phase acid-catalyzed reaction rates using molecular dynamics simulations and convolutional neural networks. Chemical Science, 2020, 11(46): 12464–12476
https://doi.org/10.1039/D0SC03261A
80 X Ran, Y Jiang, Q Shao, Z J Yang. EnzyKR: a chirality-aware deep learning model for predicting the outcomes of the hydrolase-catalyzed kinetic resolution. Chemical Science, 2023, 14(43): 12073–12082
81 S Wang, H Tang, Y Zhao, L Zuo. BayeStab: predicting effects of mutations on protein stability with uncertainty quantification. Protein Science, 2022, 31(11): e4467
https://doi.org/10.1002/pro.4467
82 S Iqbal, F Ge, F Li, T Akutsu, Y Zheng, R B Gasser, D J Yu, G I Webb, J Song. PROST: AlphaFold2-aware sequence-based predictor to estimate protein stability changes upon missense mutations. Journal of Chemical Information and Modeling, 2022, 62(17): 4270–4282
https://doi.org/10.1021/acs.jcim.2c00799
83 I M Hernández, Y Dehouck, U Bastolla, J R López-Blanco, P Chacón. Predicting protein stability changes upon mutation using a simple orientational potential. Bioinformatics, 2023, 39(1): btad011
https://doi.org/10.1093/bioinformatics/btad011
84 M A Pak, K A Markhieva, M S Novikova, D S Petrov, I S Vorobyev, E S Maksimova, F A Kondrashov, D N Ivankov. Using AlphaFold to predict the impact of single mutations on protein stability and function. PLoS One, 2023, 18(3): e0282689
https://doi.org/10.1371/journal.pone.0282689
85 J E Gado, G T Beckham, C M Payne. Improving enzyme optimum temperature prediction with resampling strategies and ensemble learning. Journal of Chemical Information and Modeling, 2020, 60(8): 4098–4107
https://doi.org/10.1021/acs.jcim.0c00489
86 Y Yang, L Zeng, M Vihinen. PON-Sol2: prediction of effects of variants on protein solubility. International Journal of Molecular Sciences, 2021, 22(15): 8027
https://doi.org/10.3390/ijms22158027
87 W P Russ, M Figliuzzi, C Stocker, P Barrat-Charlaix, M Socolich, P Kast, D Hilvert, R Monasson, S Cocco, M Weigt. An evolution-based model for designing chorismate mutase enzymes. Science, 2020, 369(6502): 440–445
https://doi.org/10.1126/science.aba3304
88 W S Mak, X Wang, R Arenas, Y Cui, S Bertolani, W Q Deng, I Tagkopoulos, D K Wilson, J B Siegel. Discovery, design, and structural characterization of alkane-producing enzymes across the ferritin-like superfamily. Biochemistry, 2020, 59(40): 3834–3843
https://doi.org/10.1021/acs.biochem.0c00665
89 J Dauparas, I Anishchenko, N Bennett, H Bai, R J Ragotte, L F Milles, B I Wicky, A Courbet, R J de Haas, N Bethel, et al. Robust deep learning-based protein sequence design using ProteinMPNN. Science, 2022, 378(6615): 49–56
https://doi.org/10.1126/science.add2187
90 J L Watson, D Juergens, N R Bennett, B L Trippe, J Yim, H E Eisenach, W Ahern, A J Borst, R J Ragotte, L F Milles, et al. De novo design of protein structure and function with RFdiffusion. Nature, 2023, 620(7976): 1089–1100
https://doi.org/10.1038/s41586-023-06415-8
91 K E Wu, K K Yang, R van den Berg, S Alamdari, J Y Zou, A X Lu, A P Amini. Protein structure generation via folding diffusion. Nature Communications, 2024, 15(1): 1059
https://doi.org/10.1038/s41467-024-45051-2
92 Z Zhang, M Xu, A Jamasb, V Chenthamarakshan, A Lozano, P Das, J Tang. Protein representation learning by geometric structure pretraining. Preprint arXiv: 2203.06125, 2022
93 J Zhang, H Wang, Z Luo, Z Yang, Z Zhang, P Wang, M Li, Y Zhang, Y Feng, D Lu, et al. Computational design of highly efficient thermostable MHET hydrolases and dual enzyme system for PET recycling. Communications Biology, 2023, 6(1): 1135
https://doi.org/10.1038/s42003-023-05523-5
94 A Xu, J Zhou, L M Blank, M Jiang. Future focuses of enzymatic plastic degradation. Trends in Microbiology, 2023, 31(7): 668–671
https://doi.org/10.1016/j.tim.2023.04.002
95 Y Zhang. A relay for improving the catalytic efficiency and thermostability of PET hydrolases. Chem Catalysis, 2022, 2(10): 2420–2422
https://doi.org/10.1016/j.checat.2022.09.029
96 J Schymkowitz, J Borg, F Stricher, R Nys, F Rousseau, L Serrano. The FoldX web server: an online force field. Nucleic Acids Research, 2005, 33: W382–W388
97 A Gupta, S Agrawal. Machine learning-based enzyme engineering of PETase for improved efficiency in plastic degradation. Journal of Emerging Investigators, 2023, 6: doi:10.59720/22-016
98 Y Ding, S Zhang, H Hess, X Kong, Y Zhang. Replicating enzymatic activity by positioning active sites with synthetic protein scaffolds. Preprint bioRxiv: 2024.01.31.577620, 2024