Quantitative Biology

ISSN 2095-4689

ISSN 2095-4697(Online)

CN 10-1028/TM

Postal Subscription Code 80-971

Quant. Biol. 2018, Vol. 6 Issue (2): 163-174. https://doi.org/10.1007/s40484-018-0138-5
RESEARCH ARTICLE
Geometric and amino acid type determinants for protein-protein interaction interfaces
Yongxiao Yang1, Wei Wang2, Yuan Lou1, Jianxin Yin2, Xinqi Gong1
1. Institute for Mathematical Sciences, Renmin University of China, Beijing 100872, China
2. School of Statistics, Renmin University of China, Beijing 100872, China
Abstract

Background: Protein-protein interactions are essential to many biological processes. Binding site information for protein-protein complexes is extremely useful for obtaining their structures from biochemical experiments. A geometric description of protein structures is a prerequisite for protein binding site prediction and protein-protein interaction analysis. Previous descriptions of protein surface residues are incomplete, and little attention has been paid to the implications of residue type for binding site prediction.

Methods: Here, we identified three new geometric features that characterize protein surface residues and are very effective for protein-protein interface residue prediction. The new features and several commonly used descriptors were employed to train millions of residue type-nonspecific or type-specific protein binding site predictors.

Results: The amino acid type-specific predictors are superior to the models without distinction of amino acid types. The performances of the best predictors are much better than those of the sophisticated methods developed before.

Conclusions: The results demonstrate that geometric properties and amino acid types are very likely to determine whether a protein surface residue becomes an interface residue when the protein binds to its partner.

Keywords: protein-protein interaction; protein-protein complex interface; geometry feature; residue type; binding site
Corresponding Author(s): Xinqi Gong   
Online First Date: 23 May 2018    Issue Date: 11 June 2018
 Cite this article:   
Yongxiao Yang, Wei Wang, Yuan Lou, et al. Geometric and amino acid type determinants for protein-protein interaction interfaces[J]. Quant. Biol., 2018, 6(2): 163-174.
 URL:  
https://academic.hep.com.cn/qb/EN/10.1007/s40484-018-0138-5
https://academic.hep.com.cn/qb/EN/Y2018/V6/I2/163
Fig.1  Comparison of the performances of the geometric features in descending and ascending orders with AUC. The results of absEA, relEA, EC, EV and IC are shown in (A), (B), (C), (D) and (E) respectively. DO represents descending order and AO represents ascending order. They are shown in red and blue respectively. The dots are the outliers.
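The descending-order (DO) versus ascending-order (AO) comparison can be illustrated with a minimal sketch. It assumes that a single geometric feature is used directly as a ranking score and that AUC is computed in the usual rank-based (Mann-Whitney) way; the function name and the toy values are hypothetical and are not taken from the paper.

```python
def auc(scores, labels):
    """Rank-based AUC: the probability that a randomly chosen interface
    residue (label 1) has a higher score than a randomly chosen
    non-interface residue (label 0). Ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy feature values for five surface residues (hypothetical numbers).
feature = [12.0, 3.5, 8.1, 1.2, 9.9]
is_interface = [1, 0, 1, 0, 0]

# Descending order: a larger feature value ranks a residue higher.
auc_do = auc(feature, is_interface)
# Ascending order: a smaller feature value ranks a residue higher,
# which is the same as scoring with the negated feature.
auc_ao = auc([-v for v in feature], is_interface)
```

Under these assumptions the two AUC values of one feature sum to 1, so only one ordering of a given feature can perform better than random.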
Fig.2  Comparison of the performances of the geometric features in descending and ascending orders with rank of the first binding site (RFBS).
Fig.3  Comparison of the performances of the geometric features. The results shown are those of absEA in descending order, relEA in descending order, EC in ascending order, EV in ascending order and IC in descending order. The curves of absEA, relEA, EC, EV and IC are shown in blue, red, orange, purple and green respectively. The horizontal coordinate is the number of retained surface residues for any protein monomer. The vertical coordinate is the percentage of positive monomers for which there exists at least one true binding site among the retained surface residues.
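The two evaluation quantities used in Fig.2 and Fig.3 can be sketched as follows. This is a minimal illustration, assuming that surface residues of each monomer are ranked by a single feature and that a "positive monomer" is one with at least one true binding site among its surface residues; the function names and the toy data are hypothetical.

```python
def rank_of_first_binding_site(values, labels, descending=True):
    """1-based rank of the first true binding site (label 1) after sorting
    the surface residues by the feature value; None if there is none."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=descending)
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            return rank
    return None


def percent_monomers_hit(monomers, n_retained, descending=True):
    """Percentage of positive monomers with at least one true binding site
    among the top n_retained ranked surface residues."""
    ranks = [rank_of_first_binding_site(v, y, descending)
             for v, y in monomers]
    ranks = [r for r in ranks if r is not None]  # keep positive monomers
    if not ranks:
        raise ValueError("no positive monomers")
    hits = sum(r <= n_retained for r in ranks)
    return 100.0 * hits / len(ranks)


# Toy data: (feature values, interface labels) per monomer.
monomers = [
    ([5.0, 2.0, 9.0], [0, 1, 0]),  # first binding site at rank 3
    ([7.0, 1.0, 4.0], [1, 0, 0]),  # first binding site at rank 1
    ([3.0, 8.0, 6.0], [0, 0, 1]),  # first binding site at rank 2
    ([1.0, 2.0], [0, 0]),          # no binding site: not a positive monomer
]
curve = [percent_monomers_hit(monomers, n) for n in (1, 2, 3)]
```

Evaluating `percent_monomers_hit` over increasing numbers of retained residues gives a curve of the kind plotted in Fig.3; passing `descending=False` corresponds to the ascending-order ranking.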
NRSR MPPM (%) FC
1 52.6 absEA, relEA, EC, IC, pKa2
2 69.6 relEA, EC, EV, IC, H1
3 79.4 absEA, relEA, EC, EV, IC, H1
4 87.2 EC, IC, H2, pKa1
5 92.0 absEA, IC, H1, pKa1, pKa2
6 95.2 absEA, EV, IC, pKa2
7 97.2 EC, IC, pKa1
8 97.8 EV, IC, H1, H2, pKa1, pKa2
9 99.0 relEA, EC, EV, IC, H1, pKa1, pKa2
10 100 EV, IC, H1, H2, pKa1
10 100 relEA, EC, IC, H1
10 100 absEA, EV, IC, pKa2
10 100 absEA, relEA, EC, IC, H2, pKa2
10 100 absEA, relEA, EC, EV, IC, H1, pKa1
Tab.1  The best performance of models without distinction of residue types
AAT NM FC MNRSR
TRP 1302000 absEA, relEA, EC, IC 1
MET 1302000 relEA, IC 2
CYS 2646000 absEA, relEA, EV, IC, pKa1 2
HIS 2646000 absEA, relEA, EC, IC 3
PHE 1302000 relEA, EC, EV, IC 5
TYR 2646000 relEA, EC, IC, pKa1 5
ASN 1302000 absEA, EV, IC 5
GLN 1302000 relEA, EC 5
GLY 1302000 relEA, EC, IC 6
LEU 1302000 absEA, EC, EV, IC 6
PRO 1302000 absEA, EC, EV, IC 6
ARG 2646000 absEA, relEA, EC, IC, pKa1 6
ALA 1302000 absEA, relEA, EC, EV, IC 7
VAL 1302000 relEA, EC, IC 7
GLU 2646000 absEA, EC, EV 7
THR 1302000 absEA, EC, IC 7
ASP 2646000 relEA, EC, EV, IC 8
SER 1302000 absEA, EC, EV, IC 8
ILE 1302000 relEA, EC, EV, IC 2
LYS 2646000 absEA, EC, pKa1 9
Tab.2  The best performance of residue type-specific predictors
AAT MV SD
TRP 0.740691 0.026848
MET 0.709371 0.000889
LEU 0.691242 0.020786
PHE 0.681853 0.004221
CYS 0.673883 0.001095
VAL 0.655965 0.020019
ARG 0.643709 0.010168
HIS 0.623985 0.006025
ILE 0.619012 0.024085
GLY 0.617001 0.021070
TYR 0.605156 0.030080
PRO 0.603635 0.049412
GLN 0.598281 0.003730
LYS 0.591322 0.091578
SER 0.585913 0.002099
ASN 0.576256 0.020634
GLU 0.569275 0.087888
THR 0.558572 0.060780
ALA 0.557027 0.048983
ASP 0.538908 0.046054
Tab.3  Comparison of residue type-specific predictors with AUC
NRSR MPPM (%)
1 33.9
2 47.8
3 78.4
4 86.7
5 91.6
6 94.1
7 97.5
8 99.2
9 100
Tab.4  The best performances of the integrated amino acid-specific predictors
Fig.4  Comparison of the performances of the best integrated amino acid-specific predictor (IAASP) with other methods. The curves of meta-PPISP, VORRIF, PredUs, IC and IAASP are shown in blue, red, orange, purple and green respectively. The horizontal coordinate is the number of retained residues for any protein monomer. The vertical coordinate is the percentage of positive monomers for which there exists at least one true binding site among the retained residues.
Fig.5  An example predicted by our method (PDB code: 4H03). The colors of the receptor and ligand are green and cyan respectively. The first true interface residues among the retained residues are shown in sphere model.
Fig.6  Procedure for constructing datasets. First, high-quality protein dimers in the protein-protein docking benchmark version 5.0 [18] are selected to obtain the interface and surface residues. Then, the dataset composed of interface and surface residues is divided into amino acid type-specific datasets.
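The second step of this procedure, dividing the residues into amino acid type-specific datasets, can be sketched as below. The record layout (keys `aa`, `is_interface`, `features`) and the toy values are assumptions made for illustration; the paper does not specify a data format.

```python
from collections import defaultdict


def split_by_residue_type(residues):
    """Group surface-residue records by their three-letter amino acid code.

    Each record is a dict with the hypothetical keys:
      'aa'           three-letter residue type, e.g. 'TRP'
      'is_interface' 1 if the residue is an interface residue, else 0
      'features'     list of feature values for the residue
    """
    by_type = defaultdict(list)
    for res in residues:
        by_type[res["aa"]].append(res)
    return dict(by_type)


# Toy records (hypothetical values).
residues = [
    {"aa": "TRP", "is_interface": 1, "features": [35.2, 0.41]},
    {"aa": "LYS", "is_interface": 0, "features": [80.7, 0.65]},
    {"aa": "TRP", "is_interface": 0, "features": [12.9, 0.15]},
]
datasets = split_by_residue_type(residues)
```

Each value in `datasets` can then serve as the training set for one residue type-specific predictor, while the unsplit list corresponds to the type-nonspecific case.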
Fig.7  Schematic diagram of the geometric features. (A) 2D schematic of the geometric features. The absolute exterior solvent accessible area (absEA), exterior contact area with other residues (EC), exterior void area (EV) and interior contact area (IC) of a surface residue are shown in black, red, brown and purple respectively. (B) The absolute exterior solvent accessible area of a surface residue. The absEA is displayed as a yellow mesh. (C) EC and IC of a surface residue. EC and IC are colored red and purple respectively. (D) EV of a surface residue. EV is colored brown.
1 Gao, M. and Skolnick, J. (2010) Structural space of protein-protein interfaces is degenerate, close to complete, and highly connected. Proc. Natl. Acad. Sci. USA, 107, 22517–22522
https://doi.org/10.1073/pnas.1012820107 pmid: 21149688
2 Chothia, C. and Janin, J. (1975) Principles of protein-protein recognition. Nature, 256, 705–708
https://doi.org/10.1038/256705a0 pmid: 1153006
3 Jones, S. and Thornton, J. M. (1996) Principles of protein-protein interactions. Proc. Natl. Acad. Sci. USA, 93, 13–20
https://doi.org/10.1073/pnas.93.1.13 pmid: 8552589
4 Keskin, O., Gursoy, A., Ma, B. and Nussinov, R. (2008) Principles of protein-protein interactions: what are the preferred ways for proteins to interact? Chem. Rev., 108, 1225–1244
https://doi.org/10.1021/cr040409x pmid: 18355092
5 Koshland, D. E. (1995) The key-lock theory and the induced fit theory. Angew. Chem. Int. Ed., 33, 2375–2378
https://doi.org/10.1002/anie.199423751
6 Teichmann, S. A. (2002) Principles of protein-protein interactions. Bioinformatics, 18, S249
https://doi.org/10.1093/bioinformatics/18.suppl_2.S249 pmid: 12386009
7 Zhang, Q. C., Petrey, D., Norel, R. and Honig, B. H. (2010) Protein interface conservation across structure space. Proc. Natl. Acad. Sci. USA, 107, 10896–10901
https://doi.org/10.1073/pnas.1005894107 pmid: 20534496
8 Aumentado-Armstrong, T. T., Istrate, B. and Murgita, R. A. (2015) Algorithmic approaches to protein-protein interaction site prediction. Algorithms Mol. Biol., 10, 7
https://doi.org/10.1186/s13015-015-0033-9 pmid: 25713596
9 Esmaielbeiki, R., Krawczyk, K., Knapp, B., Nebel, J. C. and Deane, C. M. (2016) Progress and challenges in predicting protein interfaces. Brief. Bioinform., 17, 117–131
pmid: 25971595
10 Maheshwari, S. and Brylinski, M. (2015) Predicting protein interface residues using easily accessible on-line resources. Brief. Bioinform., 16, 1025–1034
https://doi.org/10.1093/bib/bbv009 pmid: 25797794
11 Xue, L. C., Dobbs, D., Bonvin, A. M. and Honavar, V. (2015) Computational prediction of protein interfaces: a review of data driven methods. FEBS Lett., 589, 3516–3526
https://doi.org/10.1016/j.febslet.2015.10.003 pmid: 26460190
12 Pintar, A., Carugo, O. and Pongor, S. (2002) CX, an algorithm that identifies protruding atoms in proteins. Bioinformatics, 18, 980–984
https://doi.org/10.1093/bioinformatics/18.7.980 pmid: 12117796
13 de Moraes, F. R., Neshich, I. A., Mazoni, I., Yano, I. H., Pereira, J. G., Salim, J. A., Jardine, J. G. and Neshich, G. (2014) Improving predictions of protein-protein interfaces by combining amino acid-specific classifiers based on structural and physicochemical descriptors with their weighted neighbor averages. PLoS One, 9, e87107
https://doi.org/10.1371/journal.pone.0087107 pmid: 24489849
14 Qin, S. and Zhou, H. X. (2007) meta-PPISP: a meta web server for protein-protein interaction site prediction. Bioinformatics, 23, 3386–3387
https://doi.org/10.1093/bioinformatics/btm434 pmid: 17895276
15 Segura, J., Jones, P. F. and Fernandez-Fuentes, N. (2011) Improving the prediction of protein binding sites by combining heterogeneous data and Voronoi diagrams. BMC Bioinformatics, 12, 352
https://doi.org/10.1186/1471-2105-12-352 pmid: 21861881
16 Zhang, Q. C., Deng, L., Fisher, M., Guan, J., Honig, B. and Petrey, D. (2011) PredUs: a web server for predicting protein interfaces using structural neighbors. Nucleic Acids Res., 39, W283–W287
https://doi.org/10.1093/nar/gkr311 pmid: 21609948
17 Wang, L., Wang, Y. and Chang, Q. (2016) Feature selection methods for big data bioinformatics: a survey from the search perspective. Methods, 111, 21–31
https://doi.org/10.1016/j.ymeth.2016.08.014 pmid: 27592382
18 Vreven, T., Moal, I. H., Vangone, A., Pierce, B. G., Kastritis, P. L., Torchala, M., Chaleil, R., Jimenez-Garcia, B., Bates, P. A., Fernandez-Recio, J., Bonvin, A. M. and Weng, Z. (2015) Updates to the integrated protein-protein interaction benchmarks: docking benchmark version 5 and affinity benchmark version 2. J. Mol. Biol., 427, 3031–3041
https://www.sciencedirect.com/science/article/pii/S0022283615004180
19 Hwang, H., Vreven, T., Janin, J. and Weng, Z. (2010) Protein-protein docking benchmark version 4.0. Proteins, 78, 3111–3114
https://doi.org/10.1002/prot.22830 pmid: 20806234
20 Hwang, H., Pierce, B., Mintseris, J., Janin, J. and Weng, Z. (2008) Protein-protein docking benchmark version 3.0. Proteins, 73, 705–709
https://doi.org/10.1002/prot.22106 pmid: 18491384
21 Hubbard, S. J. and Thornton, J. M. (1993) Naccess Version 2.1.1. Department of Biochemistry and Molecular Biology, University College, London
22 Fischer, T. B., Holmes, J. B., Miller, I. R., Parsons, J. R., Tung, L., Hu, J. C. and Tsai, J. (2006) Assessing methods for identifying pair-wise atomic contacts across binding interfaces. J. Struct. Biol., 153, 103–112
https://doi.org/10.1016/j.jsb.2005.11.005 pmid: 16377205
23 Eisenberg, D. (1984) Three-dimensional structure of membrane and surface proteins. Annu. Rev. Biochem., 53, 595–623
https://doi.org/10.1146/annurev.bi.53.070184.003115 pmid: 6383201
24 Kyte, J. and Doolittle, R. F. (1982) A simple method for displaying the hydropathic character of a protein. J. Mol. Biol., 157, 105–132
https://doi.org/10.1016/0022-2836(82)90515-0 pmid: 7108955
25 Olsson, M. H., Søndergaard, C. R., Rostkowski, M. and Jensen, J. H. (2011) PROPKA3: consistent treatment of internal and surface residues in empirical pKa predictions. J. Chem. Theory Comput., 7, 525–537
https://doi.org/10.1021/ct100578z pmid: 26596171
26 Møller, M. F. (1993) A scaled conjugate gradient algorithm for fast supervised learning. Neural Netw., 6, 525–533
https://doi.org/10.1016/S0893-6080(05)80056-5
27 Kishore, R. and Kaur, M. T. (2012) Backpropagation algorithm: an artificial neural network approach for pattern recognition. Int. J. Sci. Eng. Res., 3, 1–4
28 Rumelhart, D. E., Hinton, G. E. and Williams, R. J. (1986) Learning representations by back-propagating errors. Nature, 323, 533–536
https://doi.org/10.1038/323533a0