An Evaluation of beta-turn prediction methods
The beta-turn is an important element of protein structure. Over the past three decades, numerous beta-turn prediction methods have been developed using various strategies. At present it is difficult to say which method performs best, because the methods were developed on different data sets. It is therefore important to evaluate the performance of beta-turn prediction methods.
We evaluate the performance of six beta-turn prediction methods. The original parameters available in the literature are used to test the methods on a set of 426 non-homologous protein chains. The neural-network-based method BTPRED performs significantly better than the statistical methods. We also train, test and evaluate all methods except BTPRED and GORBTURN on a new data set using sevenfold cross-validation. The performance of all methods improves significantly when secondary structure information is incorporated. Both threshold-dependent and threshold-independent (ROC) measures are used for evaluation.
The representative protein chains are selected so that no two chains have more than 25% sequence identity. Only chains determined by X-ray crystallography at 2.0 Å resolution or better and containing at least one beta-turn are used in the analysis. The PDB codes of the 426 protein chains used are listed below. This is the same data set as described by K. Guruprasad.
_____________________________________________________________________________________________________________________________________________________________
119l | 153l | 1aliA | 1a1x | 1a28B | 1a2pA | 1a2yA | 1a2zA | 1a34A | 1a62 | 1a68 | 1a6q | 1a7tA | 1a8e | 1a8i | 1a9s | 1aac | 1aba |
1ad2 | 1adoA | 1af7 | 1afwA | 1agjA | 1agqD | 1ah7 | 1aho | 1aj2 | 1ajj | 1ajsA | 1ak0 | 1ak1 | 1ako | 1akz | 1al3 | 1alo | 1alu | 1alvA | 1aly |
1amm | 1amp | 1amuA | 1amx | 1anf | 1aocA | 1aohA | 1aol | 1aop | 1aoqA | 1aozA | 1apyB | 1aq0A | 1aq6A | 1aqb | 1aqzB | 1arb | 1arv | 1at0 | 1at1A |
1atzB | 1avmA | 1awd | 1awsA | 1axn | 1ay1 | 1azo | 1ba1 | 1bbpA | 1bdmB | 1bdo | 1bebA | 1benB | 1bfd | 1bfg | 1bftA | 1bgc | 1bgp | 1bkf | 1bkrA |
1brt | 1btkB | 1btn | 1bv1 | 1byb | 1c52 | 1cbn | 1cem | 1ceo | 1cewI | 1cex | 1cfb | 1chd | 1chmA | 1ckaA | 1c1c | 1cnv | 1cpcB | 1cpo | 1cseE |
1cseI | 1csh | 1csn | 1ctj | 1cydA | 1dad | 1dkzA | 1dokA | 1dorA | 1dosA | 1dun | 1dupA | 1dxy | 1eca | 1ecl | 1ecpA | 1ede | 1edg | 1edmB | 1edt |
1erv | 1ezm | 1fdr | 1fds | 1fit | 1fleI | 1fmtB | 1fna | 1fua | 1furA | 1fus | 1fvkA | 1fwcA | 1g3p | 1gai | 1garA | 1gd1O | 1gdoA | 1gifA | 1gky |
1gnd | 1gotB | 1gotG | 1gsa | 1guqA | 1gvp | 1ha1 | 1havA | 1hcrA | 1hfc | 1hgxA | 1hoe | 1hsbA | 1htrP | 1hxn | 1iakA | 1idaA | 1idk | 1ido | 1ifc |
1igd | 1iibA | 1iso | 1isuA | 1ixh | 1jdw | 1jer | 1jetA | 1jfrA | 1jpc | 1kid | 1knb | 1kpf | 1kptA | 1kuh | 1kveA | 1kveB | 1kvu | 1kwaB | 1lam |
1latB | 1lbu | 1lcl | 1lis | 1lit | 1lki | 1lkkA | 1lmb3 | 1lml | 1lt5D | 1ltsA | 1lucB | 1mai | 1mbd | 1mkaA | 1mldA | 1mml | 1molA | 1mpgA | 1mrj |
1mrp | 1msc | 1msi | 1msk | 1mtyB | 1mtyD | 1mtyG | 1mucA | 1mugA | 1mwe | 1mzm | 1nar | 1nbaB | 1nbcA | 1nciB | 1neu | 1nfn | 1nif | 1nls | 1nox |
1np1A | 1npk | 1nulB | 1nwpA | 1nxb | 1ois | 1onc | 1onrA | 1opd | 1opy | 1orc | 1ospO | 1ovaA | 1oyc | 1pcfA | 1pda | 1pdo | 1pgs | 1phe | 1phnA |
1php | 1pii | 1ple | 1pmi | 1pne | 1pnkB | 1poa | 1poc | 1pot | 1ppn | 1ppt | 1prxB | 1ptq | 1pty | 1pud | 1qba | 1qnf | 1r69 | 1ra9 | 1rcf |
1rec | 1regY | 1reqD | 1rgeA | 1rhs | 1rie | 1rmg | 1rro | 1rss | 1rsy | 1rvaA | 1ryp1 | 1ryp2 | 1rypF | 1rypI | 1rypJ | 1sbp | 1sfp | 1sftB | 1sgpI |
1skz | 1sltA | 1sluA | 1smd | 1spuA | 1sra | 1stmA | 1svb | 1svpA | 1tadC | 1tca | 1tfe | 1thv | 1thx | 1tib | 1tif | 1tml | 1trkA | 1tsp | 1tvxA |
1tys | 1uae | 1ubi | 1uch | 1unkA | 1urnA | 1uxy | 1v39 | 1vcaA | 1vcc | 1vhh | 1vid | 1vif | 1vin | 1vjs | 1vls | 1vpsA | 1vsd | 1vwlB | 1wab |
1wba | 1wdcA | 1wer | 1whi | 1who | 1whtB | 1wpoB | 1xgsA | 1xikA | 1xjo | 1xnb | 1xsoA | 1xyzA | 1yaiC | 1yasA | 1ycc | 1yer | 1ytbA | 1yveI | 1zin |
256bA | 2a0b | 2abk | 2acy | 2arcA | 2ayh | 2baa | 2bbkH | 2bbkL | 2bopA | 2cba | 2ccyA | 2chsA | 2ctc | 2cyp | 2dri | 2end | 2eng | 2erl | 2fdn |
2fha | 2fivA | 2gdm | 2hbg | 2hft | 2hmzA | 2hpdA | 2hts | 2ilb | 2ilk | 2kinA | 2kinB | 2lbd | 2mcm | 2msbB | 2nacA | 2pgd | 2phy | 2pia | 2pii |
2plc | 2por | 2pspA | 2pth | 2rn2 | 2rspB | 2sak | 2scpA | 2sicI | 2sil | 2sn3 | 2sns | 2tgi | 2tysA | 2vhbB | 2wea | 3b5c | 3chy | 3cla | 3cox |
3cyr | 3daaA | 3grs | 3lzt | 3nul | 3pcgM | 3pte | 3sdhA | 3seb | 3tss | 3vub | 4bcl | 4mt2 | 4pgaA | 4xis | 5csmA | 5hpgA | 5icb | 5p2I | 5pti |
5ptp | 6cel | 6gsvA | 7ahlA | 7rsa | 8abp | 8rucI | 8rxnA |
___________________________________________________________________________________________________________________________________________________________________
Chou-Fasman algorithm |
Thornton's algorithm |
1-4 & 2-3 Correlation Model |
Sequence Coupled Model |
GORBTURN |
BTPRED |
ORIGINAL PARAMETERS AND THRESHOLDS
Original conformational parameters and positional frequencies for helix, β-sheet and β-turn residues.
Original Threshold = 0.000075
Original conformational parameters and positional frequencies:
Original Threshold for Type I turn = 4.0
Original Threshold for Type II turn = 2.7
Original probabilities and conditional probabilities:
Original Threshold = 0.1875
Original probabilities and conditional probabilities for turns:
Original probabilities and conditional probabilities for non-turns:
Original Threshold = 0
Original positional frequencies for Type I turns
Original positional frequencies for Type II turns
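To illustrate how a threshold such as 0.000075 is applied, the Chou-Fasman turn rule, as it is commonly described, multiplies the positional frequencies of the four residues in a window and compares the product with the threshold. The sketch below implements only this product test; the additional comparison of average conformational parameters for helix, sheet and turn is omitted. The function name and the values in `toy_freq` are hypothetical placeholders, not the original parameters.

```python
THRESHOLD = 0.000075  # original Chou-Fasman threshold quoted above


def predict_turn_windows(sequence, pos_freq, threshold=THRESHOLD):
    """Return start indices of 4-residue windows whose frequency product exceeds threshold.

    pos_freq maps each amino acid to a tuple of four positional
    frequencies, one for each turn position i, i+1, i+2 and i+3.
    """
    starts = []
    for start in range(len(sequence) - 3):
        window = sequence[start:start + 4]
        product = 1.0
        for position, residue in enumerate(window):
            product *= pos_freq[residue][position]
        if product > threshold:
            starts.append(start)
    return starts


# Placeholder frequencies for illustration only (NOT the published values).
toy_freq = {
    "N": (0.16, 0.08, 0.19, 0.09),
    "P": (0.10, 0.30, 0.03, 0.07),
    "G": (0.10, 0.08, 0.19, 0.15),
    "A": (0.06, 0.08, 0.03, 0.06),
}

print(predict_turn_windows("AAAANPGG", toy_freq))
```

With real parameters, each reported start index marks a candidate turn covering residues i to i+3.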
New Threshold: A new threshold value is chosen at which the sensitivity and specificity are nearly equal.
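One way to pick such a threshold is to scan candidate values and keep the one that minimises the gap between sensitivity and specificity. The sketch below assumes that each residue has a numeric score, that a residue is predicted as a turn when its score exceeds the threshold, and that the candidate thresholds are the observed scores; none of these details are stated in the source, and the function names are hypothetical.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Compute (sensitivity, specificity) when score > threshold means 'turn'."""
    tp = fn = tn = fp = 0
    for score, is_turn in zip(scores, labels):
        predicted_turn = score > threshold
        if is_turn and predicted_turn:
            tp += 1
        elif is_turn:
            fn += 1
        elif predicted_turn:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


def balanced_threshold(scores, labels, candidates=None):
    """Return the candidate threshold with the smallest |sensitivity - specificity|."""
    if candidates is None:
        candidates = sorted(set(scores))  # assumption: try every observed score

    def gap(threshold):
        sens, spec = sensitivity_specificity(scores, labels, threshold)
        return abs(sens - spec)

    return min(candidates, key=gap)


# Toy scores and labels (1 = turn residue, 0 = non-turn residue).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(balanced_threshold(scores, labels))
```

If several candidates tie, `min` returns the lowest one, which is only one of several reasonable tie-breaking choices.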
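The following four parameters were used to measure the performance of the prediction methods. Here p is the number of correctly predicted beta-turn residues, n the number of correctly predicted non-turn residues, o the number of non-turn residues incorrectly predicted as turns (over-predictions), u the number of turn residues incorrectly predicted as non-turns (under-predictions), and t = p + n + o + u the total number of residues.
Qtotal, the percentage of correctly predicted residues, defined as $Q_{total} = \frac{p + n}{t} \times 100$
MCC, the Matthews Correlation Coefficient, defined as $MCC = \frac{pn - ou}{\sqrt{(p + o)(p + u)(n + o)(n + u)}}$
Qpredicted, the probability of correct prediction, defined as $Q_{predicted} = \frac{p}{p + o} \times 100$
Qobserved, the percent coverage, defined as $Q_{observed} = \frac{p}{p + u} \times 100$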
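As a minimal sketch, the four measures can be computed from residue-level confusion counts. The code assumes the standard definitions: p is the number of correctly predicted turn residues, n of correctly predicted non-turn residues, o of over-predicted residues (false positives) and u of under-predicted residues (false negatives). The function name and the zero-denominator handling are illustrative choices.

```python
from math import sqrt


def turn_metrics(p, n, o, u):
    """Compute Qtotal, Qpredicted, Qobserved (in percent) and MCC from confusion counts."""
    total = p + n + o + u
    q_total = 100.0 * (p + n) / total
    # Guard against empty denominators (e.g. no residue predicted as turn).
    q_predicted = 100.0 * p / (p + o) if p + o else 0.0
    q_observed = 100.0 * p / (p + u) if p + u else 0.0
    denominator = sqrt((p + o) * (p + u) * (n + o) * (n + u))
    mcc = (p * n - o * u) / denominator if denominator else 0.0
    return {
        "Qtotal": q_total,
        "Qpredicted": q_predicted,
        "Qobserved": q_observed,
        "MCC": mcc,
    }


# Toy counts for illustration only.
print(turn_metrics(p=40, n=40, o=10, u=10))
```

Because beta-turn residues are a minority class, Qtotal alone can look good for a method that predicts few turns, which is why MCC is reported alongside it.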
Table 1 Results of 7-fold cross-validation at original thresholds.
Methods | Qtotal | Qpredicted | Qobserved | MCC |
Chou-Fasman algorithm | 74.9 (65.2) | 46.1 (37.6) | 16.9 (63.5) | 0.16 (0.26) |
Thornton's algorithm | 74.5 (68.0) | 44.0 (38.6) | 16.7 (52.4) | 0.15 (0.23) |
1-4 & 2-3 Correlation Model | 63.2 (59.1) | 35.3 (32.4) | 60.4 (61.9) | 0.21 (0.17) |
Sequence Coupled Model | 50.6 (53.3) | 31.7 (32.4) | 88.4 (72.8) | 0.23 (0.17) |
*Values in brackets were calculated using the original parameters of the methods.
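The sevenfold cross-validation behind Table 1 can be sketched as follows. The source does not say how the data were partitioned, so this example assumes a random split at the level of whole protein chains, with each fold used once for testing while the other six are used for training. The function names and the seed are hypothetical.

```python
import random


def make_folds(chain_ids, k=7, seed=0):
    """Shuffle chain identifiers and deal them into k nearly equal folds."""
    ids = list(chain_ids)
    random.Random(seed).shuffle(ids)  # fixed seed so the split is reproducible
    return [ids[i::k] for i in range(k)]


def cross_validation_splits(chain_ids, k=7, seed=0):
    """Yield (train, test) lists; each fold serves as the test set exactly once."""
    folds = make_folds(chain_ids, k, seed)
    for test_index, test in enumerate(folds):
        train = [
            chain
            for fold_index, fold in enumerate(folds)
            if fold_index != test_index
            for chain in fold
        ]
        yield train, test


# Placeholder identifiers standing in for the 426 chains.
chains = ["chain%03d" % i for i in range(426)]
for train, test in cross_validation_splits(chains):
    print(len(train), len(test))
```

Splitting by chain rather than by residue keeps residues from the same protein out of both the training and test sets of a given round.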
The results of cross-validation are shown in terms of (a) Qtotal, (b) Qpredicted, (c) Qobserved and (d) MCC.
Effect of different secondary structure methods on performance of BTPRED
Table 2 ROC values for different methods without cross-validation.
Methods | ROC (without cross-validation) |
Chou-Fasman | 0.69 |
Thornton's algorithm | 0.66 |
1-4 & 2-3 Correlation Model | 0.64 |
Sequence Coupled Model | 0.64 |
Table 3 ROC values for different methods with cross-validation.
Methods | ROC (with cross-validation) |
Chou-Fasman | 0.59 |
Thornton's algorithm | 0.57 |
1-4 & 2-3 Correlation Model | 0.67 |
Sequence Coupled Model | 0.70 |
ROC with and without cross-validation
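Assuming the ROC values reported above are areas under the ROC curve, a threshold-independent score of this kind can be sketched with the rank-based (Mann-Whitney) formulation: the fraction of turn/non-turn residue pairs in which the turn residue receives the higher score, with ties counted as one half. The function name is hypothetical and the source does not state how its ROC values were computed.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ranked (turn, non-turn) pairs."""
    positives = [s for s, is_turn in zip(scores, labels) if is_turn]
    negatives = [s for s, is_turn in zip(scores, labels) if not is_turn]
    if not positives or not negatives:
        raise ValueError("need at least one turn and one non-turn residue")
    wins = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                wins += 1.0
            elif pos_score == neg_score:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Toy scores and labels (1 = turn residue, 0 = non-turn residue).
print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of residues, so a sort-based variant would be preferable for a data set of this size.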
Chou,P.Y. and Fasman,G.D. (1974) Conformational parameters for amino acids in helical, beta-sheet and random coil regions calculated from proteins. Biochemistry, 13, 211-222.
Chou,P.Y. and Fasman,G.D. (1979) Prediction of beta-turns. Biophys. J., 26, 367-384.
Chou,K.C. (1997) Prediction of beta-turns. J. Pept. Res., 49, 120-144.
Chou,K.C. and Blinn,J.R. (1997) Classification and prediction of beta-turn types. J. Protein Chem., 16, 575-595.
Chou,K.C. (2000) Prediction of tight turns and their types in proteins. Analytical Biochem., 286, 1-16.
Cohen,F.E., Abarbanel,R.M., Kuntz,I.D. and Fletterick,R.J. (1986) Turn prediction in proteins using a pattern-matching approach. Biochemistry, 25, 266-275.
Deleo,J.M. (1993) Proceedings of the Second International Symposium on Uncertainty Modelling and Analysis, pp. 318-325. IEEE Computer Society Press, College Park, MD.
Garnier,J., Osguthorpe,D.J. and Robson,B. (1978) Analysis and implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol., 120, 97-120.
Gibrat,J.-F., Garnier,J. and Robson,B. (1987) J. Mol. Biol., 198, 425-433.
Guruprasad,K. and Rajkumar,S. (2000) Beta- and gamma-turns in proteins revisited: A new set of amino acid dependent positional preferences and potential. J. Biosci., 25(2), 143-156.
Hutchinson,G. and Thornton,J.M. (1996) PROMOTIF - a program to identify and analyze structural motifs in proteins. Protein Sci., 5, 212-220.
Hutchinson,G. and Thornton,J.M. (1994) Revised set of potentials for beta-turn formation in proteins. Protein Sci., 3, 2207-2216.
Jones,D.T. (1999) Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol., 292(2), 195-202.
Kabsch,W. and Sander,C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577-2637.
King,R.D. and Sternberg,M.J.E. (1996) Identification and application of the concepts important for accurate and reliable protein secondary structure prediction. Protein Sci., 5(11), 2298-2310.
Matthews,B.W. (1975) Biochim. Biophys. Acta, 405, 442-451.
McGregor,M.J., Flores,T.P. and Sternberg,M.J. (1989) Prediction of beta-turns in proteins using neural networks. Protein Eng., 2, 521-526.
McGuffin,L.J., Bryson,K. and Jones,D.T. (2000) The PSIPRED protein structure prediction server. Bioinformatics, 16(4), 404-405.
Ouali,M. and King,R.D. (2000) Cascaded multiple classifiers for secondary structure prediction. Protein Sci., 9, 1162-1176.
Rost,B. and Sander,C. (1993) Improved prediction of protein secondary structure by use of sequence profiles and neural networks. Proc. Natl. Acad. Sci. U.S.A., 90, 7558-7562.
Rost,B. and Sander,C. (1994) Combining evolutionary information and neural networks to predict protein secondary structure. Proteins, 19, 55-72.
Shepherd,A.J., Gorse,D. and Thornton,J.M. (1999) Prediction of the location and type of beta-turns in proteins using neural networks. Protein Sci., 8, 1045-1055.
Wilmot,C.M. and Thornton,J.M. (1988) Analysis and prediction of the different types of beta-turns in proteins. J. Mol. Biol., 203, 221-232.
Wilmot,C.M. and Thornton,J.M. (1990) Beta-turns and their distortion: a proposed new nomenclature. Protein Eng., 3(6), 479-493.