

Datasets for evaluating beta turn prediction methods

The dataset consists of 426 non-homologous protein chains, as described by Guruprasad and Rajkumar (2000). No two protein chains in this dataset share more than 25% sequence identity. The structures of these proteins were determined by X-ray crystallography at 2.0 Å resolution or better. Each chain contains at least one beta turn. Beta turns were assigned using the PROMOTIF program. For PDB codes of these proteins, click here.

The following table shows the distribution of the 426 protein chains across SCOP classes. The largest group is the 'alpha/beta' class (102 chains), followed by the 'all beta' (97) and 'alpha+beta' (86) classes.

SCOP class                                   Number of chains
Alpha proteins (class a)                     68
Beta proteins (class b)                      97
Alpha/beta proteins (class c)                102
Alpha+beta proteins (class d)                86
Multi-domain proteins (class e)              9
Small proteins (class f)                     2
Coiled coil proteins (class g)               22
Low resolution proteins (class h)            0
Peptides (class i)                           0
Designed proteins (class j)                  1
Proteins having both a and b classes         3
Proteins having both a and c classes         7
Proteins having both a and d classes         5
Proteins having both b and c classes         6
Proteins having both b and d classes         4
Proteins having both b and f classes         1
Proteins having both c and d classes         10
Proteins having both c and g classes         1
Proteins having b, c and d classes           2
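As a quick consistency check, the counts in the table can be summed to confirm that they account for all 426 chains and that the three largest single-class groups are c, b and d. This is only an illustrative sketch: the dictionary names are hypothetical and the numbers are copied from the table above.

```python
# Counts copied from the SCOP class table; variable names are illustrative only.
single_class = {
    "a": 68, "b": 97, "c": 102, "d": 86, "e": 9,
    "f": 2, "g": 22, "h": 0, "i": 0, "j": 1,
}
# Chains assigned to more than one SCOP class.
multi_class = {
    ("a", "b"): 3, ("a", "c"): 7, ("a", "d"): 5,
    ("b", "c"): 6, ("b", "d"): 4, ("b", "f"): 1,
    ("c", "d"): 10, ("c", "g"): 1, ("b", "c", "d"): 2,
}

total = sum(single_class.values()) + sum(multi_class.values())
# Rank single-class groups from largest to smallest and keep the top three.
top_three = sorted(single_class, key=single_class.get, reverse=True)[:3]

print(total)      # 426
print(top_three)  # ['c', 'b', 'd']
```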

Evaluation can be done on a single protein, on multiple proteins, or on whole datasets. Two datasets are provided: Complete and Subsets. The complete dataset contains the amino acid sequences of all 426 proteins in FASTA format, together with the assigned turns and nonturns. The complete dataset is further divided into seven subsets (Sets I-VII) of nearly equal size: 61 proteins each, except Set VI, which has 60. Each subset contains the amino acid sequences in FASTA format along with the assigned turns/nonturns.
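To work with these files, a sequence file has to be read and matched, residue by residue, against the turn/nonturn assignments. The sketch below shows one way to do this in plain Python. It rests on assumptions that are not stated on this page: that the assignment file is also FASTA-like, with one label character per residue ('t' for turn, 'n' for nonturn), and that both files use the same header for a given chain. The functions `parse_fasta` and `pair_with_labels` and the toy record `demo_chain` are hypothetical and are not part of the dataset.

```python
def parse_fasta(text):
    """Return {header: sequence} from FASTA-formatted text."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


def pair_with_labels(sequences, labels):
    """Pair each residue with its label, checking that lengths agree."""
    paired = {}
    for name, seq in sequences.items():
        lab = labels[name]
        if len(lab) != len(seq):
            raise ValueError(f"{name}: {len(seq)} residues but {len(lab)} labels")
        paired[name] = list(zip(seq, lab))
    return paired


# Toy input, NOT a real dataset entry. The label alphabet is an assumption.
seq_text = ">demo_chain\nACDEFG\nHIK\n"
lab_text = ">demo_chain\nnnttttnnn\n"

sequences = parse_fasta(seq_text)
labels = parse_fasta(lab_text)
paired = pair_with_labels(sequences, labels)
print(paired["demo_chain"][:3])
```

If the real assignment files use a different layout, only the label-reading step needs to change; the length check is worth keeping in any case, since a mismatch between residues and labels would silently corrupt an evaluation.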


Complete Dataset
1. Amino acid sequences of 426 protein chains in FASTA format [Download: Text, HTML]
2. Turns/nonturns in 426 protein chains assigned by PROMOTIF [Download: Text, HTML]
Subsets Dataset
Set I
  • Amino acid sequences of 61 proteins in FASTA format [Download: Text, HTML]
  • Turns/nonturns in 61 proteins assigned by PROMOTIF [Download: Text, HTML]
Set II
  • Amino acid sequences of 61 proteins in FASTA format [Download: Text, HTML]
  • Turns/nonturns in 61 proteins assigned by PROMOTIF [Download: Text, HTML]
Set III
  • Amino acid sequences of 61 proteins in FASTA format [Download: Text, HTML]
  • Turns/nonturns in 61 proteins assigned by PROMOTIF [Download: Text, HTML]
Set IV
  • Amino acid sequences of 61 proteins in FASTA format [Download: Text, HTML]
  • Turns/nonturns in 61 proteins assigned by PROMOTIF [Download: Text, HTML]
Set V
  • Amino acid sequences of 61 proteins in FASTA format [Download: Text, HTML]
  • Turns/nonturns in 61 proteins assigned by PROMOTIF [Download: Text, HTML]
Set VI
  • Amino acid sequences of 60 proteins in FASTA format [Download: Text, HTML]
  • Turns/nonturns in 60 proteins assigned by PROMOTIF [Download: Text, HTML]
Set VII
  • Amino acid sequences of 61 proteins in FASTA format [Download: Text, HTML]
  • Turns/nonturns in 61 proteins assigned by PROMOTIF [Download: Text, HTML]
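One common way to use seven subsets of this kind is seven-fold cross-validation, where each set is held out once for testing while the other six are used for training. This page does not say that the subsets must be used this way, so the sketch below is an assumption about usage rather than a prescribed protocol. The set sizes are taken from the list above; the function name `cross_validation_folds` is hypothetical.

```python
# Number of protein chains per subset, as listed above.
SET_SIZES = {"I": 61, "II": 61, "III": 61, "IV": 61, "V": 61, "VI": 60, "VII": 61}


def cross_validation_folds(set_names):
    """Yield (training_sets, held_out_set) pairs, holding out each set once."""
    names = list(set_names)
    for held_out in names:
        training = [n for n in names if n != held_out]
        yield training, held_out


folds = list(cross_validation_folds(SET_SIZES))
for training, held_out in folds:
    n_train = sum(SET_SIZES[n] for n in training)
    print(f"test on Set {held_out} ({SET_SIZES[held_out]} chains), "
          f"train on {n_train} chains")
```

Because Set VI has 60 chains rather than 61, the fold that holds it out trains on 366 chains, while every other fold trains on 365.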


