4. Threading: From Sequence to 3D Fold
4.1 Threading: Principles
It has been estimated that the total number of possible protein folds
is about 1,000. Hence, once the 3D structure of every fold is known,
it should be possible to deduce the fold of any given amino acid sequence.
To do so, the sequence is aligned in 3D on each of
the folds, each fit is scored, and the best score is taken to indicate the
correct fold. This process is called threading.
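The procedure described above can be sketched in a few lines of Python. This is only a toy illustration, not any published threading method: each fold is reduced to a string of hypothetical environment labels ('B' for buried, 'E' for exposed), the score table ENV_SCORE holds made-up values with lower meaning better, no gaps are allowed, and the names score_fit, best_fit and thread are invented for this example.

```python
# Toy residue-versus-environment scores (hypothetical values, lower is better).
# 'B' = buried position, 'E' = exposed position in the template fold.
ENV_SCORE = {
    ("L", "B"): -1.0, ("L", "E"): 1.0,   # Leu prefers to be buried
    ("E", "B"): 1.0,  ("E", "E"): -1.0,  # Glu prefers to be exposed
}


def score_fit(sequence, environments):
    """Sum the per-position scores of one ungapped fit; unknown pairs score 0."""
    return sum(ENV_SCORE.get((res, env), 0.0)
               for res, env in zip(sequence, environments))


def best_fit(sequence, environments):
    """Slide the sequence along the fold and keep the lowest-scoring offset."""
    n, m = len(sequence), len(environments)
    if n > m:
        return float("inf"), None  # sequence does not fit on this fold
    fits = [(score_fit(sequence, environments[off:off + n]), off)
            for off in range(m - n + 1)]
    return min(fits)


def thread(sequence, fold_library):
    """Score the sequence against every fold and rank folds, best first."""
    results = []
    for name, environments in fold_library.items():
        score, offset = best_fit(sequence, environments)
        results.append((score, name, offset))
    return sorted(results, key=lambda r: r[0])


# Hypothetical two-fold library.
library = {"fold_a": "BBEEBBEE", "fold_b": "EEEEEEEE"}
print(thread("LLEE", library))
# → [(-4.0, 'fold_a', 0), (0.0, 'fold_b', 0)]
```

The top-ranked entry plays the role of the "best score" in the text. Real methods differ in every component of this sketch, as the list that follows makes clear.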
Though the details differ between threading methods, all have
the following components:
- Creation of a fold database
  SCOP (Feb 2000) has 548 folds
- Representation of the protein
  coordinates of all atoms, backbone atoms, or only CB carbon atoms?
- Scoring function
  substitution matrices or residue versus environment matrices?
  Examples of pair potentials:
  - Short-range: Ala...Ala CB...CB in AXXXA arrangement
    exhibits an energy minimum at 6 Å (alpha helix) and 9 Å (beta sheet)
  - Long-range: Cys...Cys CB...CB separated by more than 30 residues
    exhibits an energy minimum at 4 Å (disulphide bridge)
  - Solvation potential for Leu: prefers to be buried
  - Solvation potential for Glu: prefers to be exposed on the surface
- Optimal alignment
  dynamic programming or Monte Carlo?
- Statistical significance of the score
  raw or normalized scores? Z-scores (number of standard deviations from
  the mean)?
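As an illustration of the last component, the sketch below converts raw scores into Z-scores, i.e. the number of standard deviations each score lies from the mean over the whole fold library. The function name z_scores and the example values are hypothetical. The sketch also assumes energy-like scores in which lower is better, so the best fold is the one with the most negative Z-score.

```python
from statistics import mean, pstdev


def z_scores(raw_scores):
    """Map each fold name to (score - mean) / standard deviation."""
    values = list(raw_scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation over the library
    if sigma == 0:
        # All folds scored identically, so no fold stands out.
        return {name: 0.0 for name in raw_scores}
    return {name: (score - mu) / sigma for name, score in raw_scores.items()}


# Hypothetical raw scores for four folds (lower is better).
raw = {"fold_a": -4.0, "fold_b": 0.0, "fold_c": 2.0, "fold_d": 2.0}
for name, z in sorted(z_scores(raw).items(), key=lambda item: item[1]):
    print(f"{name}: Z = {z:.2f}")
```

A raw score on its own says little, because it depends on the sequence length and on the scoring function. The Z-score expresses how far the best fit stands out from the background of all other folds.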
CAVEAT: Threading frequently identifies the correct fold but gets
the detailed sequence alignment wrong. Always try to improve the alignment!
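To make the alignment step concrete, here is a minimal dynamic programming sketch that aligns a sequence to the positions of one fold while allowing gaps. It is an assumption-laden toy: the notes only name dynamic programming as an option, whereas this example picks a global alignment with a linear gap penalty, describes the fold by hypothetical environment labels ('B' buried, 'E' exposed), and uses made-up energies in ENV_SCORE and GAP. It uses position-wise scores only and ignores pair potentials.

```python
# Hypothetical residue-versus-environment energies (lower is better).
ENV_SCORE = {
    ("L", "B"): -1.0, ("L", "E"): 1.0,
    ("E", "B"): 1.0,  ("E", "E"): -1.0,
}
GAP = 0.5  # assumed linear gap penalty per skipped residue or position


def align(sequence, environments, gap=GAP):
    """Global alignment of a sequence to fold positions by dynamic programming.

    Returns (total energy, aligned sequence, aligned environments).
    """
    n, m = len(sequence), len(environments)
    # dp[i][j]: lowest energy for sequence[:i] against environments[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0], move[i][0] = dp[i - 1][0] + gap, "up"
    for j in range(1, m + 1):
        dp[0][j], move[0][j] = dp[0][j - 1] + gap, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = ENV_SCORE.get((sequence[i - 1], environments[j - 1]), 0.0)
            dp[i][j], move[i][j] = min(
                (dp[i - 1][j - 1] + pair, "diag"),  # residue on this position
                (dp[i - 1][j] + gap, "up"),         # residue left unplaced
                (dp[i][j - 1] + gap, "left"),       # fold position left empty
            )
    # Trace back from the bottom-right corner to recover the alignment.
    columns, i, j = [], n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "diag":
            columns.append((sequence[i - 1], environments[j - 1]))
            i, j = i - 1, j - 1
        elif step == "up":
            columns.append((sequence[i - 1], "-"))
            i -= 1
        else:
            columns.append(("-", environments[j - 1]))
            j -= 1
    columns.reverse()
    aligned_seq = "".join(res for res, _ in columns)
    aligned_env = "".join(env for _, env in columns)
    return dp[n][m], aligned_seq, aligned_env


energy, top, bottom = align("LLEE", "BBEEE")
print(energy)
print(top)
print(bottom)
```

Changing the gap penalty or the energies and rerunning the alignment is a quick way to see how sensitive the detailed alignment is to the scoring choices, even when the chosen fold stays the same.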
For more info:
- D.T. Jones, W.R. Taylor & J.M. Thornton (1992). A new approach
to protein fold recognition, Nature 358, 86-89.
- S.H. Bryant & S.F. Altschul (1995). Statistics of sequence-structure
threading, Curr. Opin. Struct. Biol. 5, 236-244.
- T. Madej, M.S. Boguski & S.H. Bryant (1995). Threading analysis
suggests that the obese gene product may be a helical cytokine, FEBS
Lett. 373, 13-18.
- C.M.R. Lemer, M.J. Rooman & S.J. Wodak (1995). Protein Structure
Prediction by Threading Methods: Evaluation of Current Techniques,
Proteins 23, 337-355.
4.2 Threading: Servers and Freeware
contact: verlinde@gouda.bmsc.washington.edu; tel: (206) 543 8865;
fax: (206) 685 7002