582b Separating True Positive Residue Contacts from False Positive Ones in Proteins, Using Constrained Metropolis Monte Carlo Simulations

Spyridon Vicatos and Yiannis N. Kaznessis. Chemical Engineering and Materials Science, University of Minnesota, 407 7th Street SE #215, Minneapolis, MN 55414

In this work, we investigate the potential of constrained Metropolis Monte Carlo simulations of proteins to separate true positive residue contact predictions from false positive ones in an existing set of predicted pairs. When ensembles of protein conformations are generated using Metropolis Monte Carlo with residue constraints, different sets of constrained residue pairs can give a variety of native or non-native fold ensembles, depending on the proximity of the constrained residues. When an original residue pair constraint is replaced by a different, neighboring one, we expect dramatic changes in the average fold of the new conformation ensemble compared to the original if the original constraint is a true positive, and only small changes if the original constraint is a false positive. The nature of the original constraint can therefore be assessed by comparing the quality of the fold between the protein ensembles of the original constraint and those of its replacements. Proteins of known structure were chosen from the Mainly Alpha class of CATH, and an initial non-native structure incorporating accurate secondary structure information was created for each protein. Metropolis Monte Carlo using the CHARMM molecular force field is performed on each protein's initial structure, subject to residue constraints. Once the ensembles have been generated, the simulated folds are assessed using the Huang potential. Results show that both true positive and false positive residue contact predictions can be assessed with high accuracy. The protein test set covers the majority of the mainly alpha folds in the CATH database, with proteins of between 41 and 150 residues. Accuracy is around 85% for true positive residue contacts and 80% for false positive ones. Neither protein size nor slight changes in the secondary structure information substantially affects the method's accuracy, although it should be mentioned that proteins above 150 residues have not been tested.
This method can be an important step towards highly accurate protein contact map predictions.
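
The logic described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The flat-bottom harmonic penalty, the 8 Å contact distance, the kT value, the score threshold, and all function names are assumptions introduced for illustration. Only the Metropolis acceptance rule is standard, and the classification step simply encodes the stated expectation that replacing a true positive constraint changes the fold quality dramatically while replacing a false positive one does not.

```python
import math
import random


def restraint_penalty(distance, target=8.0, k=1.0):
    """Hypothetical flat-bottom harmonic penalty for one constrained residue pair.

    Zero while the pair is within `target` angstroms (assumed contact distance),
    quadratic in the excess distance beyond it.
    """
    excess = max(0.0, distance - target)
    return 0.5 * k * excess ** 2


def metropolis_accept(e_old, e_new, kT=0.6, rng=random):
    """Standard Metropolis criterion.

    Downhill moves are always accepted; uphill moves are accepted with
    probability exp(-(e_new - e_old) / kT). kT = 0.6 is an assumed value.
    """
    delta = e_new - e_old
    if delta <= 0.0:
        return True
    return rng.random() < math.exp(-delta / kT)


def classify_constraint(score_original, scores_replacements, threshold):
    """Label the original constraint from the change in mean fold score.

    `score_original` is the fold quality score of the ensemble generated with
    the original constraint; `scores_replacements` holds the scores obtained
    with neighboring replacement constraints. `threshold` is a hypothetical
    tuning parameter separating "dramatic" from "small" changes.
    """
    mean_replacement = sum(scores_replacements) / len(scores_replacements)
    change = abs(mean_replacement - score_original)
    return "true positive" if change > threshold else "false positive"
```

In this sketch the total energy of a trial conformation would be the force field energy plus the sum of `restraint_penalty` over all constrained pairs, and the fold scores passed to `classify_constraint` would come from a knowledge-based potential evaluated on each ensemble.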