Anders Irbäck, Carsten Peterson, Frank Potthast and Erik Sandelin
Design of Sequences with Good Folding Properties in Coarse-Grained Protein Models
Structure with Folding & Design 7, 347-360 (1999)

Abstract:
Background: Designing amino acid sequences that are stable in a given target structure amounts to maximizing a conditional probability. A straightforward way to do this is a nested Monte Carlo, in which conformation space is explored over and over again for different fixed sequences; this is computationally very demanding. Several approximate methods, based on energy minimization for fixed structure or on high-T expansions, have been proposed to remedy this. These methods are fast but often not accurate, since folding occurs at low T.
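
For orientation, the conditional probability mentioned above can be written in standard Boltzmann form. The symbols (target structure r_0, sequence \sigma, energy E, temperature T, partition function Z) are notation assumed here for illustration and do not appear in the abstract itself:

```latex
P(r_0 \mid \sigma) = \frac{\exp\!\left(-E(r_0,\sigma)/T\right)}{Z(\sigma)},
\qquad
Z(\sigma) = \sum_{r} \exp\!\left(-E(r,\sigma)/T\right)
```

In this notation, minimizing E(r_0,\sigma) at fixed structure addresses only the numerator, while the sum over all conformations r in Z(\sigma) is what makes a sequence-by-sequence evaluation expensive.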

Results: We develop a multisequence Monte Carlo procedure in which sequence and conformation space are probed simultaneously, with efficient prescriptions for pruning sequence space. The method is explored on hydrophobic/polar models. We first discuss short lattice chains, in order to compare with exact data and with other methods. The method is then successfully applied to lattice chains with up to 50 monomers, and to off-lattice 20-mers.
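
To illustrate the kind of exact data available for short lattice chains, the following is a minimal sketch of design by exhaustive enumeration. It is not the multisequence method. It assumes a two-dimensional square lattice and an energy of -1 for each contact between two hydrophobic (H) monomers that are not chain neighbours, which the abstract does not specify. All function names are hypothetical. Structures are identified by their contact sets, so rotated and reflected copies of the target are counted together.

```python
import itertools
import math

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def enumerate_walks(n):
    """All self-avoiding walks of n monomers on the square lattice, from the origin."""
    walks = []

    def extend(path, occupied):
        if len(path) == n:
            walks.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    extend([(0, 0)], {(0, 0)})
    return walks

def contacts(walk):
    """Pairs (i, j) that are lattice neighbours but not adjacent along the chain."""
    return frozenset(
        (i, j)
        for i in range(len(walk))
        for j in range(i + 2, len(walk))
        if abs(walk[i][0] - walk[j][0]) + abs(walk[i][1] - walk[j][1]) == 1
    )

def energy(contact_set, seq):
    """Assumed HP energy: -1 per H-H contact."""
    return -sum(1 for i, j in contact_set if seq[i] == "H" and seq[j] == "H")

def target_probability(target, seq, all_contacts, T):
    """Boltzmann probability that the chain has the target contact set."""
    z = 0.0
    w = 0.0
    for c in all_contacts:
        b = math.exp(-energy(c, seq) / T)
        z += b
        if c == target:
            w += b
    return w / z

def design_exhaustive(target_walk, T):
    """Try every H/P sequence and return the one maximizing the target probability."""
    n = len(target_walk)
    all_contacts = [contacts(w) for w in enumerate_walks(n)]
    target = contacts(target_walk)
    sequences = ("".join(s) for s in itertools.product("HP", repeat=n))
    best = max(sequences, key=lambda s: target_probability(target, s, all_contacts, T))
    return best, target_probability(target, best, all_contacts, T)
```

The cost grows with the number of sequences times the number of conformations, so a brute-force search of this kind is feasible only for very short chains.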

Conclusions: The multisequence Monte Carlo method offers a new approach to sequence design in coarse-grained models. It is much more efficient than previous Monte Carlo methods, and is, as it stands, applicable to a fairly wide range of two-letter models.

LU TP 98-10