Abstract Scope |
Peptide materials have a wide array of functions, ranging from tissue engineering and surface coatings to catalysis and sensing. This class of biopolymer is composed of sequences drawn from the 20 naturally occurring amino acids. As the peptide length increases, the searchable sequence space grows exponentially (trimers: 20^3 = 8,000 peptides; pentamers: 20^5 = 3.2 million). Empirically, peptide design is guided by structural propensity tables, hydrophobicity scales, or other property-based guidelines and typically yields <10 peptides per study, barely scratching the surface of the search space. Here, we combine machine learning techniques, such as Monte Carlo tree search and random forest, with coarse-grained molecular dynamics (MD) simulations to efficiently search the large spaces of trimer, pentamer, and octamer peptide sequences for those with high self-assembly propensity. Subsequent experiments on the identified sequences support our findings and demonstrate the utility of this approach for peptide design. |
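The growth of the sequence space with peptide length can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the study's workflow; the helper name `sequence_space_size` and the constant `AMINO_ACIDS` are assumptions introduced here, and the octamer count is simply 20^8 computed the same way as the trimer and pentamer figures quoted in the abstract.

```python
# One-letter codes for the 20 naturally occurring (standard) amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_space_size(length: int, alphabet_size: int = len(AMINO_ACIDS)) -> int:
    """Number of distinct linear peptides of a given length.

    Each position can hold any of `alphabet_size` residues independently,
    so the count is alphabet_size ** length.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


if __name__ == "__main__":
    # Peptide lengths named in the abstract: trimer, pentamer, octamer.
    for name, n in [("trimer", 3), ("pentamer", 5), ("octamer", 8)]:
        print(f"{name}: 20^{n} = {sequence_space_size(n):,}")
```

An exhaustive scan is feasible for the 8,000 trimers, but the octamer space (about 2.56 x 10^10 sequences) is far too large to simulate one by one, which is why a guided search is needed at that length.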