REAL

Succinct Amyloid and Nonamyloid Patterns in Hexapeptides

Keresztes, László and Szögi, Evelin and Varga, Bálint and Farkas, Viktor and Perczel, András and Grolmusz, Vince (2022) Succinct Amyloid and Nonamyloid Patterns in Hexapeptides. ACS OMEGA, 7. pp. 35532-35537. ISSN 2470-1343

Keresztes-acsomega.2c02513.pdf
Available under License Creative Commons Attribution.


Abstract

Hexapeptides are widely applied as a model system for studying the amyloid-forming properties of polypeptides, including proteins. Recently, large experimental databases have become publicly available with amyloidogenic labels. Using these data sets for training and testing purposes, one may build artificial intelligence (AI)-based classifiers for predicting the amyloid state of peptides. In our previous work (Biomolecules 2021, 11, 500), we described the Support Vector Machine (SVM)-based Budapest Amyloid Predictor (https://pitgroup.org/bap). Here, we apply the Budapest Amyloid Predictor for discovering numerous amyloidogenic and nonamyloidogenic hexapeptide patterns with accuracy between 80% and 84%, as surprising and succinct novel rules for further understanding the amyloid state of peptides. For example, we have shown that for any independently mutated residue (position marked by “x”), the patterns CxFLWx, FxFLFx, or xxIVIV are predicted to be amyloidogenic, while those of PxDxxx, xxKxEx, and xxPQxx are nonamyloidogenic. We note that each amyloidogenic pattern with two x’s (e.g., CxFLWx) describes succinctly 20² = 400 hexapeptides, while the nonamyloidogenic patterns comprising four point mutations (e.g., PxDxxx) give 20⁴ = 160 000 hexapeptides in total. We also examine the restricted substitutions for positions “x” from subclasses of proteinogenic amino acid residues; for example, if “x” is substituted with hydrophobic amino acids, then there exist patterns containing three x’s, like MxVVxx, predicted to be amyloidogenic. If we can choose for the x positions any hydrophobic amino acids, except the “structure breaker” proline, then we get amyloid patterns with five x positions, for example, xxxFxx, each corresponding to 32 768 hexapeptides. To our knowledge, no similar applications of artificial intelligence tools or succinct amyloid patterns were described before the present work.
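
The pattern counts quoted in the abstract follow from simple combinatorics: a pattern with k free "x" positions, each drawn from an alphabet of n residues, covers n^k hexapeptides. The following Python sketch illustrates that arithmetic and how a pattern could be matched or enumerated. It is not the Budapest Amyloid Predictor and makes no amyloid predictions. The function names are hypothetical, and the eight-letter restricted alphabet is an assumed placeholder chosen only because 32 768 = 8^5; the abstract does not list which residues the authors used.

```python
from itertools import product

# The 20 proteinogenic amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# ASSUMPTION: placeholder set of eight residues, used only to reproduce
# the count 8**5 = 32768. It is not taken from the paper.
ASSUMED_RESTRICTED = "ACFILMVW"


def count_matches(pattern, alphabet=AMINO_ACIDS):
    """Number of peptides covered by a pattern whose free positions are 'x'."""
    return len(alphabet) ** pattern.count("x")


def expand(pattern, alphabet=AMINO_ACIDS):
    """Yield every peptide covered by the pattern."""
    slots = [alphabet if ch == "x" else ch for ch in pattern]
    for combo in product(*slots):
        yield "".join(combo)


def matches(peptide, pattern, alphabet=AMINO_ACIDS):
    """True if the peptide fits the pattern, with 'x' standing for any residue in alphabet."""
    if len(peptide) != len(pattern):
        return False
    return all(
        (p == "x" and a in alphabet) or p == a
        for p, a in zip(pattern, peptide)
    )


if __name__ == "__main__":
    print(count_matches("CxFLWx"))                      # two free positions
    print(count_matches("PxDxxx"))                      # four free positions
    print(count_matches("xxxFxx", ASSUMED_RESTRICTED))  # five restricted positions
    print(matches("CAFLWG", "CxFLWx"))
```

Counting with `count_matches` avoids enumerating the peptides; `expand` is only practical for patterns with few free positions, such as the 400 peptides of CxFLWx.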

Item Type: Article
Subjects: Q Science / természettudomány > QD Chemistry / kémia
Q Science / természettudomány > QD Chemistry / kémia > QD04 Organic chemistry / szerves kémia
Depositing User: Dóra K. Menyhárd
Date Deposited: 17 Mar 2023 09:38
Last Modified: 17 Mar 2023 09:38
URI: http://real.mtak.hu/id/eprint/162357
