REAL

A comprehensive assessment of long intrinsic protein disorder from the DisProt database

Necci, Marco and Piovesan, Damiano and Dosztányi, Zsuzsanna and Tompa, Péter and Tosatto, Silvio C. E. (2018) A comprehensive assessment of long intrinsic protein disorder from the DisProt database. BIOINFORMATICS, 34 (3). pp. 445-452. ISSN 1367-4803

Abstract

Motivation: Intrinsic disorder (ID), i.e. the lack of a unique folded conformation at physiological conditions, is a common feature for many proteins, which requires specialized biochemical experiments that are not high-throughput. Missing X-ray residues from the PDB have been widely used as a proxy for ID when developing computational methods. This may lead to a systematic bias, where predictors deviate from biologically relevant ID. Large benchmarking sets on experimentally validated ID are scarce. Recently, the DisProt database has been renewed and expanded to include manually curated ID annotations for several hundred new proteins. This provides a large benchmark set which has not yet been used for training ID predictors.

Results: Here, we describe the first systematic benchmarking of ID predictors on the new DisProt dataset. In contrast to previous assessments based on missing X-ray data, this dataset contains mostly long ID regions and a significant amount of fully ID proteins. The benchmarking shows that ID predictors work quite well on the new dataset, especially for long ID segments. However, a large fraction of ID still goes virtually undetected and the ranking of methods is different than for PDB data. In particular, many predictors appear to confound ID and regions outside X-ray structures. This suggests that the ID prediction methods capture different flavors of disorder and can benefit from highly accurate curated examples. © The Author 2017.

Item Type: Article
Additional Information: Note-27264672 N1 Funding details: K 108798 N1 Funding details: G.0029.12 N1 Funding details: 17753, AIRC, Associazione Italiana per la Ricerca sul Cancro N1 Funding details: FWO, Fonds Wetenschappelijk Onderzoek N1 Funding details: LP201418/2016, MTA, Magyar Tudományos Akadémia N1 Funding details: OTKA, Országos Tudományos Kutatási Alapprogramok N1 Funding details: 16621, FIRC, Fondazione Italiana per la Ricerca sul Cancro N1 Funding details: BM1405, COST, European Cooperation in Science and Technology N1 Funding text: This work has been supported by COST Action BM1405 (NGP-net). D.P. is an FIRC research fellow [16621]. Z.D. acknowledges the support of the Hungarian Academy of Sciences ‘Lendület’ Grant [LP201418/2016] and the Hungarian Scientific Research Fund [OTKA K 108798 to ZD]. P.T. was supported by the Odysseus grant G.0029.12 from Research Foundation Flanders (FWO). Part of the work was supported by AIRC IG grant 17753 to S.T. Note-27162235 N1 10.1093/bioinformatics/btx590
Subjects: Q Science > QH Natural history > QH301 Biology > QH3011 Biochemistry
SWORD Depositor: MTMT SWORD
Depositing User: MTMT SWORD
Date Deposited: 07 Mar 2019 07:43
Last Modified: 07 Mar 2019 07:43
URI: http://real.mtak.hu/id/eprint/91847
