Pap, Gergely and Györgypál, Zoltán and Ádám, Krisztián and Tóth, László and Hegedűs, Zoltán (2021) Transcription factor binding site detection using convolutional neural networks with a functional group-based data representation. JOURNAL OF PHYSICS-CONFERENCE SERIES, 1824 (1). ISSN 1742-6588
PapJPhysicsConfSer.pdf (686kB). Available under License Creative Commons Attribution.
Abstract
Transcription factors (TFs) play an essential role in molecular biology by regulating gene expression. The binding sites of TFs vary considerably, and the large number of possible binding locations makes their detection challenging. Recently, several machine learning approaches using nucleotide sequence data have been applied to classify DNA sequences with respect to Transcription Factor Binding Sites (TFBS). We propose a novel training strategy that replaces the traditional 1D nucleotide-based DNA sequence representation with a 2D topological matrix of sub-nucleotide chemical functional groups, which substantially define the protein-binding ability of DNA fragments. We train convolutional neural networks on this novel Functional Group DNA Representation (FGDR) to solve a TFBS classification task. We compare our results with those of previous nucleotide-based training approaches and show that learning from FGDR data has several benefits for TFBS classification. Moreover, we argue that training deep neural networks on the FGDR produces competitive results while introducing only a pre-processing conversion step. Finally, we show that an ensemble of models trained on the nucleotide and FGDR representations achieves higher classification performance than any of the single-input approaches. © Published under licence by IOP Publishing Ltd.
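To make the contrast between the two input representations concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give the actual FGDR group definitions, so `GROOVE_PATTERN` below is an assumption for illustration only: it uses the textbook major-groove pattern of hydrogen-bond acceptors (A), donors (D), methyl groups (M) and nonpolar hydrogens (H) for each base pair. The function names `one_hot` and `group_matrix` are likewise hypothetical.

```python
import numpy as np

NUCLEOTIDES = "ACGT"

def one_hot(seq: str) -> np.ndarray:
    """Conventional encoding: one row per nucleotide, one column per position."""
    rows = np.array([NUCLEOTIDES.index(base) for base in seq.upper()], dtype=int)
    matrix = np.zeros((len(NUCLEOTIDES), len(rows)), dtype=np.float32)
    matrix[rows, np.arange(len(rows))] = 1.0
    return matrix

# ASSUMPTION: illustrative stand-in for the paper's functional-group table.
# Each base maps to the four chemical groups its base pair exposes in the
# major groove (A = acceptor, D = donor, M = methyl, H = nonpolar hydrogen).
GROOVE_PATTERN = {"A": "ADAM", "T": "MADA", "G": "AADH", "C": "HDAA"}
GROUP_CODES = {"A": 0, "D": 1, "M": 2, "H": 3}

def group_matrix(seq: str) -> np.ndarray:
    """2D integer matrix: rows are groove positions, columns are sequence positions."""
    columns = [[GROUP_CODES[g] for g in GROOVE_PATTERN[base]] for base in seq.upper()]
    return np.array(columns, dtype=np.int64).T

if __name__ == "__main__":
    print(one_hot("ACGT"))
    print(group_matrix("AT"))
```

In this sketch, bases that share chemical groups at a given groove position receive the same code there, which a plain one-hot nucleotide encoding cannot express. Either matrix could serve as input to a convolutional network.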
Item Type: | Article |
---|---|
Additional Information: | Institute of Informatics, University of Szeged, Árpád Square 2, Szeged, H-6720, Hungary; Institute of Biophysics, Biological Research Centre, Temesvári Blvd. 62, Szeged, H-6726, Hungary; Department of Biochemistry and Medical Chemistry, University of Pécs, Pécs, Hungary. Export Date: 28 January 2022 |
Subjects: | Q Science / természettudomány > QA Mathematics / matematika > QA76 Computer software / programozás; Q Science / természettudomány > QH Natural history / természetrajz > QH301 Biology / biológia > QH3011 Biochemistry / biokémia; Q Science / természettudomány > QH Natural history / természetrajz > QH301 Biology / biológia > QH3020 Biophysics / biofizika |
SWORD Depositor: | MTMT SWORD |
Depositing User: | MTMT SWORD |
Date Deposited: | 07 Feb 2022 10:38 |
Last Modified: | 07 Feb 2022 10:38 |
URI: | http://real.mtak.hu/id/eprint/137529 |