Title | Choi-Williams Distribution to Describe Coding and Non-coding Regions in Primary Transcript Pre-mRNA |
---|---|
Volume:Issue | 33:5 |
Authors | Umberto S. P. Melia, Montserrat Vallverdu, Francesc Claria |
Pages | 504-512 |
Keywords | Bioinformatics, Classification and feature extraction, Stochastic processes, Time series analysis, EI, SCI |
Publication date | 2013-10 |
Deoxyribonucleic acid (DNA) information is discrete in both “time” (sequence positions) and “amplitude” (nucleotide values), which permits the use of signal processing techniques for its characterization. Converting DNA nucleotide symbols into discrete numerical values enables signal processing to be applied to sequence analysis problems, such as finding coding sequences. In this work, a numerical conversion method was chosen based on thermodynamic data for the free energy changes (ΔG°) of duplex formation in DNA or ribonucleic acid (RNA), associated with the nucleotide sequence of the pre-mRNA (precursor messenger RNA). The aim of this work was to distinguish coding regions (exons) from non-coding regions (introns) using a methodology based on time-frequency representation (TFR). This permits the observation of how the periodicity and frequency components evolve with time, introducing more variables related to the gene sequences than are used in traditional fast Fourier transform analysis. The parameters calculated from the TFR are instantaneous frequency and instantaneous power. Instantaneous frequency and power variables in different frequency bands allowed exons and introns to be classified correctly with a prediction accuracy of more than 85%.
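As a rough illustration of the pipeline the abstract outlines (numerical conversion, time-frequency representation, band-wise instantaneous power and frequency), the sketch below may help. It makes several assumptions that are not taken from the paper: the dinucleotide scores are made-up placeholders rather than measured ΔG° values, a Hann-windowed short-time Fourier spectrogram stands in for the Choi-Williams distribution, and the function names (`to_signal`, `spectrogram`, `band_features`), window length, hop size, and band limits are all hypothetical.

```python
import numpy as np

# Placeholder dinucleotide scores, NOT measured free-energy values:
# each G or C contributes -1.0 and each A or U contributes -0.5.
PLACEHOLDER_SCORE = {
    a + b: -(0.5 + 0.5 * (a in "GC")) - (0.5 + 0.5 * (b in "GC"))
    for a in "ACGU" for b in "ACGU"
}

def to_signal(seq, table=PLACEHOLDER_SCORE):
    """Map each overlapping dinucleotide of the sequence to a number."""
    seq = seq.upper().replace("T", "U")
    return np.array([table[seq[i:i + 2]] for i in range(len(seq) - 1)])

def spectrogram(x, win=64, hop=8):
    """Hann-windowed short-time power spectrum (a stand-in for the
    Choi-Williams distribution). Rows are frames, columns are frequencies."""
    w = np.hanning(win)
    frames = np.array([x[i:i + win] for i in range(0, len(x) - win + 1, hop)])
    frames = frames - frames.mean(axis=1, keepdims=True)  # remove per-frame mean
    power = np.abs(np.fft.rfft(frames * w, axis=1)) ** 2
    freqs = np.fft.rfftfreq(win, d=1.0)  # cycles per sequence position
    return freqs, power

def band_features(freqs, power, f_lo, f_hi):
    """Per-frame band power and power-weighted mean frequency in [f_lo, f_hi)."""
    band = (freqs >= f_lo) & (freqs < f_hi)
    p = power[:, band].sum(axis=1)
    f = (power[:, band] * freqs[band]).sum(axis=1) / np.maximum(p, 1e-12)
    return p, f

# Usage on a random sequence (illustrative only, not real transcript data).
rng = np.random.default_rng(0)
seq = "".join(rng.choice(list("ACGU"), size=300))
freqs, power = spectrogram(to_signal(seq))
inst_power, inst_freq = band_features(freqs, power, 0.25, 0.45)
```

The band from 0.25 to 0.45 cycles per position was picked only because it contains 1/3, the period-3 component commonly associated with coding regions. Per-frame features of this kind could then be fed to a classifier of one's choice.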