Novel methodologies for spectral classification of exon and intron sequences

dc.contributor.author: Kwan, Hon K
dc.contributor.author: Kwan, Benjamin Y M
dc.contributor.author: Kwan, Jennifer Y Y
dc.date.accessioned: 2015-12-18T10:58:31Z
dc.date.available: 2015-12-18T10:58:31Z
dc.date.issued: 2012-02-28
dc.date.updated: 2015-12-18T10:58:31Z
dc.description.abstract: Digital processing of a nucleotide sequence requires it to be mapped to a numerical sequence in which the choice of nucleotide to numeric mapping affects how well its biological properties can be preserved and reflected from nucleotide domain to numerical domain. Digital spectral analysis of nucleotide sequences unfolds a period-3 power spectral value which is more prominent in an exon sequence as compared to that of an intron sequence. The success of a period-3 based exon and intron classification depends on the choice of a threshold value. The main purposes of this article are to introduce novel codes for 1-sequence numerical representations for spectral analysis and compare them to existing codes to determine appropriate representation, and to introduce novel thresholding methods for more accurate period-3 based exon and intron classification of an unknown sequence. The main findings of this study are summarized as follows: Among sixteen 1-sequence numerical representations, the K-Quaternary Code I offers an attractive performance. A windowed 1-sequence numerical representation (with window length of 9, 15, and 24 bases) offers a possible speed gain over non-windowed 4-sequence Voss representation which increases as sequence length increases. A winner threshold value (chosen from the best among two defined threshold values and one other threshold value) offers a top precision for classifying an unknown sequence of specified fixed lengths. An interpolated winner threshold value applicable to an unknown and arbitrary length sequence can be estimated from the winner threshold values of fixed length sequences with a comparable performance. In general, precision increases as sequence length increases. The study contributes an effective spectral analysis of nucleotide sequences to better reveal embedded properties, and has potential applications in improved genome annotation.
dc.identifier.citation: EURASIP Journal on Advances in Signal Processing. 2012 Feb 28;2012(1):50
dc.identifier.uri: http://dx.doi.org/10.1186/1687-6180-2012-50
dc.identifier.uri: http://hdl.handle.net/10393/33977
dc.language.rfc3066: en
dc.rights.holder: Kwan et al; licensee Springer.
dc.title: Novel methodologies for spectral classification of exon and intron sequences
dc.type: Journal Article
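
The abstract describes mapping a nucleotide sequence to numbers, measuring the period-3 spectral power, and comparing it to a threshold. The following is a minimal Python sketch of that general idea, using the standard 4-sequence Voss (binary indicator) representation that the abstract names as the baseline. It is not the authors' method: the K-Quaternary codes, the windowing scheme, and the winner threshold values are not specified in this record and are not reproduced here. The function names, the division by sequence length, and the `threshold` argument are assumptions chosen only for illustration.

```python
import cmath

BASES = "ACGT"


def voss_indicators(seq):
    """Map a nucleotide string to four binary indicator sequences (Voss representation)."""
    seq = seq.upper()
    return {b: [1.0 if s == b else 0.0 for s in seq] for b in BASES}


def dft_power_at(x, k):
    """Squared magnitude of the N-point DFT of x at frequency index k."""
    n = len(x)
    coeff = sum(v * cmath.exp(-2j * cmath.pi * k * i / n) for i, v in enumerate(x))
    return abs(coeff) ** 2


def period3_power(seq):
    """Spectral power at k = N/3, summed over the four indicator sequences."""
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    k = n // 3
    return sum(dft_power_at(x, k) for x in voss_indicators(seq).values())


def classify(seq, threshold):
    """Label a sequence by comparing length-normalized period-3 power to a threshold.

    Both the normalization and the threshold are hypothetical; the article's
    thresholding methods are not given in this record.
    """
    score = period3_power(seq) / len(seq)
    return "exon" if score > threshold else "intron"
```

As a sanity check on toy input, a string with perfect three-base periodicity such as `"ACG" * 10` produces a strong period-3 peak, while a homopolymer such as `"A" * 30` produces essentially none. Real exon and intron sequences are far noisier, which is why the choice of threshold matters.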

Files

Original bundle

Name: 13634_2011_Article_216.pdf
Size: 377.92 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 4.92 KB
Format: Item-specific license agreed upon to submission
Description: