MaSC: mappability-sensitive cross-correlation for estimating mean fragment length of single-end short-read sequencing data

Field Value
dc.contributor.author Ramachandran, Parameswaran
dc.contributor.author Palidwor, Gareth A.
dc.contributor.author Porter, Christopher J.
dc.contributor.author Perkins, Theodore J.
dc.date.accessioned 2013-04-30T14:14:46Z
dc.date.available 2013-04-30T14:14:46Z
dc.date.created 2013
dc.date.issued 2013-04-30
dc.identifier.uri http://hdl.handle.net/10393/24088
dc.identifier.uri http://bioinformatics.oxfordjournals.org/content/29/4/444.full
dc.description.abstract Motivation: Reliable estimation of the mean fragment length for next-generation short-read sequencing data is an important step in next-generation sequencing analysis pipelines, most notably because of its impact on the accuracy of the enriched regions identified by peak-calling algorithms. Although many peak-calling algorithms include a fragment-length estimation subroutine, the problem has not been adequately solved, as demonstrated by the variability of the estimates returned by different algorithms. Results: In this article, we investigate the use of strand cross-correlation to estimate mean fragment length of single-end data and show that traditional estimation approaches have mixed reliability. We observe that the mappability of different parts of the genome can introduce an artificial bias into cross-correlation computations, resulting in incorrect fragment-length estimates. We propose a new approach, called mappability-sensitive cross-correlation (MaSC), which removes this bias and allows for accurate and reliable fragment-length estimation. We analyze the computational complexity of this approach, and evaluate its performance on a test suite of NGS datasets, demonstrating its superiority to traditional cross-correlation analysis. Availability: An open-source Perl implementation of our approach is available at http://www.perkinslab.ca/Software.html.
dc.language.iso en
dc.title MaSC: mappability-sensitive cross-correlation for estimating mean fragment length of single-end short-read sequencing data
dc.type Article
dc.identifier.doi 10.1093/bioinformatics/btt001
Collection IRHO - Publications // OHRI - Publications
Publications en libre accès financées par uOttawa // uOttawa financed open access publications
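
The abstract contrasts traditional strand cross-correlation with a mappability-sensitive variant. The sketch below is one possible reading of that idea, not the authors' Perl implementation. It correlates forward-strand and reverse-strand read-start counts at each shift, and in the masked variant keeps only position pairs that are mappable on both strands. All function names, the per-strand boolean mappability masks, and the use of a Pearson correlation are assumptions made for illustration. Converting the best shift into a fragment length depends on the coordinate convention of the read-start counts, which this record does not specify.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) * (x - mx) for x in xs)
    vy = sum((y - my) * (y - my) for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def naive_cc(fwd, rev, d):
    """Traditional strand cross-correlation at shift d over all positions."""
    n = len(fwd) - d
    return pearson(fwd[:n], rev[d:d + n])


def masked_cc(fwd, rev, map_fwd, map_rev, d):
    """Hypothetical mappability-sensitive variant: use only pairs (i, i + d)
    where position i is mappable on the forward strand and position i + d
    is mappable on the reverse strand."""
    xs, ys = [], []
    for i in range(len(fwd) - d):
        if map_fwd[i] and map_rev[i + d]:
            xs.append(fwd[i])
            ys.append(rev[i + d])
    return pearson(xs, ys)


def best_shift(cc, max_shift):
    """Shift in 0..max_shift that maximizes the correlation function cc(d)."""
    return max(range(max_shift + 1), key=cc)
```

A caller would build `fwd` and `rev` as per-position counts of read starts on each strand and then evaluate, for example, `best_shift(lambda d: masked_cc(fwd, rev, map_fwd, map_rev, d), max_shift)`. With every position marked mappable, the masked variant reduces to the naive one.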
