Related projects
Discover more projects across a range of sectors and disciplines, from AI to cleantech to social innovation.
High-throughput sequencing (HTS) platforms generate unprecedented amounts of data, which strains computational infrastructure. The large investment this requires almost signaled the end of the Sequence Read Archive hosted at the NCBI, which holds most of the sequence data generated worldwide.
Currently, most HTS data is compressed with general-purpose algorithms such as gzip. These algorithms are not designed for the data that HTS platforms generate; for example, they do not take advantage of the specific nature of sequence data. Fast and efficient compression algorithms designed specifically for HTS data may be able to address some of the issues in data management, storage, and communication. Here we propose a “boosting” scheme, based on the Locally Consistent Parsing technique, that reorganizes the reads in a way that results in higher compression speed and a higher compression rate, independent of the compression algorithm in use.
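The idea of reordering reads before handing them to a general-purpose compressor can be illustrated with a small sketch. This is not the Locally Consistent Parsing scheme proposed in the project: as a much simpler stand-in, it groups reads by their lexicographically smallest k-mer (a "minimizer"), so that overlapping reads tend to end up next to each other. The function names, the simulated genome, and all parameter values are assumptions made for illustration only, and the sketch assumes the original order of the reads does not need to be preserved.

```python
import random
import zlib


def min_kmer(read, k=12):
    """Return the lexicographically smallest k-mer of a read.

    Stand-in grouping key (an assumption of this sketch), not LCP.
    """
    if len(read) < k:
        return read
    return min(read[i:i + k] for i in range(len(read) - k + 1))


def reorder_reads(reads, k=12):
    """Sort reads so that reads sharing a smallest k-mer become adjacent."""
    return sorted(reads, key=lambda r: (min_kmer(r, k), r))


def compressed_size(reads):
    """Bytes needed for the newline-joined reads after zlib (DEFLATE) compression."""
    return len(zlib.compress("\n".join(reads).encode("ascii")))


def simulate_reads(genome_len=200_000, read_len=100, n_reads=20_000, seed=0):
    """Sample error-free reads at random positions of a random genome (toy data)."""
    rng = random.Random(seed)
    genome = "".join(rng.choice("ACGT") for _ in range(genome_len))
    starts = [rng.randrange(genome_len - read_len + 1) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]


reads = simulate_reads()
reordered = reorder_reads(reads)

original_size = compressed_size(reads)
reordered_size = compressed_size(reordered)
print(f"original order:  {original_size} bytes")
print(f"reordered reads: {reordered_size} bytes")
```

The compressor itself is untouched; only the input order changes, which is why a step of this kind can in principle sit in front of any compression algorithm. On real HTS data, with sequencing errors, quality scores, and read identifiers, the gain would depend heavily on the grouping method and should not be inferred from this toy simulation.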
Cenk Sahinalp
University of British Columbia
Computer science
Simon Fraser University
Accelerate
The strong support from governments across Canada, international partners, universities, colleges, companies, and community organizations has enabled Mitacs to focus on the core idea that talent and partnerships power innovation, and innovation creates a better future.