CLP: Efficient Log Compression and Analytics for Big Data Platforms

Tech companies generate petabytes of log data per day, with 50% to 100% year-over-year growth. Conventional log analytics systems no longer scale to such data volumes. In addition, managing this much data is extremely costly at every level: storage, network bandwidth, and compute resources.
This research proposes a novel system called CLP (Compressed Log Processor). CLP compresses logs to an unprecedented compression ratio and, more importantly, allows users to search the compressed logs without decompressing them. CLP reduces the cost of log management and storage by over 40x, saving companies hundreds of millions of dollars per year, and it enables users to search through petabytes of logs quickly. The partnership with YScope will make CLP production-ready, so that it can be integrated directly with existing big data analytics systems and compress logs at the source.

Faculty Supervisor:

Baochun Li

Student:

Partner:

YScope Inc

Discipline:

Computer science

Sector:

Information and cultural industries

University:

University of Toronto

Program:

Accelerate
