An Efficient Data Analysis Pipeline
The proposed research project targets computational performance improvements to a data analysis pipeline. The project runs for four months and has two objectives: (1) to characterize the performance of the individual stages of the existing data analysis pipeline in terms of execution time, memory usage, and I/O, and (2) to improve the performance of those stages where possible. The intern will apply methods learned and developed during their master's research to a real-world system at Acerta Analytics Solutions. The expected benefit to the partner organization, Acerta, is improved performance of its existing data analysis pipeline.