Kali: Optimizing Resource Utilization in Distributed Clusters

Decreasing operational costs is a key concern for organizations that manage compute clusters, such as Amazon, Microsoft, Google, and Alibaba. One way to decrease costs is to improve resource utilization in the cluster [13, 14]. Yet high resource utilization can negatively affect workload performance and thus user satisfaction. Performance degradation happens when workloads running on the same machine “compete” for shared resources; for example, a workload that consumes a large portion of memory delays the execution of other memory-intensive workloads. Such “competition” for resources is referred to as resource interference in the literature.
Existing work on predicting and avoiding interference mainly relies on (a) stress-testing workloads before scheduling to estimate their constraints, and (b) extracting interference-related constraints by observing real executions. TO BE CONT'D

Intern: 
Harshavardhan Kadiyala
Faculty Supervisor: 
Julia Rubin
Province: 
British Columbia
Partner University: 
Program: