A statistical method for competing risk survival analysis with clustered big data

Over the last few years, a data revolution has taken place with the emergence of “big data”. In the medical field, the term big data refers to databases that are large in terms of the number of patients and/or the amount of information, gathered from varied sources. Such data are heterogeneous, however, because they arise from different medical centers. Furthermore, traditional statistical methods cannot be applied directly to these large databases: the major problems are multicollinearity and overfitting. Many regularization methods have been proposed to adapt classical methods. Mittal et al. have taken up the challenge of adapting survival analysis methods to these emerging data sets. Survival analysis consists of modelling the time to an event in the presence of censoring (events that are not observed). One of the main assumptions of the most popular survival model is non-informative censoring, which means that censoring is independent of the event time.

Faculty Supervisor:

Mary Thompson





Statistics / Actuarial sciences



University of Waterloo


Globalink Research Award
