Estimating Optimal Treatment Regimes from Electronic Health Records using Natural Language Processing for Unmeasured Confounding
In precision medicine, an optimal dynamic treatment regime (DTR) is a sequence of decision rules that individualizes medical treatment. DTR estimation methods use observational data, such as electronic health records (EHRs), which may lack variables that capture doctors’ rationale for treatment assignment. Although such variables, also referred to as confounders, may not be directly recorded in EHRs, they may be embedded in unstructured medical notes. This project introduces Relational-variational Graph Autoencoders (R-VGAEs) to DTR estimation in precision medicine. R-VGAEs are particularly well suited to identifying latent features within graph-structured data, such as medical notes. We aim to assess the proposed method by comparing it with DTR estimation based on conventional unsupervised natural language processing methods, such as word2vec, doc2vec and ELMo. We also apply it to data from MIMIC-IV, a publicly available dataset, to show that the additional use of medical note data can further improve treatment personalization.
Olli Saarela
The University of Tokyo
Mathematics
Education
University of Toronto
Globalink Research Award