Learning to organize and discover biomedical scientific literature
This project will investigate machine learning methods for organizing and discovering the vast biomedical literature. First, it will address named entity recognition, the task of finding and classifying entities in text documents, on large collections of abstracts and full texts of research papers, and will investigate semi-supervised learning methods that leverage large collections of unlabeled papers. Second, it will design new ranking algorithms for research papers that exploit information from several sources, including citation networks, predictions of the future impact of new papers, and domain ontologies. Solutions to these problems will directly affect the performance of the methods the company currently uses for the discovery of biomedical research papers.
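
As a rough illustration of the second aim, the sketch below shows one possible way to combine a citation-network signal with a predicted-impact signal into a single ranking score. It is not the project's method: the choice of PageRank as the citation signal, the linear weighting, and the names `pagerank`, `combined_score`, `predicted_impact` and `weight` are all assumptions made for this example, and the domain-ontology signal is left out.

```python
from typing import Dict, List


def pagerank(citations: Dict[str, List[str]],
             damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank over a citation graph.

    `citations` maps each paper ID to the list of papers it cites.
    """
    # Collect every paper that appears as a citing or cited node.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    ordered = sorted(papers)
    n = len(ordered)

    rank = {p: 1.0 / n for p in ordered}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in ordered}
        for p in ordered:
            out = citations.get(p, [])
            if out:
                # Split this paper's rank among the papers it cites.
                share = damping * rank[p] / len(out)
                for q in out:
                    new_rank[q] += share
            else:
                # A paper with no outgoing citations spreads its rank evenly.
                share = damping * rank[p] / n
                for q in ordered:
                    new_rank[q] += share
        rank = new_rank
    return rank


def combined_score(citations: Dict[str, List[str]],
                   predicted_impact: Dict[str, float],
                   weight: float = 0.5) -> Dict[str, float]:
    """Hypothetical linear blend of citation rank and predicted impact.

    `predicted_impact` is assumed to hold scores in [0, 1] produced by some
    separate impact-prediction model; papers without a score default to 0.
    """
    pr = pagerank(citations)
    top = max(pr.values())
    return {
        p: weight * (pr[p] / top) + (1.0 - weight) * predicted_impact.get(p, 0.0)
        for p in pr
    }


if __name__ == "__main__":
    # Toy graph: papers A and B both cite paper C.
    toy_citations = {"A": ["C"], "B": ["C"], "C": []}
    toy_impact = {"A": 0.9, "B": 0.1, "C": 0.2}
    scores = combined_score(toy_citations, toy_impact, weight=0.5)
    for paper in sorted(scores, key=scores.get, reverse=True):
        print(paper, round(scores[paper], 3))
```

The blend lets a new paper with few citations still rank well when its predicted impact is high, which is the situation where a citation-only score is least informative.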