Development of an AI-first molecular database to accelerate drug discovery

Using simplified language understandable to a layperson, provide a general, one-paragraph description of the proposed research project to be undertaken by the intern(s) as well as the expected benefit to the partner organization. (100 - 150 words)
The project aims to develop a database of molecular compounds to accelerate drug discovery. Compounds shared by chemical providers are currently stored in large library files. Because of their size and number, these files are a bottleneck in virtual screening. The molecular database will gather them in a single centralised store, making library processing more efficient. The project will also develop partner relationships by offering a better data-sharing experience through a user interface and a programmatic client. Besides storing known compounds, the database will include a feature for enumerating compounds that are not yet known. Combined with an active learning model, this will let experts manually explore chemical subspaces to find hit compounds. The last phase of the project will extend open-source work to provide a privacy-preserving machine learning framework, built on the molecular database, that enables federated learning pipelines for virtual screening.

Sacha Levy
Faculty Supervisor: 
Reihaneh Rabbany
Partner University: