Machine Learning fusion proteins drug discovery targeting a cure for pediatric acute myelogenous leukemia - QC-669

Project type: Innovation
Desired discipline(s): Biochemistry / Molecular biology, Life Sciences, Genetics, Computer science, Mathematical Sciences
Company: sevenTM
Project Length: 6 months to 1 year
Preferred start date: As soon as possible.
Language requirement: English
Location(s): Montreal, QC, Canada; Canada
No. of positions: 2
Desired education level: Undergraduate/Bachelor, Master's, PhD, Postdoctoral fellow, Recent graduate
Open to applicants registered at an institution outside of Canada: No

About the company: 

sevenTM accelerates pre-clinical drug development by reducing risk and time to market through end-to-end automated Machine Learning pipelines. Where competitor products can only be used by bioinformatics experts after extensive training, sevenTM aims to build automated pipelines that put the state of the art in the hands of scientists who do not necessarily have expertise in bioinformatics or computational biology, such as medicinal chemists, biochemists, genetic engineers, molecular biologists, and clinical oncologists. To date, sevenTM’s in silico pipeline has achieved state-of-the-art capabilities in three crucial areas of drug discovery: 1) modeling the tertiary structure of drug receptors, in particular G protein-coupled receptors, the first focus of sevenTM and the target of about 35% of currently marketed drugs; 2) predicting the liver toxicity of small molecules from their structural features and the distribution of their chemical functional groups; and 3) de novo in silico design of compounds with desired or predicted physico-chemical features and realistic structures for real-world applications.

Describe the project:

In this project, we will develop a Machine Learning pipeline for fusion oncoproteins that aims to dramatically increase the efficiency and speed of targeting leukemic cells. Currently available therapies for Acute Myeloid Leukemia (AML) are toxic and largely ineffective, with a 5-year survival rate of only 60% among pediatric, adolescent, and young adult patients. Since fusion oncoproteins drive AML pathogenesis and are uniquely expressed in leukemic cells, they represent ideal targets for therapy. However, the structure and function of these fusion oncoproteins render them unamenable to traditional drug targeting.

Among the intractable challenges that must be overcome to develop therapeutics targeting these fusion oncoproteins, two stand out: 1) their structure must be determined, and 2) compounds that can selectively bind them must be identified. The consortium of McGill University, JACOBB, and sevenTM will extend a pipeline focused on the toxicity and efficacy of therapeutics for pediatric leukemia.

We address this problem by building Machine Learning models for fusion-protein structure identification and design. The algorithms to be produced will perform molecular/peptide docking and simultaneous ligand design/generation, specific to the fusion portion of the target fusion oncoprotein's structure.

Required expertise/skills: 

Knowledge of computer science/programming is necessary, ideally combined with knowledge of machine learning and/or algorithm design.


  • Assets: C, C++, Python, CUDA, and Rust programming
  • Expertise in genomics, bioinformatics, transcriptomics, computational biology
  • Expertise in fusion proteins, chromosomal aberrations leading to fusion proteins, transcription factors involved in leukemia/cancers, proteomics, and protein folding is an asset.