Dr Didier Devaurs

XDF Research Fellow

Background

I am a researcher in computer science, with a diverse background in mathematics, biology, chemistry, physics and psychology. Before moving into research, I worked as a software developer in industry and as a mathematics teacher in secondary education. I am now pursuing interdisciplinary research in the fields of biomedical computing, structural bioinformatics and computational structural biology.

In the past decade, my research has focused on computationally modelling, simulating and analysing complex physical systems, both in robotics and structural biology. At the algorithmic level, simulating mobile or flexible systems (such as robots and molecules) requires exploring a high-dimensional space: the space of all possible states of the system. My work has involved developing efficient algorithms and heuristics to address this challenge. I have published 26 articles (including 15 as first author) in international scientific journals, and 14 articles (including 8 as first author) in peer-reviewed conferences or workshops.

During my PhD in a robotics lab, I developed novel extensions of sampling-based path planning algorithms. I created the concept of optimal path planning in a cost space, and applied it to robot motion planning and molecular modelling (to efficiently explore the conformational space of a protein). Then, as a post-doctoral researcher, I developed computational methods for the conformational modelling and analysis of molecular systems, such as large proteins and protein-ligand complexes. First, I enhanced an adaptive-resolution conformational sampling method, so that its exploration could be guided by low-resolution structural information, such as the experimental data obtained via hydrogen exchange monitoring. Then, I implemented an incremental protein-ligand meta-docking tool, for the docking of large ligands to protein receptors, and applied it to peptide-HLA complexes (in the context of cancer immunotherapy).

For the past three years, at the Institute of Genetics and Cancer, I have been working in the field of quantitative biomedical research, on the following topics:

  • modelling the intracellular trafficking network of cell-adhesion proteins in a cancer cell
  • statistically analysing time series of white blood cell data from the Generation Scotland cohort
  • improving the coverage of deep mutational scanning experiments using machine learning techniques.

I am now pursuing a research project focused on addressing several quantitative limitations of deep mutational scanning experiments, with the long-term goal of producing clinical interpretations of protein variants in patients. This research is of great significance, as deep mutational scanning is a promising technique in the quest for personalised medicine. Indeed, a large component of the genetic basis of disease lies in rare variation, with every human carrying a few hundred genetic variants that are mostly so infrequent that they have never been detected by genome sequencing. Being able to derive the functional effects of rare mutations in important genes would be a crucial breakthrough for medical practice. To reach this goal, I apply state-of-the-art methods from deep learning to improve the quality of data produced by the most advanced mutation generation and DNA sequencing techniques, to systematically assess the effects of protein mutations.

Qualifications

Ph.D. in Artificial Intelligence, University of Toulouse, France

M.Sc. in Computer Science, Claude Bernard University, Lyon, France

B.Sc. in Computer Science, Blaise Pascal University, Clermont-Ferrand, France

Teacher certification in Mathematics, IUFM of Auvergne, Clermont-Ferrand, France

Responsibilities & affiliations

I serve on the editorial board of the following journals:

  • BMC Bioinformatics
  • Frontiers in Molecular Biosciences
  • Biophysical Reviews

Research summary

Protein structural sampling guided by experimental hydrogen-exchange data

Experimental data about a protein’s three-dimensional structure help us understand its function and possible dysfunctions. In addition, computational techniques exist to explore a protein's conformational space, i.e., the space of all possible states (or conformations) of the protein. However, experimentally observing and computationally modelling large proteins remain critical challenges for structural biology. To address this issue, I developed a novel approach that integrates an experimental technique with a computational method to analyse large proteins. I studied how the computational exploration of a protein’s conformational space could be guided by low-resolution structural information, such as the experimental data obtained through hydrogen exchange (HX) monitoring. To this end, I extended a computational framework called Structured Intuitive Move Selector (SIMS), which performs coarse-grained structural sampling: the perturbations (or moves) it applies to protein conformations do not all act on every degree of freedom (DoF) of the protein.

SIMS combines robotics-inspired structural sampling algorithms with the popular Rosetta library for protein modelling. I enhanced SIMS with an HX prediction method to allow structural sampling to be guided by experimental HX data. I proposed the following incremental method: at every round of the exploration process, conformations generated by structural sampling are filtered based on their fit to the experimental data, to select a starting point for the next round, in which protein regions with the worst fit are sampled more heavily than others. I published three applications of my method:

  • I showed that my method yields a better fit between HX data and computationally-generated protein conformations than other HX-guided conformational sampling methods.
  • I showed that I could analyse the inherent variability of a protein's native state (i.e., its equilibrium state in solution) and I confirmed the hypothesised stability of the complement protein C3d.
  • I showed that I could generate structural models for protein states described only by HX data. For example, I produced an atomistic model for the complement protein iC3b.
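
The round-based procedure described above (sample, filter by fit to the data, then resample the worst-fitting regions more heavily) can be illustrated with a toy sketch. This is not the SIMS implementation: a conformation is reduced here to a list of numbers, one hypothetical predicted HX value per protein region, and the names `fit_error` and `guided_sampling`, the Gaussian perturbation and all parameter values are assumptions made purely for illustration.

```python
import random


def fit_error(conformation, target):
    """Per-region absolute deviation between predicted and experimental values."""
    return [abs(c - t) for c, t in zip(conformation, target)]


def guided_sampling(start, target, rounds=20, samples_per_round=50, seed=0):
    """Toy version of a round-based sampling loop guided by experimental data."""
    rng = random.Random(seed)
    current = list(start)
    for _ in range(rounds):
        errors = fit_error(current, target)
        if sum(errors) == 0:
            break  # perfect fit, nothing left to improve
        candidates = [current]
        for _ in range(samples_per_round):
            # Regions with the worst fit are chosen for perturbation more often.
            region = rng.choices(range(len(current)), weights=errors)[0]
            candidate = list(current)
            candidate[region] += rng.gauss(0.0, 0.5)  # hypothetical "move"
            candidates.append(candidate)
        # Filtering step: the best-fitting conformation seeds the next round.
        current = min(candidates, key=lambda c: sum(fit_error(c, target)))
    return current
```

Because the current conformation is kept among the candidates, the total misfit in this sketch can never increase from one round to the next. A real method would replace the list of numbers with atomistic conformations and the Gaussian perturbation with structural moves.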

Main references:

  • Computational modeling of molecular structures guided by hydrogen-exchange data; Journal of the American Society for Mass Spectrometry; 2022
  • Revealing unknown protein structures using computational conformational sampling guided by experimental hydrogen-exchange data; International Journal of Molecular Sciences; 2018
  • Coarse-grained conformational sampling of protein structure improves the fit to experimental hydrogen-exchange data; Frontiers in Molecular Biosciences; 2017

Molecular docking of large ligands to protein receptors

Although a variety of software exists for the molecular docking of protein-ligand complexes, most docking tools can only deal with small drug-like ligands. The docking of large ligands, including peptides, is still considered a challenge in computational structural biology. To address this issue, I developed a molecular docking tool, called DINC, specifically aimed at dealing with large ligands, following a parallelized incremental meta-docking approach. DINC is a meta-docking tool in the sense that it uses existing docking software at its core. Following the divide-and-conquer paradigm, it was conceived as an incremental method that iteratively docks larger and larger overlapping fragments of a ligand in the protein’s binding site. For each fragment, several independent docking instances are run in parallel. Since only a subset of a fragment’s rotatable bonds is sampled by a given docking run, I ensured completeness by making all docking instances running in parallel sample different sets of bonds. This research was motivated by the study of peptide-MHC complexes for their role in cancer immunotherapy.
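
The incremental scheme described above can be sketched as follows. This is a minimal illustration rather than DINC's actual code: the function names, the default fragment sizes, the round-robin partition of bonds and the `dock` callable (a stand-in for the existing docking software used at the core) are all hypothetical. The runs for a fragment are executed sequentially here for simplicity, whereas a real tool would launch them in parallel.

```python
def fragment_schedule(n_bonds, initial_size=6, increment=3):
    """Number of rotatable bonds in each successive, growing fragment."""
    if increment < 1:
        raise ValueError("increment must be positive")
    sizes = [min(initial_size, n_bonds)]
    while sizes[-1] < n_bonds:
        sizes.append(min(sizes[-1] + increment, n_bonds))
    return sizes


def split_bonds(bonds, n_runs):
    """Deal bonds round-robin so that runs get disjoint sets covering all bonds."""
    subsets = [bonds[r::n_runs] for r in range(n_runs)]
    return [s for s in subsets if s]


def incremental_docking(n_bonds, dock, n_runs=3):
    """Dock growing fragments; the best pose of one round seeds the next."""
    pose = None
    for size in fragment_schedule(n_bonds):
        fragment = list(range(size))  # bonds are identified by their index
        results = [dock(fragment, active, pose)
                   for active in split_bonds(fragment, n_runs)]
        if results:
            pose = min(results, key=lambda r: r[0])  # lowest score wins
    return pose


def toy_dock(fragment, active_bonds, previous_pose):
    """Stand-in for a docking engine: returns a (score, pose) pair."""
    score = -float(len(fragment)) - 0.1 * active_bonds[0]
    return score, {"fragment": fragment, "active": active_bonds}


best_score, best_pose = incremental_docking(10, toy_dock)
```

With the hypothetical defaults, a ligand with 10 rotatable bonds is docked as fragments of 6, 9 and 10 bonds. Dealing the bonds round-robin is just one simple way to make sure that the runs sample different bond sets whose union covers the whole fragment.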

I extended this approach to address a limitation of DINC and numerous other docking tools: the fact that they do not account for receptor flexibility when docking a flexible ligand. In the context of the COVID-19 pandemic, my collaborators and I chose to specifically implement a computational tool for ensemble docking with SARS-CoV-2 proteins. We extracted representative ensembles of protein conformations from the Protein Data Bank and from computer simulations. Twelve pre-computed ensembles of SARS-CoV-2 protein conformations are available for ensemble docking via a user-friendly webserver called DINC-COVID. We validated DINC-COVID using tested inhibitors of two SARS-CoV-2 proteins, obtaining good correlations between docking-derived binding energies and experimentally-determined binding affinities.

Main references:

  • Hall-Swan, Devaurs, Rigo, Antunes, Kavraki & Zanatta; DINC-COVID: A webserver for ensemble docking with flexible SARS-CoV-2 proteins; Computers in Biology and Medicine; 2021
  • Devaurs, Antunes, Hall-Swan, Mitchell, Moll, Lizée & Kavraki; Using parallelized incremental meta-docking can solve the conformational sampling issue when docking large ligands to proteins; BMC Molecular and Cell Biology; 2019
  • Antunes, Devaurs, Moll, Lizée & Kavraki; General prediction of peptide-MHC binding modes using incremental docking: A proof of concept; Scientific Reports; 2018

Optimal path planning in cost spaces with sampling-based algorithms

During my PhD, I developed novel extensions of sampling-based path planning algorithms. Despite their conceptual simplicity, these algorithms can efficiently explore a high-dimensional space in a probabilistic manner and build a graph representing the topology of this space. They had traditionally been used in simple robotic applications to find feasible (i.e., collision-free) paths, without considering path quality. However, many applications require computing high-quality (i.e., low-cost) paths or even optimal paths, in the context of cost-space path planning or optimal path planning. To deal with ever more complex applications, I made the following contributions:

  • I enhanced a cost-space path planning algorithm, called Transition-based Rapidly-exploring Random Tree (T-RRT), by creating bidirectional and multiple-tree variants. I also proposed three parallel versions of T-RRT-like algorithms to improve scalability. Then, I used these algorithms to plan for 6-dimensional manipulation with a towed-cable system involving three aerial robots (in simulation).
  • I combined the paradigms of cost-space path planning and optimal path planning to create the concept of optimal path planning in a cost space. In this context, I developed two new algorithms (T-RRT* and Anytime T-RRT) for the Move3D robotic platform and the MoMA molecular modelling library. I also showed that both algorithms were probabilistically complete and asymptotically optimal. I applied them to the planning of industrial inspection tasks performed by flying robots (in simulation) and to the exploration of the energy landscape of small peptides.
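
To give a flavour of cost-space path planning, the sketch below shows a Metropolis-like transition test of the kind used by T-RRT-style planners to decide whether a tree extension towards a higher-cost state is accepted. It is a simplified illustration, not the Move3D or MoMA implementation: the class name, the initial temperature, the adaptation factor `alpha` and the failure threshold are assumptions.

```python
import math
import random


class TransitionTest:
    """Accepts downhill moves; accepts uphill moves with a probability
    that decreases with the cost increase and grows with the temperature."""

    def __init__(self, temperature=1e-3, alpha=2.0, max_failures=10, seed=0):
        self.temperature = temperature
        self.alpha = alpha                # hypothetical adaptation factor
        self.max_failures = max_failures  # hypothetical rejection threshold
        self.failures = 0
        self.rng = random.Random(seed)

    def accept(self, cost_from, cost_to):
        if cost_to <= cost_from:
            return True  # moving downhill (or on a plateau) is always allowed
        probability = math.exp(-(cost_to - cost_from) / self.temperature)
        if self.rng.random() < probability:
            # An uphill move succeeded: cool down to stay in low-cost regions.
            self.temperature /= self.alpha
            self.failures = 0
            return True
        self.failures += 1
        if self.failures > self.max_failures:
            # Too many rejections in a row: heat up to escape a cost basin.
            self.temperature *= self.alpha
            self.failures = 0
        return False
```

In a planner, `accept` would be called with the cost of the nearest tree node and the cost of the candidate state, and the extension would be discarded whenever the test returns `False`.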

Main references:

  • Optimal path planning in complex cost spaces with sampling-based algorithms; IEEE Transactions on Automation Science and Engineering; 2016
  • MoMA-LigPath: A web server to simulate protein-ligand unbinding; Nucleic Acids Research; 2013
  • Parallelizing RRT on large-scale distributed-memory architectures; IEEE Transactions on Robotics; 2013

Knowledge exchange

I have contributed to the following software:

  • DINC-COVID: Ensemble docking to SARS-CoV-2 proteins
  • DINC: Docking INCrementally
  • HLA-Arena: Structural modeling/analysis of peptide-HLA complexes
  • PEPSI-SAXS: Polynomial Expansions of Protein Structures and Interactions, applied to Small-Angle X-ray Scattering
  • SIMS: Structured Intuitive Move Selector
  • MoMA: Molecular Motion Algorithms
  • Move3D: Motion planning for robots
  • KnowSe: User context detection as a knowledge service
  • EcoSim: Individual-based ecosystem simulation

I have also contributed to the following work:

  • EnGens: A computational framework for generation and analysis of representative protein conformational ensembles

Former members:

  • Zihan Kong, M.Sc. student in bioinformatics (2023)
  • Tianyu Zhao, M.Sc. student in bioinformatics (2023)
  • Xinyu Liu, M.Sc.R. student in integrative biomedical sciences (2023)
  • Nicole Li, Honours student in pharmacology (2023)
  • Natalie Cruz, Honours student in biomedical sciences (2023)
  • Erika Lapienyte, Honours student in biomedical sciences (2023)
  • Tiefeng Song, M.Sc. student in drug discovery and translational biology (2022)
  • Jie Mei, M.Sc. student in drug discovery and translational biology (2022)
  • Mengze Zhang, M.Sc.R. student in biomedical sciences (2022)