MRC Human Genetics Unit
Medical Research Council Human Genetics Unit

Finding the breaking point of cancer

February 2019

Statistics to support the research

Cancer is a disease of the genome, and tumour sequencing projects have highlighted the enormous changes that the human genome undergoes as a tumour evolves. These changes often include massive structural mutations to chromosomes, in which regions millions of base pairs long can be deleted, duplicated, inverted or even swapped between chromosomes. Many of these structural variants (SVs) are thought to arise from DNA double-strand breaks (DSBs) that are not accurately repaired in the chaotic environment of a tumour cell. Although there is substantial evidence that SVs can drive the progression of tumours, it has been challenging to decide which SVs, among the many thousands detected, might be most advantageous to the tumour. A new study from the Semple lab, in collaboration with the Crosetto lab at the Karolinska Institutet, takes a fresh approach to this problem.

Tracy Ballinger, a bioinformatician in the Semple lab, has shown that it is possible to construct remarkably accurate computational models of DSBs, allowing us to predict the frequency of breakage expected across the human genome. It emerges that genomic regions differ in their susceptibility to breakage because of their underlying structure, and this allows us to make predictions across different cell types. When we compare the models to real SV breakpoints seen in tumours, we can highlight regions of interest. For instance, we see hundreds of genomic regions where few breaks are predicted but very high rates of breakage are seen in the real tumour data, suggesting that SVs in these regions may be selected because they enhance tumour progression. Intriguingly, we also see similar numbers of regions where the models predict high rates of breakage but very few breaks are observed in the tumours, suggesting that tumours may need these regions of the genome to remain intact. The regions implicated include many genes of interest, which could be the source of novel diagnostic markers or targets for new treatments.
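The comparison of predicted and observed breakage can be illustrated with a toy calculation. The sketch below is not the method used in the study: it assumes, purely for illustration, that breakpoint counts in each region follow a Poisson distribution whose mean is set by the model's predicted score, and it labels regions with far more or far fewer observed breaks than expected. The region names, scores, counts, the `flag_regions` helper and the `alpha` threshold are all hypothetical.

```python
import math


def poisson_cdf(k, mu):
    """Return P(X <= k) for X ~ Poisson(mu), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i    # P(X = i) from P(X = i - 1)
        total += term
    return min(total, 1.0)


def flag_regions(predicted, observed, alpha=0.01):
    """Label each region 'enriched', 'depleted' or 'as expected'.

    predicted: hypothetical model scores per region (relative breakage).
    observed:  hypothetical breakpoint counts per region in tumours.
    alpha:     illustrative tail-probability threshold (an assumption).
    """
    # Rescale model scores so expected counts sum to the observed total.
    scale = sum(observed.values()) / sum(predicted.values())
    labels = {}
    for region, score in predicted.items():
        mu = score * scale
        k = observed.get(region, 0)
        p_high = 1.0 - poisson_cdf(k - 1, mu)  # P(X >= k): excess breaks
        p_low = poisson_cdf(k, mu)             # P(X <= k): missing breaks
        if p_high < alpha:
            labels[region] = "enriched"
        elif p_low < alpha:
            labels[region] = "depleted"
        else:
            labels[region] = "as expected"
    return labels


# Made-up numbers, not data from the study.
predicted = {"region_A": 1.0, "region_B": 10.0, "region_C": 5.0, "region_D": 4.0}
observed = {"region_A": 12, "region_B": 0, "region_C": 4, "region_D": 4}
labels = flag_regions(predicted, observed)
```

In this toy example, a region labelled "enriched" corresponds to the first case described above (few breaks predicted, many observed), and a region labelled "depleted" to the second (many predicted, few observed). A real analysis would also need to handle multiple testing and differences between tumour types, which this sketch ignores.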

The Figure shows the relationship between predicted and observed rates of double-strand breaks (DSBs) for a DSB model based on cell lines studied in the lab, and suggests that the model is accurate. In each case, the model data (blue) are overlaid with data (red) from three different collections of breaks observed in tumours. The graphs suggest that the vast majority of regions frequently broken in tumours are accurately captured by the model.
