Need to derive more representative and annotated molecular datasets from diverse populations
Edinburgh researchers led a study that emphasises the need for more diverse and better annotated cancer transcriptomics datasets: August 2020
Transcription is the first of several steps of DNA-based gene expression, in which a particular segment of DNA is copied into RNA (especially mRNA) by the enzyme RNA polymerase. Quantitative analysis of RNA molecules in the cell can provide valuable information on the “activity state” and “expression patterns” of different genes. In fact, analysis of cellular RNA plays an increasingly important role in anticancer drug discovery, cancer diagnosis and the design of personalised therapies.
As pools of RNA transcripts in cells are also known as transcriptomes, the techniques used to study them are often referred to as transcriptomics technologies or simply transcriptomics. Transcriptomics is a powerful research tool. It is frequently used to analyse clinical samples from cancer patients, providing important insights into the molecular characteristics of their tumours.
An ever-increasing number of cancer transcriptomics datasets are now available, enabling researchers to perform highly informative retrospective gene expression analyses. For example, gene expression data from studies utilising cancer cell lines or animal models can be compared with clinical datasets to evaluate how reliably these model systems recapitulate the disease. The clinical datasets can also be used to assess associations between putative oncogenes or tumour suppressors and different signalling pathways or clinical characteristics, and to examine whether certain subgroups of tumours have elevated or reduced expression of particular genes. It is therefore important that the spectrum of available clinical cancer transcriptomics datasets accurately reflects the spectrum of tumours at the population level, and that these clinical datasets are annotated with the information needed to maximise their utility.
In a recent study titled “Breast cancer gene expression datasets do not reflect the disease at the population level”, published in the journal npj Breast Cancer, investigators from the University of Edinburgh, UK and the National Cancer Institute, USA describe their conclusions from analysing 70 breast cancer datasets accounting for 16,130 patients from 20 countries across 5 continents. The work, led by Doctor Andrew Sims and Doctor Jonine Figueroa from Edinburgh Cancer Research Centre, demonstrated that publicly available breast cancer gene expression datasets tend to be enriched for high-grade, estrogen receptor-negative (ER-) tumours from patients of European ancestry. The results of the study emphasise the need to derive more representative and better annotated molecular datasets from diverse populations. The paper also suggests possible ways to achieve this.
The work was supported by funding from Cancer Research UK, Breast Cancer Now, Wellcome Trust, UKRI Global Challenges Research Fund and the National Cancer Institute.
Article in npj Breast Cancer: https://www.nature.com/articles/s41523-020-00180-x
Doctor Andrew Sims Group website: https://www.ed.ac.uk/cancer-centre/research/sims-group
Doctor Jonine Figueroa Group website: https://www.ed.ac.uk/cancer-centre/research/jonine-figueroa
Information about breast cancer: https://www.cancerresearchuk.org/about-cancer/breast-cancer
Facing breast cancer: https://breastcancernow.org/information-support/facing-breast-cancer
Information about transcription: https://www.khanacademy.org/science/biology/gene-expression-central-dogma/transcription-of-dna-into-rna/a/overview-of-transcription