MRC Human Genetics Unit
Medical Research Council Human Genetics Unit

Catalina Vallejos Research Group

Biomedical Data Science

Dr Catalina Vallejos – Reader
Research in a Nutshell 

While biomedical data sometimes qualifies as “big data” (where the number of samples and/or variables is large), complexity is its most prominent feature. This complexity arises from a combination of sources of heterogeneity: heterogeneity across individuals in a population (e.g. response to treatment), heterogeneity in the types of data we collect (e.g. health records and genomics) and heterogeneity introduced by the data collection process (e.g. measurement error).

We focus on the development of novel statistical methodology to address and study these sources of heterogeneity. This is a highly multidisciplinary task, ranging from understanding complex biomedical problems and technologies to developing new methodology and implementing open-source analysis tools. Our current research focuses on two areas of application. The first is single-cell RNA sequencing, a cutting-edge experimental technique that allows genome-wide quantification of gene expression on a cell-by-cell basis. The second is electronic health records research, where we develop predictive models based on observational data that is routinely collected by health providers (e.g. the NHS). Developing computational tools that take full advantage of the rich information provided by these data sources should improve our understanding of health and disease and play an important role in precision medicine initiatives.

Group External Website

Catalina Vallejos Group

People

 
Dr Catalina Vallejos

Group Leader

Begoña Bolos

CRUK PhD student (co-supervised)

Veronica Finazzi

EpiCrossBorders PhD student (co-supervised; based in Munich)

Yipeng Cheng

Edinburgh Helsinki Program in Human Genomics PhD student (co-supervised)

Louis Chislett

HDRUK/Turing Wellcome programme in Health Data Science PhD student

Dr Nathan Constantine-Cooke

Postdoctoral Research Associate

Dr Karla Monterrubio-Gomez

Postdoctoral Research Associate

Linda Nguyen

MRC Precision Medicine PhD student (co-supervised)

Emma Yang

MRC HGU PhD student (co-supervised)

Contact

catalina.vallejos@ed.ac.uk

Publications

  1. Liley, J., Emerson, S. R., Mateen, B. A., Vallejos, C. A., Aslett, L. J. M., & Vollmer, S. J. (2021). Model updating after interventions paradoxically introduces bias. Paper presented at 24th International Conference on Artificial Intelligence and Statistics.
  2. Kapourani, A., Argelaguet, R., Sanguinetti, G., & Vallejos, C. A. (2021). scMET: Bayesian modelling of DNA methylation heterogeneity at single-cell resolution. Genome Biology. 10.1186/s13059-021-02329-8
  3. Lähnemann, D., Köster, J., Szczurek, E., McCarthy, D. J., Hicks, S. C., Robinson, M. D., Vallejos, C. A., Campbell, K. R., Beerenwinkel, N., Mahfouz, A., Pinello, L., Skums, P., Stamatakis, A., Attolini, C. S-O., Aparicio, S., Baaijens, J., Balvert, M., Barbanson, B. D., Cappuccio, A., ... Schönhuth, A. (2020). Eleven grand challenges in single-cell data science. Genome Biology, 21(1), 31. 10.1186/s13059-020-1926-6
  4. Richter, M. L., Deligiannis, I. K., Yin, K., Danese, A., Lleshi, E., Coupland, P., Vallejos, C. A., Matchett, K. P., Henderson, N. C., Colome-Tatche, M., & Martinez-Jimenez, C. P. (2021). Single-nucleus RNA-seq2 reveals a functional crosstalk between liver zonation and ploidy. Nature Communications. 10.1038/s41467-021-24543-5
  5. Maniatis, C., Vallejos, C. A., & Sanguinetti, G. (2022). SCRaPL: A Bayesian hierarchical framework for detecting technical associates in single cell multiomics data. PLOS Computational Biology, 18(6), e1010163. 10.1371/journal.pcbi.1010163. PMID: 35727848; PMCID: PMC9249169.

Full publication list can be found on Research Explorer: Catalina Vallejos Meneses — University of Edinburgh Research Explorer

Partners and Funders

  • The Alan Turing Institute
  • British Heart Foundation

Scientific Themes

statistical genomics, single cell sequencing, risk prediction, electronic health records