Catalina Vallejos Research Group
Biomedical Data Science
Section: Biomedical Genomics
Research in a Nutshell
While biomedical data sometimes qualifies as “big data” (where the number of samples and/or variables is large), complexity is its most prominent feature. This complexity arises from a combination of different sources of heterogeneity: heterogeneity across individuals in a population (e.g. response to treatment), heterogeneity in the type of data we collect (e.g. health records & genomics) and heterogeneity that is introduced by the data collection process (e.g. measurement error).
We focus on the development of novel statistical methodology to address and study these sources of heterogeneity. This is a highly multidisciplinary task: from the understanding of complex biomedical problems and technologies, to the development of new methodology and the implementation of open-source analysis tools. Our current research focuses on two areas of application. The first is single-cell RNA-sequencing, a cutting-edge experimental technique that allows genome-wide quantification of gene expression on a cell-by-cell basis. The second is electronic health records research, where we develop predictive models based on observational data that is routinely collected by health providers (e.g. the NHS). Developing computational tools that can take full advantage of the rich information provided by these data sources should improve our understanding of health and disease and play an important role in precision medicine initiatives.
|Dr Catalina Vallejos||Group Leader|
|Dr Chantriolnt-Andreas Kapourani||Cross-Disciplinary Fellow|
|Nathan Constantine-Cooke||PhD student|
|Yipeng Cheng||PhD student|
|Dr James Liley||Postdoctoral research fellow|
|Christos Maniatis||PhD student|
|Dr Andrew Papanastasiou||Cross-Disciplinary Fellow|
|Alan O'Callaghan||PhD student|
Dr Karla Monterrubio-Gomez
|Rachel Jackson||BSc student - Honours project|
- Liley, J., Emerson, S. R., Mateen, B. A., Vallejos, C. A., Aslett, L. J. M., & Vollmer, S. J. (2021). Model updating after interventions paradoxically introduces bias. Paper presented at 24th International Conference on Artificial Intelligence and Statistics.
- Kapourani, A., Argelaguet, R., Sanguinetti, G., & Vallejos, C. A. (2021). scMET: Bayesian modelling of DNA methylation heterogeneity at single-cell resolution. Genome Biology. 10.1186/s13059-021-02329-8
- Lähnemann, D., Köster, J., Szczurek, E., McCarthy, D. J., Hicks, S. C., Robinson, M. D., Vallejos, C. A., Campbell, K. R., Beerenwinkel, N., Mahfouz, A., Pinello, L., Skums, P., Stamatakis, A., Attolini, C. S-O., Aparicio, S., Baaijens, J., Balvert, M., Barbanson, B. D., Cappuccio, A., ... Schönhuth, A. (2020). Eleven grand challenges in single-cell data science. Genome Biology, 21(1), 31. 10.1186/s13059-020-1926-6
- Richter, M. L., Deligiannis, I. K., Yin, K., Danese, A., Lleshi, E., Coupland, P., Vallejos, C. A., Matchett, K. P., Henderson, N. C., Colome-Tatche, M., & Martinez-Jimenez, C. P. (2021). Single-nucleus RNA-seq2 reveals a functional crosstalk between liver zonation and ploidy. Nature Communications. 10.1038/s41467-021-24543-5
- Maniatis, C., Vallejos, C. A., & Sanguinetti, G. (2022). SCRaPL: A Bayesian hierarchical framework for detecting technical associates in single cell multiomics data. PLoS Computational Biology, 18(6), e1010163. 10.1371/journal.pcbi.1010163. PMID: 35727848; PMCID: PMC9249169.
The full publication list can be found on the University of Edinburgh Research Explorer page for Catalina Vallejos Meneses.
Partners and Funders
- The Alan Turing Institute
- British Heart Foundation
statistical genomics, electronic health records research