I am interested in how statistical modelling and high-performance computing can exploit biological “big data” (microarrays, screens, next-generation sequencing), where new or improved methodology can either a) provide direct insights into mechanisms of action (gene signalling in host-pathogen model systems) or b) be of direct utility for human infection and disease, in terms of diagnostics or prognostics. Biological “big data” are often underexploited: analyses tend to identify only the easy-to-detect biomarker candidates or data patterns. In my research, I aim to increase the biological and clinical use of very large data sets in three ways: by drawing on biological domain knowledge (allowing analysts to propose biologically testable hypotheses), by improving statistical models and methodologies to detect smaller but consistent effects, and by using high-performance computing to enable otherwise impossible modes of analysis (e.g. detection of subtle patterns through machine learning algorithms).
This article was published on Feb 7, 2014