Professor Rowland Kao on disease transmission networks

Modelling networks of transmission to better understand the spread of diseases.

Professor Rowland Kao, Chair of Veterinary Epidemiology and Data Science, studies the epidemiology of infectious diseases. He tells Science Communication Intern Maggie Szymanska how he uses data to model their spread, and the joy of making new discoveries from data sets.

Tell me about your research in a nutshell.

I work on the epidemiology of infectious diseases – that is, diseases which are transmitted from individual to individual. I study mostly livestock diseases, though I also work on some diseases of humans and wildlife. I view disease spread as a network in which we can map out contact between individuals. The pattern of contact determines how fast disease spreads, where it goes, and whether there are ways to control or stop it more efficiently.

For example, suppose you can map out, for every individual person, not just some but all the contacts they have with others. One thing that you’ll often see is that, while most individuals are relatively inactive, some individuals come into contact with lots of other people. Because every contact acts to increase both your risk of becoming infected and the risk that you’d then infect others, these individuals are potentially way more important than average – twice as many contacts means four times as important, three times as many means nine times more important. Because those highly connected few are so important, they are called super-spreaders.
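The squared relationship described above can be sketched in a few lines of Python. This is only an illustration, not a model from Professor Kao's work: the function name and the baseline of ten contacts are assumptions made up for the example.

```python
def relative_importance(contacts, baseline_contacts):
    """Importance relative to a baseline individual.

    Each contact raises both the chance of being infected and the chance
    of passing infection on, so the two effects multiply: the ratio of
    contacts is squared.
    """
    return (contacts / baseline_contacts) ** 2


# Hypothetical baseline: an average individual with 10 contacts.
BASELINE = 10

twice_as_many = relative_importance(20, BASELINE)
three_times_as_many = relative_importance(30, BASELINE)

print(twice_as_many, three_times_as_many)
```

With these made-up numbers, doubling the contacts gives a factor of four and tripling gives a factor of nine, matching the figures quoted in the interview.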

If we could identify underlying risk factors that predict who these individuals might be – without having to go to the trouble of finding all their contacts – and vaccinate these individuals against infection, that might contain disease much more easily than if you simply chose people at random to vaccinate – and so targeting super-spreaders becomes really important.
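To see why targeting might matter so much, here is a deliberately crude sketch. It assumes each individual's contribution to spread scales with the square of their contact count, and it ignores the way vaccinating one individual also changes everyone else's contacts. The population and the function name are hypothetical.

```python
def share_of_risk_removed(contacts, vaccinated):
    """Fraction of total squared-contact 'risk' removed by vaccination.

    contacts   -- list of contact counts, one per individual
    vaccinated -- indices of the individuals who are vaccinated
    """
    total = sum(k ** 2 for k in contacts)
    removed = sum(contacts[i] ** 2 for i in vaccinated)
    return removed / total


# Hypothetical population: nine individuals with 2 contacts each
# and one highly connected individual with 22.
population = [2] * 9 + [22]

targeted = share_of_risk_removed(population, vaccinated=[9])  # the super-spreader
untargeted = share_of_risk_removed(population, vaccinated=[0])  # a typical individual

print(f"targeted: {targeted:.1%}, untargeted: {untargeted:.1%}")
```

In this toy population a single well-chosen vaccination removes most of the risk, while vaccinating a typical individual removes almost none of it. Real contact data would be far less clear-cut.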

Conversely, if for some reason you can’t find all of the super-spreaders, it might be that disease would continue to spread even if you vaccinated the majority of the population. Of course, it’s very rare that we know exactly what types of contact one individual human has. However, in livestock, looking for individuals that have high numbers of contacts with others can be done quite easily. We have, in Britain and other countries, records of cows, pigs and sheep as they move from farm to farm on a day-to-day basis, which is an incredibly detailed record of populations being tracked. An additional advantage is that we can also track pathogens – bacteria and viruses – from samples taken from affected individuals. If the record of movement matches up to the way the disease transmits, this validates that our network information is useful.

How do you look at the pattern of disease spread?

We can look at simple measures that tell us that some individuals are more important than others; super-spreaders are the obvious example. The important quantity in this case is what is known as the variance-mean ratio. We take the variance in the number of contacts per individual across the entire population and divide it by the mean – that gives a measure of how rapidly the disease spreads. Of course, simple measures only partially reflect the full complexity of population contact. If we want to use the data we have in more detail, computers are now fast and cheap to run, so we also use simulations – we create a computer program that tracks the record of activity, simulate introducing disease and watch how it spreads. Simulations can contain lots of information that simple network measures cannot. For example, the order of events in time can be very important: if you get infected by somebody in the afternoon, then someone who only had contact with you earlier that morning is protected. Similarly, if infectiousness only lasts for three hours, your entire infectious period may occur while you’re not in contact with anybody, and disease might not transmit to others at all, even if the probability of infection once contact is made is high.
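A minimal sketch of the variance-mean ratio, assuming we have a list giving the number of contacts for every individual in the population. The two populations below are invented for illustration; they share the same average number of contacts but differ in how unevenly those contacts are distributed.

```python
def variance_to_mean_ratio(contacts):
    """Population variance of contact counts divided by their mean."""
    n = len(contacts)
    mean = sum(contacts) / n
    variance = sum((k - mean) ** 2 for k in contacts) / n
    return variance / mean


# Hypothetical populations, both averaging 4 contacts per individual.
even_population = [4] * 10            # everyone has the same number of contacts
skewed_population = [2] * 9 + [22]    # one highly connected individual

even_ratio = variance_to_mean_ratio(even_population)
skewed_ratio = variance_to_mean_ratio(skewed_population)

print(even_ratio, skewed_ratio)
```

The even population has a ratio of zero, while the single highly connected individual pushes the skewed population's ratio to nine, even though the mean is unchanged. That is the sense in which the ratio picks up the presence of super-spreaders.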

In simulations you can do all this at once – the combination of contacts, the order of contacts and the impact of other factors such as the most infectious times of day. Of course, in order to generalise, you then have to figure out why a simulation is doing what it is doing, and how to make it so that it doesn’t happen again.
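The effect of timing can be sketched with a toy simulation. This is not the simulation code used by Professor Kao's group: it assumes every contact during an infectious period transmits with certainty, that individuals are infectious as soon as they are infected, and that times are hours on a single clock. All names and numbers are hypothetical.

```python
def simulate(contacts, seed, infectious_for):
    """Replay timed contacts in order and return when each individual was infected.

    contacts       -- list of (hour, person_a, person_b) tuples
    seed           -- the individual infected at hour 0
    infectious_for -- number of hours an individual stays infectious
    """
    infected_at = {seed: 0}
    for hour, a, b in sorted(contacts):
        # A contact can transmit in either direction.
        for source, target in ((a, b), (b, a)):
            if source in infected_at and target not in infected_at:
                start = infected_at[source]
                if start <= hour < start + infectious_for:
                    infected_at[target] = hour
    return infected_at


# B meets C in the morning, then is infected by A in the afternoon: C is protected.
late_link = simulate([(9, "B", "C"), (15, "A", "B")], seed="A", infectious_for=24)

# Same pairs of contacts in the opposite order: infection reaches C.
early_link = simulate([(9, "A", "B"), (15, "B", "C")], seed="A", infectious_for=24)

# A is only infectious for three hours and meets nobody in that window.
short_lived = simulate([(9, "A", "B"), (15, "B", "C")], seed="A", infectious_for=3)

print(late_link, early_link, short_lived)
```

The same set of contacts produces three different outcomes depending on their order and on how long infectiousness lasts, which is information a static count of contacts cannot capture.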

How did you become interested in this field?

I did my PhD in physics but I wanted to do something different after that and took a job modelling livestock infections in New Zealand. At the time I became really fascinated by the idea. I remember seeing in our local museum an exhibit about a small island – Isle Royale – where about 100 years ago, an especially hard winter meant a population of moose managed to make it onto the island. Maybe 50 years after that, a pair of wolves made it onto the island. Initially there weren’t many wolves, there were lots of moose, and the moose population went up. Then the wolves had a lot to eat, their numbers went up, they ate more moose, the moose declined, and so on. The exhibit showed not only how the wolf and moose populations tracked each other, but also how you could use mathematical equations to describe how it all worked. I remember, as a child, thinking how fascinating it was to use equations to describe how populations grow or get smaller.

I came to the UK soon after the 2001 foot-and-mouth disease outbreak. It was very clear that it was started by movements of sheep – a few sheep from Cumbria moving around Britain made a huge impact, and in that signature you could see network relationships including the one I described above – the super-spreader effect. In this case it was a large auction market, Longtown, that was bringing infected sheep together with uninfected ones, allowing infection to spread on farms and then spreading even more infected sheep to farms all around the country.

For me, this was an amazing opportunity from a scientific perspective: working on how infectious diseases might transmit in theory, then seeing it in real life in the foot-and-mouth epidemic, and getting data together that showed the theory really works. Data, theory, real-life problem and impact were all coming together.

How do you see the future of your research?

Things are moving towards greater integration of more data sets to understand problems we’ve known about for a long time, but on which we as a group haven’t made much progress. Traditional mathematical models are now being joined with machine learning tools that allow us to go through large volumes of data rapidly and identify the important factors involved. This implies a better use of complex data sets, better use of computers and better use of algorithms, in combination with process models that provide an underlying sense of the mechanisms by which transmission occurs.

A really important, relatively recent innovation has been next-generation sequencing, which allows us to record the genetic code of pathogens – bacteria and viruses – very cheaply and has improved how we analyse it. By generating sequences from individuals who were infected in the same outbreak and comparing them, we are able to track diseases and understand how contact patterns interact with other risk factors to affect disease, thereby improving control.
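One simple way to compare pathogen sequences is to count the positions at which two aligned sequences differ. The sketch below assumes that approach; the interview does not say which comparison method is actually used. The farm names and the ten-letter sequences are invented and far shorter than any real genome.

```python
def snp_distance(seq_a, seq_b):
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


# Hypothetical aligned pathogen sequences sampled from three farms.
samples = {
    "farm_1": "ACGTACGTAC",
    "farm_2": "ACGTACGTAT",
    "farm_3": "TCGAACGGAT",
}

close_pair = snp_distance(samples["farm_1"], samples["farm_2"])
distant_pair = snp_distance(samples["farm_1"], samples["farm_3"])

print(close_pair, distant_pair)
```

A small distance suggests two samples may be closely linked in the chain of transmission, and a larger one suggests they are further apart. That evidence can then be set against the recorded movements between the farms.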

Of course as models become bigger and more sophisticated, the role of human behaviour becomes an increasingly important factor, especially as people will change their behaviour in response to circumstance. So for example, if a farmer buys infected livestock, they may become more cautious about their buying sources as a result. Integrating that aspect of human behaviour is another element in our research.

What’s your most interesting study from your time here?

I really like network analysis. I’ve been working with my staff scientist, Chris Banks, and with Jess Enright in the Global Academy for Agriculture and Food Security to look at Twitter and bovine TB – examining the pattern of tweets on polarised issues to see how rare, extreme behaviours influence rational debate. I think it is really interesting because it relates to fundamentals of human behaviour. The work on whole genome sequencing of pathogens is really great too. We were the first to study the sequences of the bacteria that cause bovine TB in order to look at its transmission. You feel like you’re opening King Tut’s tomb – it’s like being the first ones to see what is inside. Every time you look into new data there’s a little bit of that feeling, because nobody has seen it. When it’s truly new, like the TB work, when it hasn’t been done before, you’ve got that extra kick that says “this is great”.

What’s the most rewarding part of your research?

When you can come up with a simple, theoretically driven argument and show that it actually has relevance to real data. The highly computational, highly algorithmic understanding of things such as machine learning is really powerful, and we’d be crazy not to exploit that, but there is so much more power in identifying mechanisms, finding a simple thing you can measure or predict and showing that simple thing to be actually useful. It’s all about understanding. Without understanding, the data can still be useful, but it’s missing a lot and, in my view at least, misses the best of the science.

What would you be doing if you hadn’t become a scientist?

When I finished school I had to decide between studying history and engineering physics. I chose the latter, but things could easily have turned out differently. I suspect I would have still been an academic, though – history or science, for me it’s all about the spirit of inquiry.
