Scientists deploy machine learning to close data gaps on Ethiopian animal health
Automation will speed up data-gathering
By Vanessa Meadu
Scientists at the University of Edinburgh have achieved an important milestone in their effort to improve the evidence base on livestock health in low- and middle-income countries. Vital data and evidence on animal health are often scattered or hard to access. Now, scientists have devised a novel approach to automate and accelerate data extraction from a large number of scientific documents. This collaboration between the Bayes Centre and Supporting Evidence Based Interventions (SEBI) will help SEBI map the available evidence on disease prevalence and mortality, and expose current knowledge gaps and clusters.
Co-authors Seraphina Goldfarb-Tarrant and Alexander Robertson presented their work at the recent virtual 2020 Conference on Empirical Methods in Natural Language Processing. In an accompanying paper, they detail how automation is replacing some of the time-consuming manual processes for producing systematic evidence maps.
In the paper, the authors outline how automation can be used in the three main stages of producing systematic evidence maps: “searching for documents can be done via APIs and scrapers, selection of relevant documents can be done via binary classification, and extraction of data can be done via sequence-labelling classification,” they explain.
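The three stages can be illustrated with a toy sketch in Python. This is not the authors' pipeline: the in-memory corpus, the keyword lists, the label names and every function name here are hypothetical stand-ins. A real system would query APIs or scrapers for the search step and use trained models for the classification and sequence-labelling steps, whereas this sketch uses simple rules so that it runs on its own.

```python
import re

# Hypothetical corpus standing in for documents returned by an API or scraper.
CORPUS = [
    "Prevalence of mastitis in Ethiopian dairy cattle was 12.5 %",
    "Ethiopian coffee exports rose by 8 % last year",
    "A new architecture for machine translation",
]

# Hypothetical vocabularies; a trained model would replace these rules.
RELEVANT_TERMS = {"prevalence", "mortality", "cattle", "livestock", "disease"}
SPECIES = {"cattle", "sheep", "goats", "poultry"}


def search(query, corpus):
    """Stage 1 (search): keep documents that mention the query string."""
    q = query.lower()
    return [doc for doc in corpus if q in doc.lower()]


def is_relevant(doc, threshold=2):
    """Stage 2 (selection): binary decision, relevant or not relevant."""
    tokens = set(doc.lower().split())
    return len(tokens & RELEVANT_TERMS) >= threshold


def label_tokens(tokens):
    """Stage 3a (extraction): assign one BIO label to every token."""
    labels = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if re.fullmatch(r"\d+(\.\d+)?", tok) and nxt == "%":
            labels.append("B-RATE")      # number that starts a percentage
        elif tok == "%" and labels and labels[-1] == "B-RATE":
            labels.append("I-RATE")      # percent sign continuing the rate
        elif tok.lower() in SPECIES:
            labels.append("B-SPECIES")
        else:
            labels.append("O")           # outside any entity
    return labels


def extract_spans(tokens, labels):
    """Stage 3b: merge B-/I- labelled tokens into (type, text) spans."""
    spans, current = [], None
    for tok, lab in zip(tokens, labels):
        if lab.startswith("B-"):
            if current:
                spans.append(current)
            current = (lab[2:], [tok])
        elif lab.startswith("I-") and current and current[0] == lab[2:]:
            current[1].append(tok)
        else:
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(kind, " ".join(toks)) for kind, toks in spans]


def run_pipeline(query, corpus=CORPUS):
    """Chain search, selection and extraction over the corpus."""
    results = []
    for doc in search(query, corpus):
        if is_relevant(doc):
            tokens = doc.split()
            results.append(extract_spans(tokens, label_tokens(tokens)))
    return results
```

Calling `run_pipeline("ethiopia")` on this toy corpus finds two documents in the search step, discards the coffee article at the selection step, and extracts a species and a rate from the remaining one. The point of the sketch is the shape of the pipeline: each stage narrows the input for the next, so the costly extraction step only runs on documents already judged relevant.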
The authors constructed a new automation pipeline that also tests some of the trade-offs between human time saved and quality of outputs. They found that the system can gather data with “surprising accuracy and generalisability… in only 15% of the time it takes to do the whole review manually and can be repeated and extended to new data with no additional effort.”
This approach is currently being applied to assemble evidence on animal disease prevalence and mortality rates for Ethiopia. The data will eventually be presented via an interactive dashboard, allowing researchers to quickly judge where the evidence is strong and where further research is needed. The ultimate goal of this work is to provide better evidence for decision-makers who want to invest in animal health in low- and middle-income countries.
This work is done under the Supporting Evidence Based Interventions (SEBI) programme, which is awarded to the Royal (Dick) School of Veterinary Studies at the University of Edinburgh. Additional funds are provided by the South East Scotland City Region Deal’s Data-Driven Innovation (DDI) initiative.
Read the paper
- Goldfarb-Tarrant, S., Robertson, A., Lazic, J., Tsouloufi, T., Donnison, L., & Smyth, K. (2020, November). Scaling Systematic Literature Reviews with Machine Learning Pipelines. In Proceedings of the First Workshop on Scholarly Document Processing (pp. 184-195). https://www.aclweb.org/anthology/2020.sdp-1.21/