Dr Beatrice Alex
Senior Lecturer and Chancellor's Fellow in Text Mining

- Edinburgh Futures Institute
- School of Literatures, Languages and Cultures
- School of Informatics
Contact details
Address
- Street
- 50 George Square, Room 2.46
- City
- Edinburgh
- Post code
- EH8 9JU
Availability
Office hour: Tuesdays 2-3pm
Background
Beatrice Alex graduated in Languages, Translation and Interpreting (French and Russian) from Heriot-Watt University and received post-graduate training in computational linguistics and speech and language processing at the University of Edinburgh. She obtained her MSc in Speech and Language Processing and her Euromasters in Speech Processing in 2002 and her PhD on automatically detecting anglicisms in French and German text in 2008. After that, she held a position as a Research Fellow at the School of Informatics at the University of Edinburgh for a number of years working on text mining for different applications in healthcare, biomedicine, literature and history. Since 2018, she has been Chancellor's Fellow at the Edinburgh Futures Institute (EFI) and the School of Literatures, Languages and Cultures as well as Turing Fellow at The Alan Turing Institute and the School of Informatics. She was promoted to Senior Lecturer in 2021.
Since 2018, she has been the head of the Edinburgh Language Technology Group (LTG), a research and development group working in the area of natural language engineering at the University of Edinburgh. She also leads the Edinburgh Clinical NLP Group.
Her research focuses on text mining and natural language processing to extract information from raw text. She is currently leading NLP work in the Advanced Care Research Centre and the AIM-CISC (Artificial Intelligence and Multimorbidity: Clustering in Individuals, Space and Clinical Context) project, with a focus on developing tools that can assist in predicting multimorbidity and adverse drug events to improve care in later life. She is also leading the NLP work in the Warbler project, which aims to phenotype and analyse 1.7 million brain imaging reports of the Scottish population.
On the humanities side, she leads the text mining and analysis work carried out by the Scottish Gaelic Algorithmic Research Group. Previously, she was part of the Palimpsest project on Mining Literary Edinburgh, and she is one of the core developers of the Edinburgh Geoparser.
She teaches text mining and computational text analysis methods to EFI students.
Responsibilities & affiliations
Most recently, Alex co-chaired the HealTAC 2021 conference on healthcare text analytics and is an editor on a special edition of Frontiers in Digital Health on "Healthcare Text Analytics: Unlocking the Evidence from Free Text".
Previously, she co-organised the LaTeCH and LaTeCH-CLfL workshops and continues to serve on their programme committees. She was co-convener of the Humanities and Data Science special interest group at The Alan Turing Institute. Alex also serves as an editor on the Journal of Open Humanities Data.
Postgraduate teaching
Course organiser:
- Text Mining for Social Research (fusion onsite and online)
Co-course organiser:
- Narrative and Computational Text Analysis
Open to PhD supervision enquiries?
Yes
Research summary
Dr Alex's research interests include text mining for written text and speech transcripts. Her work is applied in domains such as healthcare and digital humanities.
Research activities
- Ontology-driven and weakly supervised rare disease identification from clinical notes
  In: BMC Medical Informatics and Decision Making, vol. 23
  DOI: https://doi.org/10.1186/s12911-023-02181-9
  Research output: Contribution to Journal › Article (Published)
- Detecting Adverse Drug Events from social media: A brief literature review
  Research output: Conference contribution (Published)
- Automated clinical coding: What, why, and where we are?
  (8 pages)
  In: npj Digital Medicine, vol. 5, pp. 1-8
  DOI: https://doi.org/10.1038/s41746-022-00705-7
  Research output: Contribution to Journal › Review article (Published)
- Edinburgh_UCL_Health@SMM4H'22: From Glove to Flair for handling imbalanced healthcare corpora related to Adverse Drug Events, Change in medication and self-reporting vaccination
  Research output: Conference contribution (Published)
- ISARIC-COVID-19 dataset: A Prospective, Standardized, Global Dataset of Patients Hospitalized with COVID-19
  In: Scientific Data, vol. 9, pp. 454
  DOI: https://doi.org/10.1038/s41597-022-01534-9
  Research output: Contribution to Journal › Article (Published)
- Uncertainty and inclusivity in gender bias annotation: An annotation taxonomy and annotated datasets of British English text
  (28 pages)
  DOI: https://doi.org/10.18653/v1/2022.gebnlp-1.4
  Research output: Contribution to Workshop › Conference contribution (Published)
- Beyond explanation: A case for exploratory text visualizations of non-aggregated, annotated datasets
  Research output: Contribution to Workshop › Paper (Published)
- Handwriting recognition for Scottish Gaelic
  (11 pages)
  Research output: Contribution to Workshop › Conference contribution (Published)
- Developing automatic speech recognition for Scottish Gaelic
  (11 pages)
  Research output: Contribution to Workshop › Conference contribution (Published)
- Horses to Zebras: Ontology-Guided Data Augmentation and Synthesis for ICD-9 Coding
  (13 pages)
  DOI: https://doi.org/10.18653/v1/2022.bionlp-1.39
  Research output: Contribution to Workshop › Conference contribution (Published)
- Ontology-based and weakly supervised rare disease phenotyping from clinical notes
  DOI: https://doi.org/10.48550/arXiv.2205.05656
  Research output: Preprint (Published)
- Automated Clinical Coding: What, Why, and Where We Are?
  (8 pages)
  DOI: https://doi.org/10.48550/arXiv.2203.11092
  Research output: Preprint (Published)
- The Lothian Diary Project: Sociolinguistic methods during the COVID-19 lockdown
  (10 pages)
  In: Linguistics Vanguard, vol. 8, pp. 321-330
  DOI: https://doi.org/10.1515/lingvan-2021-0053
  Research output: Contribution to Journal › Article (Published)
- CoPHE: A Count-Preserving Hierarchical Evaluation Metric in Large-Scale Multi-Label Text Classification
  (6 pages)
  DOI: https://doi.org/10.18653/v1/2021.emnlp-main.69
  Research output: Contribution to Conference › Conference contribution (Published)
- Extending defoe for the efficient analysis of historical texts at scale
  (9 pages)
  DOI: https://doi.org/10.1109/eScience51609.2021.00012
  Research output: Contribution to Conference › Conference contribution (Published)
- The reporting quality of natural language processing studies - systematic review of studies of radiology reports
  In: BMC Medical Imaging
  DOI: https://doi.org/10.1186/s12880-021-00671-8
  Research output: Contribution to Journal › Article (Published)
- COVID-19 symptoms at hospital admission vary with age and sex: results from the ISARIC prospective multinational observational study
  (17 pages)
  In: Infection, vol. 49, pp. 889-905
  DOI: https://doi.org/10.1007/s15010-021-01599-5
  Research output: Contribution to Journal › Article (Published)
- Classifying patient and professional voice in social media health posts
  (10 pages)
  In: BMC Medical Informatics and Decision Making, vol. 21, pp. 1-10
  DOI: https://doi.org/10.1186/s12911-021-01577-9
  Research output: Contribution to Journal › Article (Published)
- Towards Better Use of Ontological Structure in the Evaluation of Automated ICD Coding
  (5 pages)
  Research output: Contribution to Conference › Paper (Published)
- A systematic review of natural language processing applied to radiology reports
  In: BMC Medical Informatics and Decision Making, vol. 21
  DOI: https://doi.org/10.1186/s12911-021-01533-7
  Research output: Contribution to Journal › Article (Published)