School of Informatics

Paper by Sharon Goldwater and former Informatics student receives ISCA Award

Sharon Goldwater and her former PhD student, Herman Kamper, have been awarded the ISCA Award for the Best Research Paper published in Computer Speech and Language for their paper: 'A segmental framework for fully-unsupervised large-vocabulary speech recognition'.

Sharon Goldwater, Professor in the Institute for Language, Cognition and Computation at the School of Informatics

Each year, the International Speech Communication Association (ISCA) gives three best student paper awards at INTERSPEECH, based on anonymous reviewing and presentation at the conference. Each paper is awarded 250 euros, to be split between the student authors. During INTERSPEECH, ISCA also announces the best papers published in the journals Speech Communication and Computer Speech and Language.

Herman Kamper and Sharon Goldwater (School of Informatics, University of Edinburgh), alongside Aren Jansen (Google, Inc.), co-authored the winning article for this year's ISCA Award for the Best Research Paper published in Computer Speech and Language (2016-2020).

A segmental framework for fully-unsupervised large-vocabulary speech recognition

Zero-resource speech technology is a growing research area that aims to develop methods for speech processing in the absence of transcriptions, lexicons, or language modelling text. Early term discovery systems focused on identifying isolated recurring patterns in a corpus, while more recent full-coverage systems attempt to completely segment and cluster the audio into word-like units, effectively performing unsupervised speech recognition.

The article presents a framework for applying such a system to large-vocabulary multi-speaker data. The proposed system uses a Bayesian modelling framework with segmental word representations: each word segment is represented as a fixed-dimensional acoustic embedding obtained by mapping the sequence of feature frames to a single embedding vector. The article compares the system with state-of-the-art baselines on English and Xitsonga datasets, using a variety of measures including word error rate.
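
As a rough illustration of what "mapping the sequence of feature frames to a single embedding vector" can mean, the sketch below uses uniform downsampling: it picks a fixed number of evenly spaced frames from a segment and concatenates them. This is one simple choice, shown as an assumption rather than as a description of the article's exact method. The function name `downsample_embedding`, the 13-dimensional features and the choice of 10 samples are all hypothetical.

```python
import numpy as np


def downsample_embedding(frames: np.ndarray, n_samples: int = 10) -> np.ndarray:
    """Map a (n_frames, n_features) segment to a fixed-length vector.

    Hypothetical helper: picks n_samples frames at evenly spaced positions
    and concatenates them, so every segment yields a vector of length
    n_samples * n_features regardless of its duration.
    """
    if frames.ndim != 2 or frames.shape[0] == 0:
        raise ValueError("frames must be a non-empty 2-D array")
    n_frames = frames.shape[0]
    # Evenly spaced frame indices; segments shorter than n_samples repeat frames.
    idx = np.linspace(0, n_frames - 1, n_samples).round().astype(int)
    return frames[idx].reshape(-1)


# Two made-up segments of different durations (random stand-ins for real features).
rng = np.random.default_rng(0)
short_segment = rng.normal(size=(23, 13))
long_segment = rng.normal(size=(87, 13))

short_vec = downsample_embedding(short_segment)
long_vec = downsample_embedding(long_segment)
print(short_vec.shape, long_vec.shape)
```

Because both segments end up in the same vector space, they can be compared or clustered directly, which is the property a segmental word representation relies on.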

Related links

Paper abstract and download

Sharon Goldwater's personal page