Computational speech group

Main research areas: digital speech processing, machine learning, speech science

Current research topics: automatic speaker verification (ASV), spoofing and countermeasures for ASV, spoken language identification, voice conversion (VC)

The main topic of our group is speaker recognition: the task of connecting a speech sample to an identity (who is speaking?). We work on improving both the robustness of such systems (improved accuracy under varied channel, noise and other disturbances) and their security (detection of presentation attacks, also known as spoofing attacks). Our research group not only takes part regularly in the technology evaluations in our field, but also contributes to open data science as a co-organizer of public evaluation benchmarks (including ASVspoof and the VC challenge) and through the collection of other data, such as the AVOID corpus.

In our view, understanding the limits of ASV requires a deep understanding of the attacks as well; we therefore also work on related problems such as voice conversion (conversion of speaker identity) and disguise (avoiding being identified as oneself). We believe in keeping an open mind for new, unexpected research directions, using multidisciplinary research to address fundamental problems in the computer processing of speech. One should not build only data-driven black boxes, but aim at understanding what characterizes the speaker, language and 'spoof' cues relevant for machine and human observers. Besides our core focus on computational speech processing methods that rely on machine learning and statistics, we also apply perceptual and acoustic methods in our research. As most of the problems within the speech field are beyond the reach of any single individual or research group, we also believe in the importance of research collaboration that extends across borders and continents.

Interested in collaborating or working with us? Feel free to send an e-mail to Associate Professor Tomi Kinnunen (tomi.kinnunen {at] to discuss further.