Transcription System for Semi-Spontaneous Estonian Speech

Alumäe, Tanel

doi:10.3233/978-1-61499-133-5-10

IOS Press Ebooks

Authors

Tanel Alumäe

Pages

10 - 17

DOI

10.3233/978-1-61499-133-5-10

Series

Frontiers in Artificial Intelligence and Applications

Ebook

Volume 247: Human Language Technologies – The Baltic Perspective

Abstract

This paper describes a speech-to-text system for semi-spontaneous Estonian speech. The system is trained on about 100 hours of manually transcribed speech and a 300M-word text corpus. Compound words are split before building the language model and reconstructed from the recognizer output using a hidden-event N-gram model. We use a three-pass transcription strategy with unsupervised speaker adaptation between the passes. The system achieves a word error rate of 34.6% on conference speeches and 25.6% on radio talk shows.
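The word error rates quoted in the abstract are conventionally defined as the word-level edit distance (substitutions, deletions and insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, assuming whitespace tokenization and no text normalization. The function name `word_error_rate` is illustrative and does not come from the paper, and the abstract does not say which scoring tool was used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared verbatim
    (no case folding or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a split or wrongly joined compound counts as at least one word error, which is one reason compound reconstruction matters when scoring Estonian output.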
