Statistical Parametric Speech Synthesis for Online Dictionaries – Problems and Solutions

Piits, Liisi; Kudritski, Elgar; Kiissel, Indrek; Hein, Indrek

doi:10.3233/978-1-61499-442-8-27

Abstract

This paper describes an attempt to use Estonian statistical parametric speech synthesis to provide audio pronunciation of words and word forms in online dictionaries. Two new HTS voices were created and compared for this purpose. The paper gives an overview of the design and evaluation process for these voices. Several types of error were detected, including quantity errors, poor sound quality, accent errors, and gemination at the boundary of compound word components. The level of correctness and sound quality for the two parametric speech synthesisers ranged from 69% to 76%. The paper demonstrates that the voice Eva-2, which can accept text with diacritics as input, produces fewer errors. Still, the error rate of both new voices is too high to meet the criteria of orthoepy in learner's dictionaries.
