This article presents a cross-lingual study of word-level segmentation of continuous speech in agglutinative, fixed-stress languages such as Hungarian and Finnish, based on the examination of suprasegmental parameters.
We developed several algorithms, following either a rule-based or a data-driven approach. The best results were obtained with the data-driven, HMM-based algorithms, which use the time series of fundamental frequency and energy together. This article describes the HMM-based method.
Word boundaries were marked with acceptable accuracy, although not all of them could be found. On the basis of this study, a word-level segmenter has been developed that indicates word boundaries with acceptable precision for both languages.
The evaluated method is easily adaptable to other fixed-stress languages.