

To improve the estimation of emotion in speech, we propose three new approaches. The first approach is to use more synthetic speech samples than in our previous work; we label the emotion of each sample based on human evaluation and use these samples to train classifiers. The second approach is to add statistical features to our previous feature set: the quartiles, the range, the interquartile range, the upper and lower halves of the interquartile range, and the coefficient of the regression formula. We assume that these values offer new viewpoints on the speech features. The third approach is to use phonemic and syllabic features to estimate emotion in speech. In this paper, a phonemic feature is a feature obtained by frequency analysis from each phoneme in an utterance, and a syllabic feature is a feature obtained in the same way from each syllable; we use speech recognition to extract phonemes from an utterance and then group the phonemes into syllables. Experimental results show that phonemic and syllabic features are more useful than the fundamental frequency and power for estimating anger, disgust, fear, and sadness. The results also show that the additional statistical features contribute little to emotion estimation; we need to analyze the classifiers to evaluate the contribution of these statistics. As future work, we plan to combine the fundamental frequency and power with the phonemic and syllabic features, to revise our approach based on an analysis of the experimental results, and to apply our approach in real time.
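
As a rough illustration of the additional statistical features named above, the following is a minimal Python sketch that computes them over a per-frame contour (for example, fundamental frequency values of one utterance). The function name, the use of numpy, the choice of contour, and the reading of "the coefficient of the regression formula" as the slope of a line fitted over frame indices are assumptions made for illustration, not necessarily the exact computation used in our experiments.

```python
import numpy as np

def additional_statistics(contour):
    """Compute the additional statistical features for one contour.

    `contour` is assumed to be a 1-D array of per-frame values
    (e.g., fundamental frequency or power) for a single utterance.
    """
    x = np.asarray(contour, dtype=float)

    # Quartiles of the contour.
    q1, q2, q3 = np.percentile(x, [25, 50, 75])

    # Range and interquartile range.
    value_range = x.max() - x.min()
    iqr = q3 - q1

    # Upper and lower halves of the interquartile range.
    upper_half_iqr = q3 - q2
    lower_half_iqr = q2 - q1

    # Slope of a linear regression over frame indices, assumed here
    # to correspond to "the coefficient of the regression formula".
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, 1)

    return {
        "q1": q1, "q2": q2, "q3": q3,
        "range": value_range,
        "iqr": iqr,
        "upper_half_iqr": upper_half_iqr,
        "lower_half_iqr": lower_half_iqr,
        "regression_coefficient": slope,
    }
```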