Speech-Based Identification of Children's Gender and Age with Neural Networks

Piel, Leo Kristopher; Alumäe, Tanel

doi:10.3233/978-1-61499-912-6-104

Abstract

In this paper, we investigate different types of neural networks for age and gender identification from children's speech, based on the Corpus of Estonian Adolescent Speech. Feed-forward deep neural networks using i-vectors as input are compared with recurrent neural networks using MFCCs as input. Results show that feed-forward neural networks outperform recurrent neural networks for gender classification, while a model that combines i-vectors and MFCCs via feed-forward and recurrent branches achieves the best performance for age group classification. We also show that for age group classification, it is beneficial to first identify gender and then use a gender-specific age identification model. Experiments with human listeners show that the neural network models outperform humans on both tasks by a large margin.
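
The two-stage idea described above (identify gender first, then apply a gender-specific age model) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy threshold rules, and the "younger"/"older" labels are hypothetical placeholders that stand in for trained neural networks and for the age groups used in the paper.

```python
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
Classifier = Callable[[Features], str]


def make_cascade(
    gender_model: Classifier,
    age_models: Dict[str, Classifier],
) -> Callable[[Features], Tuple[str, str]]:
    """Build a predictor that routes each input to a gender-specific age model."""

    def predict(features: Features) -> Tuple[str, str]:
        # Stage 1: predict gender from the utterance features.
        gender = gender_model(features)
        # Stage 2: pick the age model that matches the predicted gender.
        age_group = age_models[gender](features)
        return gender, age_group

    return predict


# Hypothetical stand-ins for trained models (assumption: simple thresholds
# on a two-element feature vector, chosen only to make the sketch runnable).
def toy_gender(features: Features) -> str:
    return "female" if features[0] > 0.0 else "male"


def toy_age_female(features: Features) -> str:
    return "older" if features[1] > 0.5 else "younger"


def toy_age_male(features: Features) -> str:
    return "older" if features[1] > 0.2 else "younger"


predict = make_cascade(
    toy_gender,
    {"female": toy_age_female, "male": toy_age_male},
)

print(predict([1.0, 0.3]))   # ('female', 'younger')
print(predict([-1.0, 0.3]))  # ('male', 'older')
```

In this sketch the same second feature value yields different age groups depending on the predicted gender, which is the point of routing through gender-specific age models. An error in the first stage carries over to the second, so the approach depends on the gender classifier being accurate.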
