Preface
Human faces, as well as human body motion and gestures, convey a wide range of information about a person's identity, race, sex, age, and emotional state. In this monograph, we study the perception of emotion, primarily from facial expressions and secondarily from body motion and gestures. Our aim is to develop a fully automated visual affect recognition system, which could support novel and future modes of human-computer interaction that incorporate user affect recognition.

Our studies begin with a survey of the literature on emotion perception from a scientific (psychological and medical) point of view. This survey leads us to the following conclusions: (1) a number of brain regions play a significant role in emotion perception and expression; (2) there are six ‘basic emotions’ that arise very commonly, namely ‘anger’, ‘disgust’, ‘fear’, ‘happiness’, ‘sadness’, and ‘surprise’; and (3) there is cultural specificity in emotion perception and expression. The last conclusion is further corroborated by two empirical studies that we conducted with human participants, in which the participants were shown face images and asked to classify the emotions expressed. The differences in correct classification rates across participant groups indicate that there is cultural specificity in the ways people express and recognize emotions. Moreover, these empirical studies allowed us to identify the emotion classes that occur during a typical human-computer interaction session, namely ‘happiness’, ‘sadness’, ‘surprise’, ‘anger’, ‘disgust’, and ‘boredom-sleepiness’, as well as the emotionless state referred to as ‘neutral’.

Towards building our visual affect recognition system, we constructed our own face image database. It consists of two sets of face images, all captured in both front and side view: (1) low-quality images acquired with web cameras and (2) high-quality images acquired with high-resolution digital cameras.

On the basis of these studies, we developed a visual affect recognition system that consists of two modules: (1) a face detection subsystem and (2) a facial expression recognition subsystem. The face detection subsystem uses neural network-based classifiers. For the facial expression recognition subsystem, we considered neural network-based and other classifiers, but concluded that Support Vector Machine-based classifiers yield better results. Details of the system, such as feature extraction and classifier design, are presented and analyzed, along with extensive performance evaluations and test results. Current research directions in visual affect recognition through analysis of human body motion and gestures are also discussed.
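To give a flavor of the kind of two-stage design described above (features extracted from detected face regions, followed by a Support Vector Machine classifier over the identified emotion classes), the following is a minimal sketch in Python. It assumes scikit-learn and uses randomly generated placeholder feature vectors; the feature dimension, the SVC parameters, and all names are illustrative assumptions and do not reflect the actual feature extraction or classifier configuration developed in this monograph.

# Minimal sketch: SVM-based facial expression classification over pre-extracted
# feature vectors. Emotion classes follow the preface; data and parameters are
# placeholders for illustration only.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

EMOTIONS = ["happiness", "sadness", "surprise", "anger",
            "disgust", "boredom-sleepiness", "neutral"]
FEATURE_DIM = 40  # assumed length of a facial feature vector (e.g. landmark distances)

# Placeholder data standing in for features extracted from detected face regions.
rng = np.random.default_rng(0)
X = rng.normal(size=(700, FEATURE_DIM))
y = rng.integers(0, len(EMOTIONS), size=700)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Feature scaling followed by an RBF-kernel Support Vector Machine classifier.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)

print("held-out accuracy:", clf.score(X_test, y_test))
print("predicted class:", EMOTIONS[clf.predict(X_test[:1])[0]])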