

Peer learning companions such as interactive tablets and social robots have shown great promise in supporting language development in young children. However, studies have shown that the perceived credibility of a robot as an educator and peer companion is contingent on how socially it behaves. We focus on two roles of a peer learning companion: engaging storyteller and active listener. To this end, we develop models to predict whether the listener will lose attention (Listener Disengagement Prediction, LDP) and whether the robot should generate listener backchannels with high probability (Backchanneling Extent Prediction, BEP) during a specific time window. We formulate LDP and BEP as time series classification problems and, through extensive evaluation in multiple experimental settings, show that our models achieve promising results. Inspired by prior work, we also investigate socio-demographic and developmental features that may give rise to variations in children’s backchanneling responses. Moreover, we examine the features most responsible for the predictive utility of our models using Permutation Feature Importance and Partial Dependence Plots. Our findings suggest that features such as pupil dilation, blink rate, head acceleration, gaze direction, and some facial action units, which have not been considered in prior work, are in fact critical in predicting backchanneling extent and listener disengagement.
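
For readers unfamiliar with Permutation Feature Importance, the idea can be illustrated with a minimal sketch. It is not the implementation used in this work: the toy classifier, the synthetic data, and all names (`toy_model`, `blink_rate`, `unrelated`) are hypothetical and chosen only for illustration. The importance of a feature is estimated as the average drop in accuracy when that feature's values are shuffled across samples.

```python
import random


def accuracy(model, X, y):
    """Fraction of samples for which the model's prediction matches the label."""
    correct = sum(1 for row, label in zip(X, y) if model(row) == label)
    return correct / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical per-window features: [blink_rate, unrelated]. Synthetic data only.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
# Synthetic label: "disengaged" (1) when the first feature exceeds 0.5.
y = [int(row[0] > 0.5) for row in X]


def toy_model(row):
    """Stand-in classifier that looks only at the first feature."""
    return int(row[0] > 0.5)


importances = permutation_importance(toy_model, X, y)
for name, score in zip(["blink_rate", "unrelated"], importances):
    print(f"{name}: {score:.3f}")
```

In this toy setting the classifier ignores the second feature, so shuffling it leaves accuracy unchanged and its importance is zero, whereas shuffling the first feature degrades accuracy substantially.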