

Visual lifelogging is the process of keeping track of one's life through wearable cameras. The focus of this research is to automatically classify images captured by a wearable camera into indoor and outdoor scenes. The results of this classification may be used in several applications. For instance, one can quantify the time a person spends outdoors and indoors, which may give insights into that person's psychology. We use transfer learning from two VGG convolutional neural networks (CNNs), one pre-trained on the ImageNet data set and the other on the Places data set. We investigate two methods of combining features from the two pre-trained CNNs. We evaluate the performance on the new UBRug data set and the benchmark SUN397 data set and achieve accuracy rates of 98.24% and 97.06%, respectively. Features obtained from the ImageNet pre-trained CNN turned out to be more effective than those obtained from the Places pre-trained CNN. Fusing the feature vectors obtained from these two CNNs is an effective way to improve the classification. In particular, the accuracy that we achieve on the SUN397 data set exceeds the state of the art.
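
To make the idea of fusing feature vectors from two networks concrete, the following minimal sketch shows two common fusion strategies, concatenation and element-wise averaging. These two strategies are an assumption for illustration, since the abstract does not name the two combination methods that were investigated. The function names and the short toy vectors are hypothetical stand-ins for the much longer feature vectors that VGG networks produce.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse_concat(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Concatenation fusion: normalize each vector, then join them end to end."""
    return l2_normalize(a) + l2_normalize(b)


def fuse_average(a: Sequence[float], b: Sequence[float]) -> List[float]:
    """Averaging fusion: normalize each vector, then take the element-wise mean."""
    if len(a) != len(b):
        raise ValueError("averaging requires vectors of equal length")
    na, nb = l2_normalize(a), l2_normalize(b)
    return [(x + y) / 2.0 for x, y in zip(na, nb)]


# Toy stand-ins (assumed values) for features from the two pre-trained CNNs.
imagenet_features = [3.0, 4.0, 0.0]
places_features = [0.0, 0.0, 2.0]

fused_concat = fuse_concat(imagenet_features, places_features)
fused_avg = fuse_average(imagenet_features, places_features)

print(fused_concat)
print(fused_avg)
```

In this sketch the fused vector would then be passed to a binary indoor/outdoor classifier. Normalizing each vector before fusion is a design choice made here so that neither network dominates purely because its activations have a larger scale.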