Emotions are reactions that can be expressed through a variety of social signals. For example, anger can be expressed through a scowl, narrowed eyes, a long stare, or many other expressions. This complexity is problematic when attempting to recognize a human's expression in a human-robot interaction: categorical emotion models used in HRI typically use only a few prototypical classes and do not cover the wide array of expressions in the wild. We propose a data-driven method to increase the number of known emotion classes present in human-robot interactions to 28 classes or more. The method combines automatic segmentation of video streams into short (<10 s) clips with annotation using the large set of widely understood emojis as categories. We then investigate the meaning behind these emojis by studying how humans perceive them. We present our results as a taxonomy that maps each emoji to the different meanings people perceived in it. Researchers can use this framework for social signal analysis to capture which social signals occur in the wild. Furthermore, researchers can use emojis as social signal labels for training machine learning models, toward more accurate recognition of human emotion by robots.
Copyright is held by the author(s).
This thesis may be printed or downloaded for non-commercial research and scholarly purposes.
Thesis advisor: Lim, Angelica