Authors:
(1) Krist Shingjergji, Educational Sciences, Open University of the Netherlands, Heerlen, The Netherlands (krist.shingjergji@ou.nl);
(2) Deniz Iren, Center for Actionable Research, Open University of the Netherlands, Heerlen, The Netherlands (deniz.iren@ou.nl);
(3) Felix Bottger, Center for Actionable Research, Open University of the Netherlands, Heerlen, The Netherlands;
(4) Corrie Urlings, Educational Sciences, Open University of the Netherlands, Heerlen, The Netherlands;
(5) Roland Klemke, Educational Sciences, Open University of the Netherlands, Heerlen, The Netherlands.
Editor's note: This is Part 1 of 6 of a study detailing the development of a gamified method of acquiring annotated facial emotion data.
Training facial emotion recognition models requires large sets of data and costly annotation processes. To alleviate this problem, we developed a gamified method of acquiring annotated facial emotion data without an explicit labeling effort by humans. The game, which we named Facegame, challenges the players to imitate a displayed image of a face that portrays a particular basic emotion. Every round played creates new data that consists of a set of facial features and landmarks, already annotated with the emotion label of the target facial expression. Such an approach effectively creates a robust, sustainable, and continuous machine learning training process. We evaluated Facegame with an experiment that revealed several contributions to the field of affective computing. First, the gamified data collection approach allowed us to access a rich variation of facial expressions of each basic emotion due to the natural variations in the players’ facial expressions and their expressive abilities. We report improved accuracy when the collected data were used to enrich well-known in-the-wild facial emotion datasets and subsequently used for training facial emotion recognition models. Second, the natural language prescription method used by Facegame constitutes a novel approach for interpretable explainability that can be applied to any facial emotion recognition model. Finally, we observed significant improvements in the facial emotion perception and expression skills of the players through repeated game play.
Index Terms—Affective computing, facial emotion recognition, gamification, explainable AI, interpretable machine learning
Facial expressions are essential to non-verbal human communication as they provide a means of conveying information regarding the emotional state [1] as well as the behavioral intentions [2] of the individual. Emotions are fundamental components of social interaction [3], and the ability to express and perceive emotions is an invaluable asset for building social connections. The holy grail of affective computing is to empower computer systems with the ability to perceive and express emotions, and to form social ties with human users [4]. Until very recently, this ability had been considered unique to humans. However, especially with the recent advances in Artificial Intelligence (AI), many studies have focused on the automated recognition of emotions [5].
The common approach of training machine learning models for facial emotion recognition (FER) is supervised learning, which requires large sets of data [6]. Specifically, deep FER models are challenged by a lack of sufficient data for training [7]. Collecting and curating such large datasets is a costly and time-consuming endeavour since labeling by human annotators is necessary [8]. This poses an obstacle to achieving significant performance improvements in emotion recognition research.
Another major challenge lies in the explainability and interpretability of emotion recognition models. Studies mostly evaluate emotion recognition models using accuracy and confusion matrices; however, these metrics often fall short in reporting the utility of the models for humans. Interpretable models should provide explanations that are simple enough to be understood by their users, and are given in a language that is meaningful to them [9]. The explainability of emotion recognition models has very rarely been addressed in the literature. The approaches to achieve explainability are limited to model-agnostic methods that explain the output of the model based on the inputs, and model-transparent methods (e.g., [10], [11]) that highlight the activation in different layers of artificial neural networks [12]. However, neither approach necessarily provides human-friendly explanations that are interpretable by their users.
The challenges of collecting and curating large amounts of labeled data for training FER models, and of obtaining interpretable explanations from such models, call for unconventional methods. In this study, we propose a gamification approach towards the collection of annotated facial emotion data. The proposed game, named Facegame, embodies a method for providing natural language prescriptions as feedback to the players, effectively serving as a means of achieving interpretable explainability. In summary, our contributions to the field of affective computing are as follows:
• We present a gamification approach for rapidly collecting annotated facial emotion data that is rich in variety of facial expressions, in a low-cost, low-effort manner.
• We propose a novel approach for interpretable explainability by translating the intermediary facial features into natural language prescriptions, and providing them as an explanation for the emotion classifications provided by any FER model.
• The presented gamification approach leads to significant improvements in the facial emotion perception and expression skills of the players.
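To make the first two contributions concrete, the following is a minimal sketch of how a round of play could yield a labeled sample and a set of natural language prescriptions. It is an illustration under stated assumptions, not the Facegame implementation: the feature names (loosely modeled on facial action units), the phrasings, the 0.2 threshold, and the names `RoundRecord` and `prescribe` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical feature names and phrasings, loosely modeled on facial action
# units. The wording and features used by Facegame itself are not given here.
INSTRUCTIONS = {
    "inner_brow_raiser": ("raise your inner eyebrows", "relax your inner eyebrows"),
    "brow_lowerer": ("lower your eyebrows", "relax your eyebrows"),
    "cheek_raiser": ("raise your cheeks", "relax your cheeks"),
    "lip_corner_puller": ("pull the corners of your lips up",
                          "relax the corners of your lips"),
    "jaw_drop": ("open your mouth", "close your mouth"),
}


@dataclass
class RoundRecord:
    """One round of play: the player's features, labeled with the target emotion."""
    emotion_label: str
    features: dict[str, float]


def prescribe(target, player, threshold=0.2):
    """Return one instruction per feature that differs from the target.

    A feature is reported only if its intensity differs from the target by
    more than `threshold` (an assumed value) and a phrasing exists for it.
    """
    prescriptions = []
    for name, target_value in target.items():
        diff = target_value - player.get(name, 0.0)
        if abs(diff) <= threshold or name not in INSTRUCTIONS:
            continue
        more, less = INSTRUCTIONS[name]
        prescriptions.append(more if diff > 0 else less)
    return prescriptions


# Assumed intensities in [0, 1] for a target "happiness" face and a player attempt.
target_happy = {"cheek_raiser": 0.8, "lip_corner_puller": 0.9, "brow_lowerer": 0.0}
player = {"cheek_raiser": 0.7, "lip_corner_puller": 0.3, "brow_lowerer": 0.4}

# The sample is annotated with the target emotion without a human labeling step.
record = RoundRecord(emotion_label="happiness", features=player)

feedback = prescribe(target_happy, player)
print(feedback)  # ['pull the corners of your lips up', 'relax your eyebrows']
```

In this sketch the explanation is phrased in terms the player can act on, rather than as model internals, which is the sense in which the prescriptions are meant to be interpretable.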
The remainder of this paper is structured as follows. Section II provides a literature review on emotion recognition, explainable AI, and gamified data collection. Section III describes the core contributions. Section IV presents the details of our experimental study. Section V discloses the results of our experiments. Finally, Section VI provides a discussion on the theoretical and practical implications of our contributions, and concludes the paper.
This paper is available on arXiv under the CC BY 4.0 DEED license.