Multimodal Emotion Recognition Using Temporal Convolutional Networks
Publisher
Université d'Ottawa / University of Ottawa
Abstract
Over the past decade, the field of affective computing has received increasing attention. With advancements in machine learning, a wide range of methodologies has been developed to better understand human emotions. However, one of the major challenges in this field is accurately modeling emotions along a set of continuous dimensions, such as arousal and valence. This type of modeling is essential for representing complex and subtle emotions and for capturing the full spectrum of human emotional experience. Predicting how emotions evolve over a time series adds another layer of complexity, as emotional states can shift continuously.
Our work addresses these challenges using a dataset that includes natural and spontaneous emotions from diverse individuals. We extract multiple features from different modalities, including audio, video, and text, and use them to predict emotions across three axes: arousal, valence, and liking. To achieve this, we employ deep features and multiple fusion techniques to combine the modalities. Our results demonstrate that temporal convolutional networks outperform long short-term memory models in multimodal emotion prediction.
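The abstract contrasts temporal convolutional networks with LSTMs for sequence prediction. The defining operation in a TCN is a causal, dilated 1-D convolution, in which the output at time t depends only on inputs at or before t, and dilation widens the receptive field without extra parameters. The following is a minimal generic sketch of that operation in NumPy; it is an illustration of the standard technique, not the thesis's actual implementation, and the function name and weights are hypothetical:

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """Causal dilated 1-D convolution: y[t] = sum_i w[i] * x[t - i*dilation].

    Left-pads the input with zeros so each output depends only on
    current and past samples (no leakage from the future).
    """
    k = len(w)
    pad = (k - 1) * dilation                      # amount of left padding
    xp = np.concatenate([np.zeros(pad), x])       # causal zero-padding
    return np.array([
        sum(w[i] * xp[t + pad - i * dilation] for i in range(k))
        for t in range(len(x))
    ])

# A TCN stacks such layers with dilations 1, 2, 4, ... so the receptive
# field grows exponentially with depth, letting the network model long
# emotional trajectories without recurrence.
```

With a kernel of two ones and dilation 1, a constant input of ones yields 1 at the first step (only one valid past sample) and 2 thereafter, showing the causal boundary behavior.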
Overall, our research advances the field of affective computing by developing more accurate and comprehensive methods for modeling and predicting human emotions.
Keywords
machine learning, temporal convolutional networks, neural networks, emotion recognition, artificial intelligence
