Multimodal Emotion Recognition Using Physiological Signals
| dc.contributor.author | Dhothar, Mehakdeep Kaur | |
| dc.contributor.supervisor | Bolić, Miodrag | |
| dc.date.accessioned | 2025-11-19T19:37:06Z | |
| dc.date.available | 2025-11-19T19:37:06Z | |
| dc.date.issued | 2025-11-19 | |
| dc.description.abstract | Affective computing aims to develop systems capable of recognizing and interpreting human emotions, yet existing multimodal datasets frequently suffer from limitations such as poor signal quality, high inter-subject variability, and inconsistent evaluation protocols. To address these gaps, this thesis develops and validates a comprehensive framework for multimodal emotion recognition using three physiological signals, Electrocardiogram (ECG), Electrodermal Activity (EDA), and Respiration (RSP), augmented with speech-based representations. The goal was to establish standardized preprocessing workflows, rigorous signal quality assessment (SQA), and reproducible baseline experiments to support the development and technical validation of a large-scale physiological dataset. This framework was applied to a dataset collected from 99 participants, containing synchronized physiological recordings, speech responses, and self-reported emotional annotations during exposure to validated video stimuli. To ensure data integrity, a rigorous SQA and artifact-removal pipeline was applied across modalities, integrating established ECG and respiration metrics with newly designed EDA-specific indicators. Using this refined dataset, multiple emotion-classification experiments were conducted under a strict subject-independent evaluation protocol, comparing fixed 30-second windows with emotion-triggered temporal segments. Across all tasks (binary arousal, binary valence, and multiclass emotion recognition), trigger-based segments consistently produced clearer and more discriminative physiological patterns. Random Forest achieved the strongest overall performance, including 78.8% multiclass accuracy using physiological features alone. To explore multimodal enhancement, speech embeddings were fused with handcrafted physiological features. This early-fusion approach led to substantial improvements across all tasks, most notably increasing multiclass accuracy from 78.8% to 97% on trigger-based segments. These findings demonstrate that speech provides complementary affective information that enhances physiological representations. A subject-wise evaluation was also conducted to examine emotion separability across individuals and to identify video-specific misclassification patterns that reveal how different stimuli elicit varying physiological responses. Overall, this thesis delivers a validated multimodal dataset, reproducible processing pipelines, and strong baseline benchmarks that provide a solid foundation for future research in physiological and multimodal emotion recognition. | |
| dc.identifier.uri | http://hdl.handle.net/10393/51064 | |
| dc.identifier.uri | https://doi.org/10.20381/ruor-31529 | |
| dc.language.iso | en | |
| dc.publisher | Université d'Ottawa / University of Ottawa | |
| dc.rights | Attribution-NonCommercial 4.0 International | en |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc/4.0/ | |
| dc.subject | Machine Learning | |
| dc.subject | Affect Recognition | |
| dc.subject | Multimodal Machine Learning | |
| dc.subject | Affective Computing | |
| dc.subject | Emotion Recognition | |
| dc.title | Multimodal Emotion Recognition Using Physiological Signals | |
| dc.type | Thesis | en |
| thesis.degree.discipline | Génie / Engineering | |
| thesis.degree.level | Masters | |
| thesis.degree.name | MASc | |
| uottawa.department | Science informatique et génie électrique / Electrical Engineering and Computer Science |
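
The abstract describes early fusion of speech embeddings with handcrafted physiological features, evaluated under a subject-independent protocol with a Random Forest classifier. A minimal sketch of that pipeline shape is given below, using entirely synthetic data; the feature dimensions, the five-fold grouped split, and the forest hyperparameters are illustrative assumptions, not the thesis's actual configuration.

```python
# Sketch: early fusion + subject-independent evaluation with a Random Forest.
# All data is synthetic; dimensions and parameters are assumptions for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)

n_subjects, segs_per_subject = 10, 12
n_seg = n_subjects * segs_per_subject
subjects = np.repeat(np.arange(n_subjects), segs_per_subject)  # segment -> subject id

# Handcrafted physiological features per segment (e.g. ECG/EDA/RSP statistics).
phys = rng.normal(size=(n_seg, 20))
# Speech embedding per segment (64-d is an assumed dimension).
speech = rng.normal(size=(n_seg, 64))

# Early fusion: concatenate the feature vectors before classification.
fused = np.hstack([phys, speech])

# Synthetic 4-class emotion labels.
y = rng.integers(0, 4, size=n_seg)

# Subject-independent protocol: GroupKFold ensures no subject's segments
# appear in both the training and the test fold.
cv = GroupKFold(n_splits=5)
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, fused, y, cv=cv, groups=subjects)
print(f"Mean subject-independent accuracy: {scores.mean():.3f}")
```

With random labels the accuracy hovers near chance; the point of the sketch is the data flow (fuse, then split by subject, then classify), not the numbers.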
