
Multimodal Emotion Recognition Using Physiological Signals

dc.contributor.author: Dhothar, Mehakdeep Kaur
dc.contributor.supervisor: Bolić, Miodrag
dc.date.accessioned: 2025-11-19T19:37:06Z
dc.date.available: 2025-11-19T19:37:06Z
dc.date.issued: 2025-11-19
dc.description.abstract: Affective computing aims to develop systems capable of recognizing and interpreting human emotions, yet existing multimodal datasets frequently suffer from limitations such as poor signal quality, high inter-subject variability, and inconsistent evaluation protocols. To address these gaps, this thesis develops and validates a comprehensive framework for multimodal emotion recognition using physiological signals - Electrocardiogram (ECG), Electrodermal Activity (EDA), and Respiration (RSP) - augmented with speech-based representations. The goal was to establish standardized preprocessing workflows, rigorous signal quality assessment (SQA), and reproducible baseline experiments to support the development and technical validation of a large-scale physiological dataset.

This framework was applied to a dataset collected from 99 participants, containing synchronized physiological recordings, speech responses, and self-reported emotional annotations during exposure to validated video stimuli. To ensure data integrity, a rigorous SQA and artifact-removal pipeline was applied across modalities, integrating established ECG and respiration metrics with newly designed EDA-specific indicators.

Using this refined dataset, multiple emotion-classification experiments were conducted under a strict subject-independent evaluation protocol, comparing fixed 30-second windows with emotion-triggered temporal segments. Across all tasks - binary arousal, binary valence, and multiclass emotion recognition - trigger-based segments consistently produced clearer and more discriminative physiological patterns. Random Forest achieved the strongest overall performance, including 78.8% multiclass accuracy using physiological features alone.

To explore multimodal enhancement, speech embeddings were fused with handcrafted physiological features. This early-fusion approach led to substantial improvements across all tasks, most notably increasing multiclass accuracy from 78.8% to 97% when using trigger-based segments. These findings demonstrate that speech provides complementary affective information that enhances physiological representations. A subject-wise evaluation was also conducted to examine emotion separability across individuals and to identify video-specific misclassification patterns that reveal how different stimuli elicit varying physiological responses. Overall, this thesis delivers a validated multimodal dataset, reproducible processing pipelines, and strong baseline benchmarks that provide a solid foundation for future research in physiological and multimodal emotion recognition.
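As an illustrative aside, the subject-independent evaluation protocol and early-fusion strategy summarized in the abstract can be sketched as follows. This is a minimal, hypothetical reconstruction using scikit-learn and synthetic placeholder data, not the thesis's actual pipeline; the feature dimensions, number of subjects, class count, and Random Forest hyperparameters are all assumptions for demonstration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
n = 200  # total number of emotion-triggered segments (synthetic)

# Placeholder stand-ins for the real features described in the abstract:
physio = rng.normal(size=(n, 12))    # handcrafted ECG/EDA/RSP features
speech = rng.normal(size=(n, 32))    # speech embeddings
labels = rng.integers(0, 4, size=n)  # multiclass emotion labels
subjects = np.repeat(np.arange(10), n // 10)  # subject ID per segment

# Early fusion: concatenate modality features before classification.
fused = np.concatenate([physio, speech], axis=1)

# Subject-independent evaluation: every test fold contains segments
# from a subject never seen during training.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, fused, labels,
                         groups=subjects, cv=LeaveOneGroupOut())
print(scores.mean())
```

With synthetic random features the accuracy is near chance; the point of the sketch is the protocol shape (grouped splits plus feature-level concatenation), not the numbers.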
dc.identifier.uri: http://hdl.handle.net/10393/51064
dc.identifier.uri: https://doi.org/10.20381/ruor-31529
dc.language.iso: en
dc.publisher: Université d'Ottawa / University of Ottawa
dc.rights: Attribution-NonCommercial 4.0 International
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/
dc.subject: Machine Learning
dc.subject: Affect Recognition
dc.subject: Multimodal Machine Learning
dc.subject: Affective Computing
dc.subject: Emotion Recognition
dc.title: Multimodal Emotion Recognition Using Physiological Signals
dc.type: Thesis
thesis.degree.discipline: Génie / Engineering
thesis.degree.level: Masters
thesis.degree.name: MASc
uottawa.department: Science informatique et génie électrique / Electrical Engineering and Computer Science

Files

Original bundle

Name: Dhothar_Mehakdeep_Kaur_2025_thesis.pdf
Size: 9.41 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.65 KB
Format: Item-specific license agreed upon at submission