Affective Computing Through Time-series Representation Learning

Abstract

Affective computing is a vital area of research with potential applications across health and well-being, education, the workplace, and user experience, allowing systems to recognize, incorporate, and respond to human emotions. Despite recent advances, several challenges remain in accurately capturing and processing affective states from various forms of time-series data. In this thesis, we address three major problems in this area. First, in Speech Emotion Recognition (SER), accurately distinguishing the linguistic and prosodic components of emotions is challenging. Existing methods often require explicit text transcription, leading to computational overhead and potential errors. To address this, we introduce EmoDistill, a cross-modal knowledge distillation framework that captures both linguistic and prosodic aspects of emotions from speech without needing text transcription, thereby enhancing SER accuracy and efficiency. Second, the Photoplethysmogram (PPG) signals used for continuous wearable monitoring lack the detailed information provided by the Electrocardiogram (ECG), limiting predictive power in affective state detection. To overcome this, we present the Region Disentangled Diffusion Model (RDDM), a novel diffusion model for high-fidelity PPG-to-ECG translation. RDDM generates detailed ECG signals efficiently, providing more precise physiological data for enhanced affect recognition. Third, current affect recognition systems often overlook the influence of external factors, such as sleep-related measures, on mood, which reduces their accuracy and robustness. To tackle this, we propose NapTune, an efficient tuning framework that integrates the previous night's sleep-related measures with wearable time-series data. NapTune significantly improves mood classification accuracy by incorporating this additional modality.

Keywords

Affective Computing, Deep Learning, Photoplethysmogram, Electrocardiogram, Wearables, Diffusion Models, Speech, Sleep

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution-ShareAlike 4.0 International