The Emutivo research has leveraged both externally and internally collected datasets. Each dataset is described below, along with any papers our team has published on it. The git repository links following each paper contain the feature datasets, model code, results, and/or additional visualizations for that paper. All internal datasets were collected under WPI IRB 00007374 File 18-0031, approved 23 October 2017.



The Student Suicidal Ideation and Depression Detection (StudentSADD) dataset was collected between August 2020 and January 2021 by the 2020 REU team and the 2020-2021 MQP team, advised by ML Tlachac and Prof. Rundensteiner with assistance from Ermal Toto. The StudentSADD dataset contains text prompts, voice recordings, retrospective smartphone data, and Twitter data labeled with demographics and PHQ-9 scores from over 300 college student participants.



The Early Mental Health Uncovering (EMU) framework is for mental illness screening with active and passive modalities. The EMU dataset was collected by the 2019-2020 MQP team advised by Ermal Toto, ML Tlachac, and Prof. Rundensteiner. This dataset contains voice recordings, retrospective smartphone data, and Twitter data labeled with PHQ-9 and GAD-7 scores from over 60 crowd-sourced participants.



The Mood Assessment Capable (Moodable) framework is for depression assessment with retrospectively harvested smartphone and social media data. The Moodable dataset was the first dataset collected by a team at WPI, namely the 2017-2018 MQP team advised by Ermal Toto, Prof. Agu, and Prof. Rundensteiner. It contains retrospectively harvested smartphone and social media data labeled with PHQ-9 scores from over 300 crowd-sourced participants.



The external Wizard-of-Oz subset of the Distress Analysis Interview Corpus contains clinical interviews with mental illness labels.  We have leveraged the audio component of these interviews to screen for depression.  



The external StudentLife dataset contains sensor and survey data from 48 students. We have leveraged the GPS component of this dataset to screen for depression.



This project started by performing emotion detection. Emotivo is our name for the combination of the external Surrey Audio-Visual Expressed Emotion (SAVEE) database, the external RML Emotion Database, and the external Berlin Database of Emotional Speech. RML used six basic emotions (anger, disgust, fear, happiness, sadness, and surprise), while SAVEE and Berlin added a neutral state.