
Top 8 Datasets Available For Emotion Detection

Emotion detection enables machines to recognise human emotions. The broader capability of detecting, expressing and understanding emotions is known as emotional intelligence. To build machine learning models that can understand and detect emotions, the first and foremost requirement is a suitable dataset.

Below, we list the top eight datasets, in no particular order, that are available for emotion detection.



1| AffectNet

AffectNet is one of the largest datasets for facial affect in still images, covering both categorical and dimensional models of emotion. It was collected by querying 1,250 emotion-related tags in six languages: English, German, Spanish, Portuguese, Arabic and Farsi. The dataset contains more than one million images with faces and extracted facial landmark points.
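To make the two annotation models concrete: a categorical annotation assigns one discrete emotion label per face, while a dimensional annotation places the face on continuous valence/arousal axes. The record type and field names below are hypothetical, a minimal sketch of how the two schemes can sit side by side:

```python
from dataclasses import dataclass

@dataclass
class AffectAnnotation:
    """Hypothetical per-image annotation combining AffectNet's
    categorical and dimensional emotion models."""
    image_path: str
    expression: str   # categorical model: a discrete label, e.g. "happy"
    valence: float    # dimensional model: pleasantness, roughly -1..1
    arousal: float    # dimensional model: intensity, roughly -1..1

# Example record for a single annotated face image
ann = AffectAnnotation("img_0001.jpg", "happy", 0.7, 0.4)
```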

Get the dataset here.

2| Ascertain 

Ascertain is a multimodal database for impliCit pERsonaliTy and Affect recognitIoN that can be used for detecting personality traits and emotional states via physiological responses. The dataset contains Big Five personality scales and emotional self-ratings of 58 users, along with synchronously recorded electroencephalogram (EEG), electrocardiogram (ECG), galvanic skin response (GSR) and facial activity data, captured with off-the-shelf sensors while the users viewed affective movie clips.

Get the dataset here.

3| Dreamer

Dreamer is a multi-modal database consisting of electroencephalogram (EEG) and electrocardiogram (ECG) signals recorded during affect elicitation by means of audio-visual stimuli. In this dataset, signals from 23 participants were recorded along with the participants’ self-assessment of their affective state after each stimulus, in terms of valence, arousal, and dominance.

Get the dataset here.

4| Extended Cohn-Kanade Dataset (CK+)

The Extended Cohn-Kanade Dataset (CK+) is a public benchmark dataset for action units and emotion recognition. It comprises a total of 5,876 labelled images of 123 individuals, with each sequence ranging from a neutral face to the peak expression. Images in CK+ are posed against similar backgrounds, mostly grayscale, at 640×490 pixels.

Get the dataset here.


5| EMOTIC

EMOTIC, or EMOTIon recognition in Context, is a database of images of people in real environments, annotated with their apparent emotions. It combines two types of emotion representation: a set of 26 discrete categories, and the continuous dimensions of valence, arousal and dominance. The dataset contains 23,571 images and 34,320 annotated people; some of the images were collected manually from the internet using the Google search engine.

Get the dataset here.

6| FER-2013

The FER-2013 dataset consists of roughly 28,000 labelled images in the training set, 3,500 labelled images in the development set and 3,500 images in the test set. It was created by gathering the results of Google image searches for each emotion and its synonyms. Each image in FER-2013 is labelled with one of seven emotions: happy, sad, angry, afraid, surprise, disgust or neutral. Happy is the most prevalent emotion, so always predicting it yields a majority-class baseline of 24.4%.
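FER-2013 is commonly distributed as a CSV file in which each row stores a 48×48 grayscale image as a single space-separated pixel string alongside an integer emotion label. Assuming that format, a minimal parser might look like this (the function name and the exact label mapping shown are the commonly used convention, not something the article specifies):

```python
import numpy as np

# Commonly used FER-2013 integer-to-emotion mapping (assumption; verify
# against the copy of the dataset you download).
EMOTIONS = {0: "angry", 1: "disgust", 2: "fear", 3: "happy",
            4: "sad", 5: "surprise", 6: "neutral"}

def parse_pixels(pixel_str, size=48):
    """Convert a space-separated pixel string from a FER-2013-style
    CSV row into a (size, size) grayscale uint8 image array."""
    flat = np.array(pixel_str.split(), dtype=np.uint8)
    return flat.reshape(size, size)
```

Each parsed array can then be fed directly to an image pipeline, e.g. stacked into a batch tensor for training.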

Get the dataset here.

7| Google Facial Expression Comparison Dataset

The Google Facial Expression Comparison Dataset is a large-scale facial expression dataset consisting of face image triplets with human annotations that specify which two faces in each triplet form the most similar pair in terms of facial expression. The dataset is intended to support research in facial expression analysis, such as expression-based image retrieval, expression-based photo album summarisation, emotion classification and expression synthesis.
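The triplet comparison task can be sketched as follows: given a numeric expression embedding for each of the three faces, pick the pair whose embeddings are closest. The embeddings here are hypothetical placeholders; the dataset itself supplies human judgments, not vectors:

```python
import itertools
import numpy as np

def most_similar_pair(embeddings):
    """Return the indices (i, j) of the two faces in a triplet whose
    expression embeddings are closest in Euclidean distance."""
    pairs = list(itertools.combinations(range(3), 2))
    dists = [np.linalg.norm(np.asarray(embeddings[i]) - np.asarray(embeddings[j]))
             for i, j in pairs]
    return pairs[int(np.argmin(dists))]

# Toy 2-D "embeddings": faces 0 and 1 are near-identical, face 2 is far away
triplet = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
```

A model trained on the dataset would be scored on how often its closest pair agrees with the human annotation.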

Get the dataset here.

8| K-EmoCon

K-EmoCon is a multimodal dataset acquired from 32 subjects participating in 16 paired debates on a social issue. The dataset consists of physiological sensor data collected with three off-the-shelf wearable devices, audiovisual footage of participants during the debate, and continuous emotion annotations.

Get the dataset here.


Ambika Choudhury
A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.
