10 Face Datasets To Start Facial Recognition Projects

Facial recognition is a major research area, and governments and organisations have been adopting it for several years. Leading phone makers such as Apple and Samsung have been integrating the technology into their smartphones to strengthen device security. According to market research, the facial recognition market is expected to reach $9.6 billion by 2020.

In this article, we list 10 face datasets that can be used to start facial recognition projects.

(The datasets are listed by year of publication, most recent first.)

1| Flickr-Faces-HQ Dataset (FFHQ)

Flickr-Faces-HQ Dataset (FFHQ) is a dataset of human faces. It includes more variation than the CelebA-HQ dataset in terms of age, ethnicity and image background, and has much better coverage of accessories such as eyeglasses, sunglasses and hats. The images were crawled from Flickr and then automatically aligned and cropped.

Size: The dataset consists of 70,000 high-quality PNG images at 1024×1024 resolution.
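When planning storage and memory for a dataset of this size, it can help to estimate how large the images become once decoded. The sketch below assumes 8-bit RGB pixels (three bytes per pixel), which is an assumption and not something stated in the dataset description; `uncompressed_bytes` is a hypothetical helper name.

```python
def uncompressed_bytes(num_images, width, height, channels=3, bytes_per_channel=1):
    """Size of the decoded pixel data, ignoring any file-format overhead."""
    return num_images * width * height * channels * bytes_per_channel


# Assumption: 8-bit RGB, i.e. 3 channels of 1 byte each.
total = uncompressed_bytes(70_000, 1024, 1024)
print(f"{total / 1024**3:.1f} GiB")  # 205.1 GiB
```

The PNG files on disk are compressed, so the download is smaller than this figure, but the estimate shows why the images are usually streamed in batches and not loaded all at once.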

Projects: This dataset was originally created as a benchmark for generative adversarial networks (GANs).

Publication Year: 2019

Download here.

2| Tufts-Face-Database

Tufts Face Database is the most comprehensive, large-scale face dataset that contains 7 image modalities: visible, near-infrared, thermal, computerised sketch, LYTRO, recorded video, and 3D images. 

Size: The dataset contains over 10,000 images of 74 females and 38 males from more than 15 countries, with ages ranging from 4 to 70 years.

Projects: The database is available to researchers worldwide for benchmarking facial recognition algorithms on sketches, thermal, NIR and 3D images, as well as for heterogeneous face recognition.

Publication Year: 2019

Download here.

3| Real and Fake Face Detection

This dataset contains expert-generated, high-quality photoshopped face images. Each image is a composite of different faces, split by eyes, nose, mouth or whole face.

Size: The size of the dataset is 215MB.

Projects: This dataset can be used to train models that distinguish real face images from fake ones.

Publication Year: 2019

Download here.

4| Google Facial Expression Comparison Dataset

This dataset by Google is a large-scale facial expression dataset that consists of face image triplets along with human annotations that specify which two faces in each triplet form the most similar pair in terms of facial expression.
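Since each triplet carries human annotations naming the most similar pair, a common first step is to reduce several annotators' answers to one label. The following is a minimal sketch of a strict-majority vote. The representation (each vote as a pair of indices into the triplet) and the name `majority_pair` are assumptions made for illustration, not the dataset's actual file layout.

```python
from collections import Counter


def majority_pair(votes):
    """Return the pair picked by a strict majority of annotators, else None.

    Each vote is a pair of indices (0, 1 or 2) into the triplet; the order
    inside a pair does not matter, so (1, 0) counts the same as (0, 1).
    """
    if not votes:
        return None
    counts = Counter(tuple(sorted(v)) for v in votes)
    pair, n = counts.most_common(1)[0]
    return pair if 2 * n > len(votes) else None


# Hypothetical votes from three annotators for one triplet.
print(majority_pair([(0, 1), (1, 0), (1, 2)]))  # (0, 1)
```

Triplets without a clear majority return `None` here, so they can be dropped or handled separately.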

Size: The size of the dataset is 200MB, which includes 500K triplets and 156K face images.

Projects: The dataset is intended to aid researchers working on topics related to facial expression analysis such as expression-based image retrieval, expression-based photo album summarisation, emotion classification, expression synthesis, etc.

Publication Year: 2018

Download here.

5| Face Images With Marked Landmark Points

Face Images with Marked Landmark Points is a Kaggle dataset for predicting keypoint positions on face images.

Size: The size of the dataset is 497MB and it contains 7,049 facial images with up to 15 keypoints marked on each.

Projects: This dataset can be used as a building block in several applications, such as tracking faces in images and video, analysing facial expressions, detecting dysmorphic facial signs for medical diagnosis, and biometrics or facial recognition.
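Because an image may have fewer than 15 keypoints marked, an evaluation metric has to skip the unlabelled ones. Below is a minimal sketch of a root-mean-square error that does so. Representing a missing keypoint as `None` and the name `masked_rmse` are assumptions for illustration; the dataset's own file format may differ.

```python
import math


def masked_rmse(pred, target):
    """RMSE over coordinates of labelled keypoints only.

    pred:   list of (x, y) predictions, one per keypoint slot.
    target: list of (x, y) ground-truth points, or None where unlabelled.
    """
    squared_errors = []
    for p, t in zip(pred, target):
        if t is None:  # keypoint not marked on this image
            continue
        squared_errors.append((p[0] - t[0]) ** 2)
        squared_errors.append((p[1] - t[1]) ** 2)
    if not squared_errors:
        raise ValueError("no labelled keypoints to score")
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Two keypoint slots; the second one is unlabelled and is ignored.
print(masked_rmse([(1.0, 4.0), (9.0, 9.0)], [(1.0, 2.0), None]))
```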

Publication Year: 2018

Download here.

6| Labelled Faces in the Wild (LFW) Dataset

Labelled Faces in the Wild (LFW) dataset is a database of face photographs designed for studying the problem of unconstrained face recognition. Labelled Faces in the Wild is a public benchmark for face verification, also known as pair matching. 
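Face verification, or pair matching, means deciding whether two images show the same person. One common way to do this is to compare embedding vectors and threshold their similarity, as in the minimal sketch below. The embeddings, the threshold of 0.5 and the function names are all hypothetical; this is not the LFW benchmark's official protocol.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("zero-length embedding")
    return dot / (norm_a * norm_b)


def same_person(emb_a, emb_b, threshold=0.5):
    """Declare a match when similarity reaches the (hypothetical) threshold."""
    return cosine_similarity(emb_a, emb_b) >= threshold


def verification_accuracy(pairs, labels, threshold=0.5):
    """Fraction of pairs whose predicted match/mismatch equals the label."""
    correct = sum(
        same_person(a, b, threshold) == label
        for (a, b), label in zip(pairs, labels)
    )
    return correct / len(pairs)
```

In practice the embeddings would come from a trained face model, and the threshold would be tuned on held-out pairs.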

Size: The size of the dataset is 173MB and it consists of over 13,000 images of faces collected from the web.

Projects: The dataset can be used for face verification and other forms of face recognition.

Publication Year: 2018

Download here.

7| UTKFace Large Scale Face Dataset

UTKFace dataset is a large-scale face dataset with a long age span, ranging from 0 to 116 years old. The images cover large variations in pose, facial expression, illumination, occlusion, resolution and more.

Size: The dataset consists of over 20K images with annotations of age, gender and ethnicity.
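The age, gender and ethnicity annotations are usually read from the image file names. The sketch below assumes a naming convention of the form `age_gender_race_timestamp.jpg` with integer codes; treat that convention, the example file name and the helper `parse_label` as assumptions to verify against the dataset's own documentation.

```python
from pathlib import Path
from typing import NamedTuple


class FaceLabel(NamedTuple):
    age: int
    gender: int     # integer code; meaning defined by the dataset docs
    ethnicity: int  # integer code; meaning defined by the dataset docs


def parse_label(filename):
    """Parse 'age_gender_race_timestamp.jpg' (assumed convention) into a label."""
    parts = Path(filename).name.split("_")
    if len(parts) < 4:
        raise ValueError(f"unexpected file name: {filename!r}")
    age, gender, ethnicity = (int(p) for p in parts[:3])
    return FaceLabel(age, gender, ethnicity)


# Hypothetical file name used only to illustrate the assumed pattern.
print(parse_label("25_1_2_20170116174525125.jpg"))
```

File names that do not match the pattern raise `ValueError`, which makes malformed entries easy to skip or log.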

Projects: The dataset can be used for a variety of tasks such as face detection, age estimation, age progression, age regression and landmark localisation.

Publication Year: 2017

Download here.

8| YouTube Faces Dataset with Facial Keypoints

This dataset is a processed version of the YouTube Faces Dataset, which contains short, publicly available videos of celebrities downloaded from YouTube. There are multiple videos of each celebrity (up to 6 videos per celebrity).

Size: The size of the dataset is 10GB, and it includes approximately 1,293 videos, with up to 240 consecutive frames taken from each original video. In total, there are 155,560 single-image frames.

Projects: This dataset can be used to recognise faces in unconstrained videos.

Publication Year: 2017

Download here.

9| Large-scale CelebFaces Attributes (CelebA) Dataset

CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. 

Size: The dataset contains 10,177 identities and 202,599 face images, each annotated with 5 landmark locations and 40 binary attributes.
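Binary attribute annotations like these are often stored as one text row per image. The sketch below parses such a row into a dictionary of booleans. It assumes each row holds an image name followed by one `1` or `-1` per attribute; that layout, the three attribute names shown and the helper `parse_attr_line` are assumptions to check against the downloaded files.

```python
def parse_attr_line(line, names):
    """Turn 'image_id v1 v2 ...' into (image_id, {attribute: bool}).

    Assumes '1' marks a present attribute and '-1' an absent one.
    """
    fields = line.split()
    image_id, raw = fields[0], fields[1:]
    if len(raw) != len(names):
        raise ValueError("attribute count does not match the header")
    return image_id, {name: value == "1" for name, value in zip(names, raw)}


# Three illustrative attribute names instead of the full set of 40.
names = ["Eyeglasses", "Smiling", "Wearing_Hat"]
image_id, attrs = parse_attr_line("000001.jpg -1 1 -1", names)
print(image_id, attrs)
```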

Projects: The dataset can be employed as training and testing sets for the following computer vision tasks: face attribute recognition, face detection, landmark (or facial part) localisation, and face editing & synthesis. 

Publication Year: 2015

Download here.

10| Yale Face Database

The Yale Face Database contains 165 grayscale images in GIF format of 15 individuals. There are 11 images per subject, one per different facial expression or configuration: centre-light, w/glasses, happy, left-light, w/no glasses, normal, right-light, sad, sleepy, surprised, and wink.
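With 15 subjects and 11 conditions each, a simple evaluation protocol is to hold out one condition for testing and train on the rest. The following is a minimal sketch of such a split; the protocol and the name `split_by_condition` are assumptions for illustration, not an official benchmark setup.

```python
CONDITIONS = [
    "centre-light", "w/glasses", "happy", "left-light", "w/no glasses",
    "normal", "right-light", "sad", "sleepy", "surprised", "wink",
]
SUBJECTS = range(1, 16)  # 15 individuals


def split_by_condition(held_out):
    """Split (subject, condition) pairs, keeping one condition for testing."""
    if held_out not in CONDITIONS:
        raise ValueError(f"unknown condition: {held_out!r}")
    train, test = [], []
    for subject in SUBJECTS:
        for condition in CONDITIONS:
            target = test if condition == held_out else train
            target.append((subject, condition))
    return train, test


train, test = split_by_condition("wink")
print(len(train), len(test))  # 150 15
```

Repeating the split for each of the 11 conditions gives a leave-one-condition-out evaluation over all 165 images.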

Size: The size of the dataset is 6.4MB. The related Yale Face Database B is a separate dataset that contains 5,760 single-light-source images of 10 subjects, each seen under 576 viewing conditions.

Projects: The dataset can be used for facial recognition, doppelganger list comparison, etc.

Publication Year: 2001

Download here.

Ambika Choudhury

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.