Real-Time Face Mask Detector With TensorFlow Object Detection

Harsh Goyal
  • Goal: build a model that detects, from a webcam or mobile camera feed, whether a person is wearing a face mask.

To recover from the losses caused by this global pandemic, the country is going through various stages of reopening. Face masks have become a vital element of our daily lives, so wearing them is important both for safety and for controlling the spread of the virus.

Our main focus is to detect whether a person is wearing a mask or not.


  • I had some experience with the TensorFlow Object Detection API, so I chose to use a model pre-trained on the COCO dataset. These pre-trained models work well for the 90 categories already in COCO (e.g., people, objects, animals). They are also very useful as a starting point for initializing a new model when training on brand-new datasets.
  • I used Google Colab to train my model with free GPU access. Training on a cloud GPU is a convenient option, since it offers more processing power than my local PC.
  • A Pro version of Colab is also available, which gives access to more RAM, longer notebook runtimes, and a higher probability of getting one of Colab's better GPUs.

Object Detection approach:

The object detection workflow comprises the steps below:

  1. Collecting the images to train and validate the object detection model.
  2. Preparing a TFRecord file for ingestion by the Object Detection API.
  3. Installing the TensorFlow Object Detection API.
  4. Setting the model config file.
  5. Running object detection training and evaluation.
  6. Exporting the model.

Collecting the images to train and validate the Object Detection model

  • I used a Kaggle face mask dataset that already includes annotations, so I did not have to spend extra time annotating the images. The dataset consists of 853 images belonging to three classes: with mask, mask worn incorrectly, and without mask.

Preparing Dataset 

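  • The dataset contains XML files that hold the bounding box information for each image. The annotations are in PASCAL VOC format, which can be converted into the TFRecord input that the TensorFlow Object Detection API (TFOD) reads.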


Generate CSV files

  • I split the dataset into train and test sets (8:2), keeping each image file together with its XML file.
  • The data from the XMLs was then imported into a CSV file. Each row corresponded to an annotation.

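import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            # One row per annotated object. The fields follow column_name below
            # and are looked up by their PASCAL VOC tag names.
            value = (root.find('filename').text,
                     int(root.find('size/width').text),
                     int(root.find('size/height').text),
                     member.find('name').text,
                     int(member.find('bndbox/xmin').text),
                     int(member.find('bndbox/ymin').text),
                     int(member.find('bndbox/xmax').text),
                     int(member.find('bndbox/ymax').text))
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df

def main():
    # Expects images/train and images/test, each holding images and their XML files.
    for folder in ['train', 'test']:
        image_path = os.path.join(os.getcwd(), 'images', folder)
        xml_df = xml_to_csv(image_path)
        xml_df.to_csv('images/' + folder + '_labels.csv', index=None)
        print('Successfully converted xml to csv.')

if __name__ == '__main__':
    main()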
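
The 8:2 split mentioned above is not shown in code. A minimal sketch of one way to do it, assuming all images and their XML files start out in a single folder and share a base name. The function name `split_dataset`, the `.png` extension, and the folder layout are illustrative assumptions, not part of the original project.

```python
import os
import random
import shutil


def split_dataset(src_dir, dst_dir, train_ratio=0.8, seed=42, image_ext='.png'):
    """Copy image/XML pairs from src_dir into dst_dir/train and dst_dir/test."""
    # Use the XML files as the list of samples; each needs a matching image.
    stems = sorted(os.path.splitext(f)[0] for f in os.listdir(src_dir)
                   if f.endswith('.xml'))
    random.Random(seed).shuffle(stems)  # fixed seed keeps the split reproducible
    n_train = int(len(stems) * train_ratio)
    splits = {'train': stems[:n_train], 'test': stems[n_train:]}
    for folder, names in splits.items():
        out_dir = os.path.join(dst_dir, folder)
        os.makedirs(out_dir, exist_ok=True)
        for stem in names:
            # Copy the image and its annotation together so they never separate.
            for ext in (image_ext, '.xml'):
                shutil.copy(os.path.join(src_dir, stem + ext), out_dir)
    return splits
```

Calling `split_dataset('all_images', 'images')` (both folder names hypothetical) would produce the `images/train` and `images/test` folders that the conversion script expects.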

Preparing a TFRecord file for ingesting in object detection API

  • Generate train.record and test.record files from train and test csv files respectively. TFRecord is the binary format which is used in TensorFlow Object Detection.

# From tensorflow/models/
# Create train data:
python --csv_input=images/train_labels.csv --image_dir=images/train --output_path=train.record
# Create test data:
python --csv_input=images/test_labels.csv --image_dir=images/test --output_path=test.record
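
Before anything is serialized, a TFRecord generator typically groups the CSV rows by image and scales the pixel coordinates to the [0, 1] range, because the Object Detection API stores normalized box coordinates. The sketch below shows only that preparation step, using the standard library. The function name `group_and_normalize` and the class names in `CLASS_IDS` are assumptions for illustration, and the actual serialization into `tf.train.Example` records is left out.

```python
import csv
import io
from collections import defaultdict

# Hypothetical class names and ids; they must match your own label map.
CLASS_IDS = {'with_mask': 1, 'without_mask': 2, 'mask_weared_incorrect': 3}


def group_and_normalize(csv_text):
    """Group CSV annotation rows by image and scale boxes to the [0, 1] range."""
    images = defaultdict(lambda: {'xmins': [], 'ymins': [], 'xmaxs': [],
                                  'ymaxs': [], 'classes': []})
    for row in csv.DictReader(io.StringIO(csv_text)):
        width, height = float(row['width']), float(row['height'])
        entry = images[row['filename']]
        # Horizontal coordinates are divided by width, vertical ones by height.
        entry['xmins'].append(int(row['xmin']) / width)
        entry['ymins'].append(int(row['ymin']) / height)
        entry['xmaxs'].append(int(row['xmax']) / width)
        entry['ymaxs'].append(int(row['ymax']) / height)
        entry['classes'].append(CLASS_IDS[row['class']])
    return dict(images)
```

Each entry in the returned dictionary then corresponds to one record in the output file: one image with all of its boxes and class ids.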

Installing the TensorFlow Object Detection API

  1. Read the TensorFlow Object Detection API installation documentation.
  2. Download Git for Windows.
  3. Clone the TensorFlow models repository (tensorflow/models) to your local system or to Colab.

git clone

Training the TFRecords

  • After creating the TFRecord files, the next step is to choose a pre-trained model from the TensorFlow model zoo.
  • After experimenting with a few models, I decided to use ssdlite_mobilenet_v2_coco, trained for 150k steps, as it offers faster inference and a good mean average precision (mAP).
  • I also trained ‘faster_rcnn_inception’ for 200k steps and got very good accuracy, but its speed was very low (1 fps).
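
For readers new to the metric: mAP is built on how well each predicted box overlaps a ground-truth box, measured as intersection over union (IoU). The following is a minimal sketch of IoU for two boxes given as (xmin, ymin, xmax, ymax). It is for illustration only and is not taken from the project's evaluation code; the function name `iou` is an assumption.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes get no intersection area.
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection usually counts as correct when its IoU with a ground-truth box of the same class exceeds a chosen threshold, commonly 0.5.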

Set the model config file

  • Before training the model, I had to make some changes to the ssdlite_mobilenet_v2_coco config file: update num_classes, fine_tune_checkpoint, and num_steps, and update input_path and label_map_path for both train_input_reader and eval_input_reader.
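
These edits can be made by hand in a text editor. On Colab it is often convenient to script them instead, so here is a minimal sketch that rewrites the fields listed above with regular expressions. The function name `update_config` and every path passed to it are assumptions. The sketch also assumes `input_path` appears exactly twice in the file, with the training reader first.

```python
import re


def update_config(text, num_classes, num_steps, checkpoint,
                  label_map, train_record, test_record):
    """Return the pipeline config text with the training fields replaced."""
    text = re.sub(r'num_classes: \d+', 'num_classes: %d' % num_classes, text)
    text = re.sub(r'num_steps: \d+', 'num_steps: %d' % num_steps, text)
    # Function replacements keep backslashes in paths from being read as escapes.
    text = re.sub(r'fine_tune_checkpoint: "[^"]*"',
                  lambda m: 'fine_tune_checkpoint: "%s"' % checkpoint, text)
    # label_map_path is the same file for both the train and eval readers.
    text = re.sub(r'label_map_path: "[^"]*"',
                  lambda m: 'label_map_path: "%s"' % label_map, text)
    # Assumption: input_path appears exactly twice, train reader first.
    records = iter([train_record, test_record])
    text = re.sub(r'input_path: "[^"]*"',
                  lambda m: 'input_path: "%s"' % next(records), text)
    return text
```

For this project the call would use 3 classes and 150000 steps, together with the train.record and test.record files created earlier. It is worth reading the resulting file once to confirm that each field landed where expected.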

Running the Object detection training and eval job

  • After setting the config file, I finally reached the core step: training the model. You can train your model either on your local system or on Colab.

python --logtostderr --train_dir=training/ --pipeline_config_path=training/YOUR_MODEL.config
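
Training periodically writes checkpoints into the training directory, and later steps such as exporting need the prefix of the most recent one. The sketch below finds it, assuming TF1-style checkpoint files named `model.ckpt-<step>.index`. The function name `latest_checkpoint_prefix` is hypothetical.

```python
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the model.ckpt-<step> prefix with the highest step, or None."""
    steps = []
    for name in os.listdir(train_dir):
        # Each checkpoint has one .index file; use it to read the step number.
        match = re.match(r'model\.ckpt-(\d+)\.index$', name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    # Compare steps as integers so that 2000 ranks above 999.
    return os.path.join(train_dir, 'model.ckpt-%d' % max(steps))
```

Comparing the step numbers as integers matters here, because a plain string sort would rank `model.ckpt-999` after `model.ckpt-2000`.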

Exporting the trained Inference Graph

After training was complete, I exported the newly trained inference graph with the TensorFlow Object Detection API. This graph is what performs the object detection.

python object_detection/ --input_type image_tensor --pipeline_config_path {pipeline_fname} --trained_checkpoint_prefix {last_model_path} --output_directory training/exported_graph

This process generates a .pb file that contains the graph definition as well as all the weights for the model. With this file we can run the trained model.
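
When the exported graph is run on a frame, the Object Detection API returns boxes as normalized [ymin, xmin, ymax, xmax] values, along with a score and a class id for each detection. Below is a minimal sketch of the post-processing a real-time viewer would need before drawing anything. It assumes the outputs have already been converted to plain Python lists; the function name `filter_detections` and the 0.5 threshold are illustrative choices.

```python
def filter_detections(boxes, scores, classes, image_width, image_height,
                      min_score=0.5):
    """Keep detections above min_score and convert boxes to pixel coordinates.

    boxes holds [ymin, xmin, ymax, xmax] entries normalized to [0, 1].
    """
    results = []
    for box, score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue  # skip low-confidence detections
        ymin, xmin, ymax, xmax = box
        results.append({
            'class_id': int(class_id),
            'score': float(score),
            # Pixel box as (left, top, right, bottom), ready for drawing.
            'box': (int(round(xmin * image_width)),
                    int(round(ymin * image_height)),
                    int(round(xmax * image_width)),
                    int(round(ymax * image_height))),
        })
    return results
```

Each remaining entry can then be drawn on the frame, with the class id looked up in the label map to pick the caption.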

You can check out my GitHub repo for real-time face mask detection and watch the live demo in my LinkedIn post.


Copyright Analytics India Magazine Pvt Ltd
