The onset of the pandemic led to increased use of computer vision for identifying patients, tracking their movements and enabling other contactless precautionary measures. To detect faces, algorithms take cues from the landmarks around the eyes, nose and mouth. Now that the bottom half of these facial features is hidden by masks, facial recognition technology is facing new challenges.
A preliminary study by the National Institute of Standards and Technology (NIST), which evaluated 89 of the best commercial facial recognition algorithms, found increased error rates when matching photos of people with digitally applied face masks against photos of the same person without a mask. The question now is: what happens to those facial recognition systems that were trained on unmasked faces in the pre-pandemic world?
Addressing these challenges, Vladimir Iglovikov, in his talk on the second day of CVDC 2020, presented attendees with methods for building models that detect masked faces. Vladimir is a Senior Computer Vision Engineer at Lyft, a Kaggle Grandmaster and a veteran of the Russian airborne forces. He currently applies deep learning techniques to computer vision problems at Lyft's Level5 Engineering Centre, which is focussed on the development of self-driving cars.
In this talk, Vladimir brought his valuable expertise in computer vision to the fore and introduced attendees to the do's and don'ts of building ML models. He doubled down on the ownership of models and why it is important to treat each model as a project in its own right. This hour-long talk was filled with nuggets of wisdom for those in the early stages of their machine learning journey.
About The Model
Before setting out to build a machine learning model that detects masked faces, says Vladimir, one should first build a model that detects unmasked faces well. In other words, first build a good face detector, then move on to the masked variants.
Talking about the model for face detection, Vladimir gave an overview of the RetinaFace model that has set new benchmarks for face detection.
RetinaFace is a robust single-stage face detector that performs pixel-wise face localisation on faces of various scales by taking advantage of joint extra-supervised and self-supervised multi-task learning. According to the authors of the paper, RetinaFace makes the following contributions:
- Manually annotating five facial landmarks on the WIDER FACE dataset, observing significant improvement in hard face detection with the assistance of this extra supervision signal.
- Adding a self-supervised mesh decoder branch that predicts pixel-wise 3D face shape information in parallel with the existing supervised branches.
The WIDER FACE dataset, which contains 32,203 images and 393,703 labelled faces, was used for the experiments, and the researchers claim that RetinaFace outperformed the state of the art.
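The joint multi-task training described above boils down to a weighted sum of per-task losses. The sketch below is illustrative, not the authors' implementation: the function name is hypothetical, and the default weights follow the loss-balancing parameters reported in the RetinaFace paper (0.25, 0.1, 0.01), which should be treated as a starting point.

```python
def retinaface_style_loss(cls_loss, box_loss, landmark_loss, mesh_loss,
                          w_box=0.25, w_pts=0.1, w_mesh=0.01):
    """Combine per-task losses into one training objective (illustrative sketch).

    cls_loss:      face vs. background classification loss
    box_loss:      bounding-box regression loss
    landmark_loss: five-point facial landmark regression loss
    mesh_loss:     self-supervised dense 3D mesh (pixel-wise) loss
    Default weights follow the balancing parameters reported in the
    RetinaFace paper; they are assumptions for this sketch.
    """
    return cls_loss + w_box * box_loss + w_pts * landmark_loss + w_mesh * mesh_loss

# Example with made-up per-step loss values:
total = retinaface_style_loss(0.8, 1.2, 0.5, 2.0)
print(round(total, 3))  # 0.8 + 0.3 + 0.05 + 0.02 = 1.17
```

Keeping the self-supervised mesh term's weight small lets the dense 3D branch act as extra supervision without dominating the detection objective.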
Here are a few recommendations from Vladimir for those who have built their model:
- Publish the code on GitHub. Making code public keeps the model owner in check and adds the responsibility to make it better.
- Add code formatters and style checkers.
- Create a clear README document.
- Create a Google Colab notebook with an example.
- Package it as a library and upload it to PyPI.
- Build a web app.
- Write a blog post.
- Create a video with a demo.
If these tips are followed, states Vladimir, the work has a better chance of being cited than with traditional methods.
This talk has great significance in the current times, especially during the COVID-19 pandemic, when many new norms are being established across the globe. Wearing masks is one such norm that has been embraced by many, if reluctantly. Though facial recognition technology has been hitting roadblocks of late, its use is crucial in vulnerable places like airports, which makes building an accurate detector even more important.
To know more about RetinaFace, click here.