Open-source machine learning platform TensorFlow has announced that it is adding iris tracking to its face mesh package. Iris tracking comes to the package through the TensorFlow.js face landmark detection model.
It must be noted that the face mesh package was introduced in TensorFlow.js in March this year. The package uses a single camera to derive approximate 3D facial surface geometry from an image or a video stream, without requiring a depth sensor. At launch, it could locate the eyes, nose, lips (along with lip contours), and facial silhouette. With the addition of iris tracking, it can now also detect eye movements, including blinking. Iris tracking is implemented through the MediaPipe iris model, which is likewise an open-source ML model with no requirement for additional hardware.
Iris Landmark Tracking
Iris tracking, especially on mobile devices, is a challenging task. There are several hurdles to overcome, such as constrained computing resources, occlusions such as hair strands or squinting eyes, and variable lighting conditions. Most often, separate hardware is deployed for this function, including expensive headsets and remote eye-tracking systems. High cost is not the only problem with dedicated hardware; its bulk also makes it impractical to use with mobile devices.
However, with the new announcement from TensorFlow, users can upgrade to the new face landmark detection model with just a few code changes and no additional hardware installations. The package can be installed in two ways: by using script tags or via NPM.
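As a sketch, the two installation routes look roughly like this (package names as published in the TensorFlow.js models repository):

```shell
# Option 1: install via NPM, alongside the core TensorFlow.js packages
npm install @tensorflow/tfjs-core @tensorflow/tfjs-converter
npm install @tensorflow-models/face-landmarks-detection

# Option 2: load via script tags in an HTML page:
# <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-core"></script>
# <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-converter"></script>
# <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/face-landmarks-detection"></script>
```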
Further, this new model offers three significant improvements: iris keypoints detection, better eyelid contour detection, and improved detection for rotated faces.
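To illustrate what eyelid and iris keypoints make possible, a common blink-detection heuristic is the eye aspect ratio (EAR): the vertical eyelid gap divided by the horizontal eye width, which drops toward zero when the eye closes. This is a generic technique layered on top of landmark output, not part of the model itself, and the keypoints and threshold below are invented for illustration.

```javascript
// Eye aspect ratio (EAR) blink heuristic: ratio of the vertical eyelid
// gap to the horizontal eye width. Values near 0 indicate a closed eye.
// The keypoints are illustrative [x, y] pairs, not real model output.
function dist(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1]);
}

// left/right: horizontal eye corners; top/bottom: upper/lower eyelid points
function eyeAspectRatio({ left, right, top, bottom }) {
  return dist(top, bottom) / dist(left, right);
}

// Hypothetical threshold; real systems tune this per camera and user.
function isBlinking(eye, threshold = 0.2) {
  return eyeAspectRatio(eye) < threshold;
}

const openEye   = { left: [0, 0], right: [30, 0], top: [15, -6], bottom: [15, 6] };
const closedEye = { left: [0, 0], right: [30, 0], top: [15, -1], bottom: [15, 1] };

console.log(isBlinking(openEye));   // EAR = 12/30 = 0.4 → false
console.log(isBlinking(closedEye)); // EAR = 2/30 ≈ 0.067 → true
```

Tracking the EAR across video frames, rather than thresholding a single frame, makes the heuristic robust to momentary landmark jitter.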
As discussed previously, this package removes the need for separate hardware, making it compatible with mobile devices. At only 3MB of weights, it is a lightweight package well suited to real-time inference on mobile devices.
As part of future enhancements, the TensorFlow.js and MediaPipe teams plan to add depth estimation capabilities to face landmark detection using the improved iris coordinates. The teams will also release the code to facilitate reproducible research and further use by the developer community.
The full blog from TensorFlow can be found here.
About MediaPipe Iris
MediaPipe Iris was announced in August this year as a machine learning model for accurate iris estimation, built for use on modern mobile phones, desktops, laptops, and over the web. Along with tracking landmarks for the iris, pupil, and eye contour, the model also showcased the ability to determine the metric distance between the subject and the camera. It demonstrated an error rate of just 10% without the use of a depth sensor. This was possible because the horizontal iris diameter of the human eye remains roughly constant across different populations, which, combined with some simple geometric arguments, yields the distance.
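The distance estimate follows from the pinhole camera model: if the physical iris diameter is effectively constant, its apparent size in pixels is inversely proportional to the subject's distance. The sketch below is an illustrative reconstruction of that geometry, not MediaPipe's actual implementation; the 11.7 mm iris diameter is the commonly cited human average, and the focal length and pixel measurement are made-up example values.

```javascript
// Pinhole-camera depth estimate from the iris: since the physical
// horizontal iris diameter is roughly constant across people (~11.7 mm),
// its apparent pixel size reveals the subject-to-camera distance.
// Illustrative sketch only, not MediaPipe's implementation.

const IRIS_DIAMETER_MM = 11.7; // approximately constant across populations

// focalLengthPx: camera focal length in pixels (from calibration or EXIF)
// irisDiameterPx: measured horizontal iris diameter in the image
function subjectDistanceMm(focalLengthPx, irisDiameterPx) {
  return (focalLengthPx * IRIS_DIAMETER_MM) / irisDiameterPx;
}

// Example with made-up numbers: a 1000 px focal length and a 23.4 px iris
// put the subject at (1000 * 11.7) / 23.4 ≈ 500 mm, i.e. half a metre.
console.log(subjectDistanceMm(1000, 23.4));
```

The same relation explains the quoted error rate: a variation in true iris diameter translates directly into a proportional error in the estimated distance.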
The model was built upon the previous work on 3D Face Meshes from which the eye region of the original image was isolated for use in the iris tracking system. The problem was broadly divided into two parts — eye contour estimation and iris location.
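One way to picture the eye-region isolation step is as a padded bounding box around the eye-contour landmarks produced by the face mesh, so the iris model only ever sees a tight eye crop. The function and landmark values below are a toy sketch under that assumption, not the pipeline's real cropping logic.

```javascript
// Toy sketch of eye-region isolation: compute a padded bounding box
// around the eye-contour landmarks so the iris model receives only
// the eye crop. Landmarks are invented [x, y] pairs, not real output.
function eyeRegionBox(landmarks, padding = 0.25) {
  const xs = landmarks.map(p => p[0]);
  const ys = landmarks.map(p => p[1]);
  const minX = Math.min(...xs), maxX = Math.max(...xs);
  const minY = Math.min(...ys), maxY = Math.max(...ys);
  const padX = (maxX - minX) * padding;
  const padY = (maxY - minY) * padding;
  return {
    x: minX - padX,
    y: minY - padY,
    width:  (maxX - minX) * (1 + 2 * padding),
    height: (maxY - minY) * (1 + 2 * padding),
  };
}

const eyeContour = [[100, 50], [120, 44], [140, 50], [120, 56]];
console.log(eyeRegionBox(eyeContour));
// { x: 90, y: 41, width: 60, height: 18 }
```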
A multi-task model, consisting of a unified encoder with separate components for each task, was designed to make use of task-specific training data. The model was trained on 50,000 manually annotated images covering a variety of illumination conditions and head poses from geographically diverse regions.
A detailed blog on this can be read here.