To mark its 20th anniversary, OpenCV introduced the OpenCV AI Kit (OAK), an MIT-licensed kit based on Intel’s Myriad X. By integrating spatial AI, the OAK board is meant to accelerate computer vision functions such as object detection, measuring the distance between objects, image classification and people counting, to name a few.
The project began as a technology solution to a life-safety problem, road accidents involving bikers, and later expanded as the team gained a better understanding of how to combine depth perception with artificial intelligence and embed that combination in a product. The resulting kit consists of hardware, software and AI training that will help companies add powerful CV capabilities to their products.
OpenCV AI Kit — Building The Platform
Built around Intel’s Movidius Myriad X chip, OAK is made up of two hardware components, OAK-1 and OAK-D, and a software component, the API software. While the single camera of OAK-1 can perform neural inference for image classification and object detection, the multiple camera units of OAK-D, with 4K resolution at 60fps, provide advanced 3D capabilities for real-time depth detection.
Computer vision functions typically demand powerful processors and a lot of computing power. OpenCV, however, claims that the OAK board consumes less power and is the fastest and most reliable way to integrate spatial AI into projects. Further, because these embedded AI chips run the deep learning models on the device itself, they avoid the security and latency issues that could arise from cloud infrastructure.
Unlike other Myriad X solutions, the OAK device leverages the full potential of the vision processing unit (VPU), and this open-source solution has immense potential for the OpenCV community. Not only can this solution prove beneficial for developing industrial smart cameras, but it can also help in creating spatial AI projects for preventing road accidents.
Furthermore, the OAK board comes with neural networks covering specific functions such as detecting whether people are wearing a mask, estimating their age, detecting facial features like the chin and mouth, and detecting pedestrians. Users can also personalise these tasks by training their own deep learning models on available datasets and using OpenVINO to deploy them.
Both hardware components support Linux, macOS and Windows hosts, making them versatile enough for any prototyping flow despite their small size. A node-based editor for the device makes multi-step spatial AI processes, such as crops and zooms, simple to set up. And, thanks to the API software, the CV tasks can be combined in an arbitrary order.
Using OAK-D For Spatial AI Results
According to the company’s Kickstarter campaign document, there are two ways to get accurate spatial AI results from OAK: monocular neural inference and stereo neural inference. While monocular neural inference runs on a single camera and is fused with stereo depth, stereo neural inference runs on OAK-D’s multiple cameras to provide 3D positioning data to users.
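To make the monocular approach more concrete, here is a minimal sketch of how a 2D detection might be fused with a depth map to obtain a 3D position. It is not OAK's actual implementation. The function name locate_in_3d, the pinhole camera parameters (fx, fy, cx, cy) and the use of the median depth inside the bounding box are all assumptions made for illustration, and the depth map is assumed to be in metres and already aligned with the image the detector ran on.

```python
import numpy as np


def locate_in_3d(depth_m, bbox, fx, fy, cx, cy):
    """Fuse a 2D bounding box with an aligned depth map (hypothetical helper).

    depth_m : 2D array of depth values in metres, 0 where depth is unknown
    bbox    : (x_min, y_min, x_max, y_max) in pixels, from any 2D detector
    fx, fy  : focal lengths in pixels (assumed pinhole camera model)
    cx, cy  : principal point in pixels
    Returns (X, Y, Z) in metres, or None if the box has no valid depth.
    """
    x_min, y_min, x_max, y_max = bbox
    roi = depth_m[y_min:y_max, x_min:x_max]
    valid = roi[roi > 0]  # ignore pixels without a depth reading
    if valid.size == 0:
        return None

    # The median is less sensitive than the mean to background pixels
    # that fall inside the bounding box.
    z = float(np.median(valid))

    # Back-project the centre of the box through the pinhole model.
    u = (x_min + x_max) / 2.0
    v = (y_min + y_max) / 2.0
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z


# Synthetic example: a flat scene two metres away and one detection.
depth = np.full((480, 640), 2.0)
position = locate_in_3d(depth, (400, 220, 440, 260), 500.0, 500.0, 320.0, 240.0)
print(position)
```

The detector itself only ever sees a 2D image. The third dimension comes entirely from the depth map, which is why a network trained on ordinary 2D data can be used in such a scheme.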
Example of stereo neural inference running on OAK-D
This example shows face detection running in parallel on both cameras of the device.
What’s more, the company claims that in both cases the spatial AI results can be obtained from standard neural networks trained on 2D data. OAK-D has been designed to provide accurate 3D results even when the model is trained on 2D datasets.
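One way to picture how a 2D-trained network can yield 3D results in the stereo case is classic triangulation: run the same detector on the left and right images, find the same feature in both, and convert the horizontal shift between them (the disparity) into depth. The sketch below illustrates that general technique and should not be read as OAK's actual pipeline. The function name triangulate, the rectified-camera assumption and the example focal length and baseline values are all hypothetical.

```python
def triangulate(left_px, right_px, focal_px, baseline_m, cx, cy):
    """Triangulate one feature seen by two rectified cameras (hypothetical helper).

    left_px, right_px : (u, v) pixel position of the same feature, for
                        example a facial landmark, in the left and right image
    focal_px          : focal length in pixels (assumed equal for both cameras)
    baseline_m        : distance between the two cameras in metres
    cx, cy            : principal point of the left camera in pixels
    Returns (X, Y, Z) in metres in the left camera frame, or None if the
    disparity is not positive.
    """
    (u_left, v_left), (u_right, _) = left_px, right_px

    # In rectified images a nearer object shifts further between the views.
    disparity = u_left - u_right
    if disparity <= 0:
        return None

    z = focal_px * baseline_m / disparity  # depth from disparity
    x = (u_left - cx) * z / focal_px
    y = (v_left - cy) * z / focal_px
    return x, y, z


# Assumed values: 800 px focal length, 7.5 cm baseline, and a landmark
# detected 30 px apart in the two images.
print(triangulate((660, 400), (630, 400), 800.0, 0.075, 640.0, 400.0))
```

Because depth is inversely proportional to disparity, a one-pixel detection error matters far more for distant objects than for nearby ones, so the accuracy of the 2D detector directly limits the accuracy of the 3D result.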
Spatial AI can be exceptionally beneficial not just for detecting objects but also for measuring the distance and depth involved. The project is offering early-bird discounts to backers interested in such a solution, and urges more companies to build smart and innovative projects that combine the power of neural inference with depth perception.