Augmented reality (AR) is expected to affect businesses across all industries, changing the way we learn, make data-driven decisions and interact with the physical world. Machine learning (ML) is a key driver in pushing the AR industry forward. In AR, ML is being utilised to solve detection problems on top of camera tracking. Several large tech companies such as Google, Microsoft, Facebook and Amazon are leading the development of the underlying technology and integrating AI with AR for various use cases.
The next generation of AR can create much more personal and intimate experiences: the computing environment places digital objects in the real world, where users can interact with them and be present together. Companies such as Facebook have been focused on creating technology that shapes the next generation of computing around people and the ways we naturally interact with each other.
AI and ML In Augmented Reality
In the last few years alone, billions of people have used AR features on social media platforms, including Facebook and Snapchat. Facebook made Spark AR Studio available on Windows and macOS and opened Spark AR on Instagram so that anyone can build for it. Companies are now using ML for tasks such as inferring approximate 3D surface geometry to enable visual effects, using only a single camera input and no dedicated depth sensor.
AI combined with AR has also been explored in car insurance: you walk up to any car, hold your phone up to it, and the app identifies the make and model of the vehicle. It then connects to the company's APIs and tells you the rate and monthly payment you would be eligible for.
How Can Machine Learning Be Integrated Into AR
Several pre-trained ML models can be used in augmented reality apps. For example, ResNet and similar models are optimised for computer vision tasks such as image classification, and are commonly used as backbones for object detection. These models are trained to recognise classes of objects, not just one particular object.
In augmented reality applications, machine learning is used for three levels of image processing. The first is image classification, which tells you what is in the image. The second is object detection, which draws a bounding box around each object in the image. The third is image masking (segmentation), which gives you an exact outline of the objects in an image.
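The relationship between the last two levels can be illustrated with a small sketch. It assumes a mask is a grid of 0s and 1s (1 where a pixel belongs to the object); the names `mask_to_box`, `mask_area` and `example_mask` are made up for this illustration and do not come from any particular SDK. A mask carries strictly more information than a box: the box can be derived from the mask, but not the other way round.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]            # 1 where the pixel belongs to the object, else 0
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive pixel indices


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Return the tightest box around all foreground pixels, or None if the mask is empty."""
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def mask_area(mask: Mask) -> int:
    """Count the pixels that belong to the object."""
    return sum(1 for row in mask for value in row if value)


# Hypothetical 6x5 mask of a roughly diamond-shaped object.
example_mask: Mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

box = mask_to_box(example_mask)
area = mask_area(example_mask)
print("bounding box:", box)
print("object pixels:", area)
```

In this example the box covers 12 pixels while the object itself covers only 8, which is why masking is preferred when an effect has to follow the object's outline.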
There are native solutions for running ML models on Android and iOS, but image masking or object detection on its own only gives spatial information about the detections in 2D space. Now, suppose image masking works on mobile alongside an existing AR SDK that does ground plane detection. In that case, you can infer the position of the object in 3D space, which enables object occlusion or adding colliders for physical interactions. By combining AR technology with AI models in this way, developers are building applications that involve physical interaction between detected objects and a 3D space.
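The geometry behind this step can be sketched as a ray-plane intersection. The sketch below is an assumption-laden illustration, not the method of any specific AR SDK (which would normally expose its own hit-test call): it assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), camera coordinates with x right, y down and z forward, and a ground plane supplied as a point and a normal. All function names and numbers are hypothetical.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def pixel_to_ray(u: float, v: float, fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Direction of the camera ray through pixel (u, v), in camera coordinates."""
    return ((u - cx) / fx, (v - cy) / fy, 1.0)


def intersect_ground(ray: Vec3, plane_point: Vec3, plane_normal: Vec3) -> Optional[Vec3]:
    """Intersect a ray starting at the camera origin with the ground plane.

    Returns None if the ray is parallel to the plane or points away from it.
    """
    denom = dot(ray, plane_normal)
    if abs(denom) < 1e-9:
        return None
    t = dot(plane_point, plane_normal) / denom
    if t <= 0:
        return None
    return (ray[0] * t, ray[1] * t, ray[2] * t)


# Assumed values: a 640x480 camera held level, 1.5 m above the detected ground plane.
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0
ground_point: Vec3 = (0.0, 1.5, 0.0)   # y points down, so the ground is at y = +1.5
ground_normal: Vec3 = (0.0, 1.0, 0.0)

# Bottom-centre pixel of a detected object's 2D box or mask.
ray = pixel_to_ray(320.0, 490.0, fx, fy, cx, cy)
anchor = intersect_ground(ray, ground_point, ground_normal)
print("object stands at (camera coordinates, metres):", anchor)
```

The bottom-centre pixel of the detection is used because that is where the object is most likely to touch the ground. The resulting 3D point is where a collider or an occluding proxy could be placed.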
Much of the tooling is also open source. Suppose, for example, that of all the tech companies you choose Google's stack. You can adapt an existing pre-trained model to your own data using transfer learning on Google Cloud Platform. TensorFlow Lite, an open-source deep learning framework for on-device inference, is particularly useful for building AR apps, especially if you want to maximise the performance of a machine learning model running on a smartphone.
And then, of course, there is ML Kit, which helps developers create an app without necessarily having to write their own model. Its APIs provide pre-built capabilities such as OCR, face detection and more.
There are also tools such as ARCore, Google's cross-platform AR SDK, which performs tasks like motion tracking and scene building. ARCore is not a rendering engine itself; on Android its scenes are commonly rendered with OpenGL. There is also Sceneform, an SDK specific to Android that saves you from having to learn OpenGL.
With advances in machine learning, there are many ways to integrate it into AR systems. Thanks to pre-trained models and pre-built SDKs, it takes relatively little effort to get things up and running. Models can run directly on the device or through cloud services.