Unity Launches Synthetic Image Datasets To Train AI Models Faster

Unity came up with synthetic image datasets to simplify data collection and protect the confidentiality of data.

San Francisco-based video game software company Unity Technologies recently announced the launch of synthetic image datasets to help developers train the artificial intelligence (AI) models behind computer vision applications faster and at significantly lower cost.

Unity, a cross-platform game engine, is widely used by game developers worldwide to create interactive games and virtual reality and augmented reality applications. With this announcement, the company looks to use its datasets to help build AI models across industry verticals, including manufacturing, retail and security.

Tackling the privacy problem 

Unity believes most real-life data collection techniques are labour-intensive and expensive, and carry greater privacy risks. The firm came up with synthetic image datasets to simplify data collection and protect the confidentiality of data.

Because real data contains sensitive information, most programmers, software developers and researchers may not want it to be disclosed. Synthetic data, however, holds no private information and cannot be traced back to the source.

Most importantly, synthetic data addresses confidentiality and privacy concerns to a large extent and eliminates the privacy issues arising from using images of real people and places.

For example, in autonomous or self-driving cars, the collection of real data is unreasonably expensive. Waymo, the self-driving vertical of Alphabet, has spent close to $3.5 billion in testing Chrysler Pacificas in Silicon Valley and Phoenix. Over the past few years, around 30 self-driving car companies have spent close to $16 billion on developing fully self-driving cars.

At present, generating synthetic data is itself expensive and time-consuming. An AI model must first create it, and many companies lack the resources to do so. To that end, Unity has come up with the Unity Perception SDK and a library of labelling and randomisation tools for developers.

Eliminates data bias     

Unity said the usage of synthetic data eliminates the problem of ‘biased data,’ which often results in skewed outcomes, lower accuracy levels and analytics errors. 

“Data captured from the real world is often biased towards what is easy to collect, is subject to human labelling errors, and needs to be refreshed often, which can be very expensive,” said Danny Lange, SVP of AI at Unity, comparing it with synthetic datasets. 

Lange said the best AI results are achieved when a large amount of high-quality synthetic data is combined with a small amount of real data, where possible. The synthetic datasets comply with privacy rules and accurately reflect real-world data, he added.

Further, he said these datasets enable companies to simulate scenarios that might soon occur in the real world, based on a sizable increase in user data. “As a result, we see smarter indoor environments, such as cashier-less grocery stores, and more as our customers discover new applications,” said Lange.
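The mix Lange describes, mostly synthetic images plus a small share of real ones, could be assembled along the following lines. This is only an illustrative sketch: the function name `mix_datasets`, the record shape and the 10 percent real share are assumptions, since the article gives no ratio and does not describe Unity's tooling for this step.

```python
import random
from typing import List, Tuple

# (image path, source tag); a hypothetical record shape for illustration.
Sample = Tuple[str, str]


def mix_datasets(synthetic: List[str], real: List[str],
                 real_fraction: float = 0.1, seed: int = 0) -> List[Sample]:
    """Combine all synthetic images with enough real ones to make up
    real_fraction of the final set (0.1 is an assumed default).

    If fewer real images are available than requested, all of them are used.
    """
    if not 0.0 <= real_fraction < 1.0:
        raise ValueError("real_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_real / (n_synthetic + n_real) = real_fraction for n_real.
    wanted = round(len(synthetic) * real_fraction / (1.0 - real_fraction))
    n_real = min(wanted, len(real))
    chosen_real = rng.sample(real, n_real)
    mixed = [(path, "synthetic") for path in synthetic]
    mixed += [(path, "real") for path in chosen_real]
    rng.shuffle(mixed)  # interleave the two sources before training
    return mixed


if __name__ == "__main__":
    synthetic = [f"synth_{i}.png" for i in range(90)]
    real = [f"real_{i}.png" for i in range(50)]
    mixed = mix_datasets(synthetic, real)
    n_real = sum(1 for _, tag in mixed if tag == "real")
    print(len(mixed), "images,", n_real, "of them real")
```

Tagging each sample with its source makes it easy to check later whether a model behaves differently on real and synthetic images.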

How does it work?

The new Unity computer vision datasets are based on synthetic data, which is generated using algorithms. Such data is used for computer vision applications, particularly object detection. In Unity's case, the artificial environment is most likely to be used for creating 3D models of objects and for learning to navigate environments from visual information.

The Unity engine creates ‘digital twins’ of objects (3D models) using photogrammetry techniques. A digital twin is a virtual replica of a physical thing, mainly used as a testbed to run simulations before a solution is actually deployed. The digital twins are then placed in various 3D environments, where randomisers vary lighting conditions, textures, camera positions, scale factors, and other parameters.
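The randomisation step described above can be pictured as drawing a fresh set of scene parameters for every image to be rendered. The sketch below is plain Python and is not the Unity Perception API; the class `SceneParams`, the texture names and all value ranges are assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class SceneParams:
    """One randomised configuration for rendering a digital twin (hypothetical fields)."""
    light_intensity: float   # relative brightness of the scene light
    texture: str             # name of the background texture
    camera_yaw_deg: float    # camera rotation around the object, in degrees
    camera_distance: float   # camera distance from the object
    scale: float             # scale factor applied to the 3D model


# Assumed texture names, for illustration only.
TEXTURES = ["wood", "metal", "concrete", "fabric"]


def sample_scene(rng: random.Random) -> SceneParams:
    """Draw one set of scene parameters uniformly from assumed ranges."""
    return SceneParams(
        light_intensity=rng.uniform(0.2, 1.5),
        texture=rng.choice(TEXTURES),
        camera_yaw_deg=rng.uniform(0.0, 360.0),
        camera_distance=rng.uniform(1.0, 5.0),
        scale=rng.uniform(0.5, 2.0),
    )


def sample_dataset(n_images: int, seed: int = 0) -> List[SceneParams]:
    """Sample parameters for n_images renders; a fixed seed keeps runs reproducible."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(n_images)]


if __name__ == "__main__":
    for params in sample_dataset(3):
        print(params)
```

Because every image comes from a known configuration, labels such as object position can be recorded at render time rather than annotated by hand afterwards.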

Source: Unity [Showcasing the visual examples of image labelling] 

Unity recommends the environments best suited to the customer’s computer vision problem and looks for the most suitable dataset. Currently, the Unity team provides the hand-holding its customers require. Soon, the company plans to offer a simple self-service interface that lets customers generate additional features at their convenience.

Unity offers a tiered pricing model: the price per image falls as the number of synthetic images a customer needs rises.

Lange believes synthetic computer vision datasets can support a wide range of AI training use cases, from object detection to improving the performance of AI models.

Amit Raja Naik
Amit Raja Naik is a seasoned technology journalist who covers everything from data science to machine learning and artificial intelligence for Analytics India Magazine, where he examines the trends, challenges, ideas, and transformations across the industry.
