Unity Launches Synthetic Image Datasets To Train AI Models Faster

Unity came up with synthetic image datasets to simplify data collection and protect the confidentiality of data.

Published on April 21, 2021
by Amit Raja Naik

San Francisco-based video game software company Unity Technologies recently announced the launch of synthetic image datasets intended to help developers build computer vision applications, that is, train artificial intelligence (AI) models, faster and at significantly lower cost.

Unity's cross-platform game engine is widely used by game developers worldwide to create interactive games and virtual and augmented reality applications. With this announcement, the company looks to use its datasets to help build AI models across industry verticals, including manufacturing, retail and security.

Tackling the privacy problem

Unity believes most real-world data collection techniques are labour-intensive and expensive, and carry greater privacy risks. The firm came up with synthetic image datasets to simplify data collection and protect the confidentiality of data.

Because real data can contain sensitive information, most programmers, software developers and researchers may not want it disclosed. Synthetic data, however, holds no private information and cannot be traced back to a source.

Synthetic data therefore addresses confidentiality concerns to a large extent and avoids the privacy issues that arise from using images of real people and places.

For example, in autonomous or self-driving cars, the collection of real data is unreasonably expensive. Waymo, the self-driving vertical of Alphabet, has spent close to $3.5 billion in testing Chrysler Pacificas in Silicon Valley and Phoenix. Over the past few years, around 30 self-driving car companies have spent close to $16 billion on developing fully self-driving cars.

At present, generating synthetic data is itself expensive and time-consuming. An AI model must first create it, and many companies lack the resources to do so. To that end, Unity has come up with the Unity Perception SDK and a library of labelling and randomisation tools for developers.

Eliminates data bias

Unity said the usage of synthetic data eliminates the problem of ‘biased data,’ which often results in skewed outcomes, lower accuracy levels and analytics errors.

“Data captured from the real world is often biased towards what is easy to collect, is subject to human labelling errors, and needs to be refreshed often, which can be very expensive,” said Danny Lange, SVP of AI at Unity, comparing it with synthetic datasets.

Lange said the best AI results are achieved by combining a large amount of high-quality synthetic data with a small amount of real data, when possible. The synthetic version of a dataset complies with privacy rules and accurately reflects real-world data, he added.

Further, he said these datasets enable companies to simulate scenarios that might occur in the real world in the near future, based on a sizable increase in user data. “As a result, we see smarter indoor environments, such as cashier-less grocery stores, and more as our customers discover new applications,” said Lange.

How does it work

The new Unity computer vision datasets are based on synthetic data, which is generated using algorithms. Such data is used in computer vision applications, particularly object detection. In Unity's case, the artificial environment is most likely to be used to create 3D models of objects and to train models that learn to navigate environments from visual information.

The Unity engine creates ‘digital twins’ of objects (3D models) using photogrammetry techniques. Digital twins are virtual replicas of physical things, mainly used as a testbed to run simulations before the actual deployment of a solution. The digital twins are then placed in various 3D environments, where randomisers vary the lighting conditions, textures, camera positions, scale factors and other parameters.
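The randomisation step described above can be illustrated with a minimal Python sketch. It is not Unity's implementation and does not use the Unity Perception API. The names SceneParams, sample_scene and generate_scenes, the texture list and every numeric range are assumptions made up for illustration. It only shows the general idea of drawing a fresh set of scene parameters for each synthetic image.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SceneParams:
    """Hypothetical parameters for one rendered image of a digital twin."""
    light_intensity: float
    texture: str
    camera_position: Tuple[float, float, float]
    scale: float


# Illustrative texture names, not taken from Unity.
TEXTURES = ("wood", "metal", "concrete", "fabric")


def sample_scene(rng: random.Random) -> SceneParams:
    """Draw one random scene configuration (all ranges are assumptions)."""
    return SceneParams(
        light_intensity=rng.uniform(0.2, 1.5),
        texture=rng.choice(TEXTURES),
        camera_position=(
            rng.uniform(-2.0, 2.0),
            rng.uniform(0.5, 3.0),
            rng.uniform(-2.0, 2.0),
        ),
        scale=rng.uniform(0.8, 1.2),
    )


def generate_scenes(count: int, seed: int = 0) -> List[SceneParams]:
    """Return `count` scene configurations, reproducible for a given seed."""
    rng = random.Random(seed)
    return [sample_scene(rng) for _ in range(count)]


if __name__ == "__main__":
    for params in generate_scenes(3):
        print(params)
```

In a real pipeline, each sampled configuration would be handed to a renderer, which would produce an image together with its labels. Seeding the generator keeps a dataset reproducible.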

Source: Unity [Showcasing the visual examples of image labelling]

Unity recommends the environments best suited to the customer's computer vision problem and looks for the most suitable dataset. Currently, the Unity team provides the handholding its customers need. Soon, the company plans to offer a simple self-service interface that lets customers generate additional features at their convenience.

Unity offers a tiered pricing model. The price per image falls as the number of synthetic images ordered rises.

Lange believes synthetic computer vision datasets can support a wide range of AI training use cases, ranging from object detection to improving the performance of AI models.


Amit Raja Naik

Amit Raja Naik is a seasoned technology journalist who covers everything from data science to machine learning and artificial intelligence for Analytics India Magazine, where he examines the trends, challenges, ideas, and transformations across the industry.
