Toshiba Claims To Have Created World’s Most Accurate Visual Question Answering AI

The AI overcomes the difficulty of answering questions on the positioning and appearance of people and objects and possesses the ability to learn the information required to handle a wide range of questions and answers.

Toshiba Corporation claims to have developed the world’s most accurate and highly versatile Visual Question Answering (VQA) AI that can recognise not only people and objects but also colours, shapes, appearances and background details in images. 

Toshiba presented the technology at ICANN2021, the international conference for neural networks, on 14 September.

Image Credits: Toshiba

When tested on a public dataset comprising a large volume of images and text data, the VQA AI correctly answered 66.25% of questions without any pre-learning and 74.57% with pre-learning. For example, the AI can find a worker standing in a designated place when asked a question like, “Is the person on a black mat?”, which requires recognition of the individual, position, shape and colour.

Applying this to safety monitoring systems at production sites can help improve safety and reduce workloads on onsite supervisors. It can also be used to identify specific scenes in broadcast content and surveillance video footage.

The global AI market, including software, hardware, and services, is forecast to grow 16.4% year over year in 2021 to $327.5 billion and is expected to reach $554.3 billion by 2024. Toshiba says its new AI meets the need for flexibility with the world’s highest accuracy in answering questions, and that questions can be changed or added quickly. Its ability to recognise not only people and objects but also image backgrounds, together with the extensive database at its disposal, allows it to process the features of images and pre-learned questions quickly to derive the correct answer.

After learning from a large set of images, questions and answers that cover the presence of people and objects, and information such as their location and status, the AI can provide an appropriate answer to a question from approximately 3,000 answer patterns.
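Choosing one answer from a fixed set of answer patterns, and scoring the result as a percentage of correct answers, can be illustrated with a minimal sketch. This is not Toshiba's implementation: the six-entry `ANSWERS` list, the `pick_answer` and `accuracy` helpers, and the scores passed to them are all hypothetical stand-ins, and a real system would produce the scores from a trained model over roughly 3,000 candidates.

```python
from typing import List, Sequence

# Hypothetical answer vocabulary for illustration only; the article
# mentions approximately 3,000 answer patterns.
ANSWERS: List[str] = ["yes", "no", "black", "red", "1", "2"]


def pick_answer(scores: Sequence[float], answers: Sequence[str] = ANSWERS) -> str:
    """Return the candidate answer with the highest score (first one on ties)."""
    if len(scores) != len(answers):
        raise ValueError("need exactly one score per candidate answer")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return answers[best]


def accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Percentage of predictions that exactly match the expected answers."""
    if not expected or len(predicted) != len(expected):
        raise ValueError("inputs must be non-empty and the same length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)
```

Under these assumptions, a question such as “Is the person on a black mat?” would be answered by passing the model's per-answer scores to `pick_answer`, and figures like the reported 66.25% would correspond to `accuracy` computed over the whole test set.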

Victor Dey
Victor is an aspiring Data Scientist & is a Master of Science in Data Science & Big Data Analytics. He is a Researcher, a Data Science Influencer and also an Ex-University Football Player. A keen learner of new developments in Data Science and Artificial Intelligence, he is committed to growing the Data Science community.
