Toshiba Claims To Have Created World’s Most Accurate Visual Question Answering AI

The AI overcomes the difficulty of answering questions on the positioning and appearance of people and objects and possesses the ability to learn the information required to handle a wide range of questions and answers.

Toshiba Corporation claims to have developed the world’s most accurate and highly versatile Visual Question Answering (VQA) AI that can recognise not only people and objects but also colours, shapes, appearances and background details in images. 

Toshiba presented the technology on 14 September at ICANN 2021, the International Conference on Artificial Neural Networks.

When tested on a public dataset comprising a large volume of images and text data, the VQA AI correctly answered 66.25% of questions without any pre-learning and 74.57% with pre-learning. For example, the AI can determine whether a worker is standing in a designated place by answering questions such as, “Is the person on a black mat?”, which requires recognising the individual, their position, and the shape and colour of the mat.

Applying this to safety monitoring systems at production sites can help improve safety and reduce workloads on onsite supervisors. It can also be used to identify specific scenes in broadcast content and surveillance video footage.

The global AI market, including software, hardware and services, is forecast to grow 16.4% year over year in 2021 to $327.5 billion, and is expected to reach $554.3 billion by 2024. Toshiba says its new AI meets the market's need for flexibility: it offers the world’s highest accuracy in answering questions, and questions can be changed or added quickly. Because it recognises image backgrounds as well as people and objects, and draws on an extensive database, it can quickly process image features and pre-learned questions to derive the correct answer.

After learning from a large set of images, questions and answers covering the presence of people and objects, as well as information such as their location and status, the AI can answer a question by choosing from approximately 3,000 answer patterns.
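Choosing from a fixed set of answer patterns can be pictured as a classification step: the model scores every candidate answer and returns the highest-scoring one. The sketch below is illustrative only and is not Toshiba's implementation. The six-entry answer list, the scores and the function names are all made up for the example, and the exact-match accuracy shown is a simplification of how public VQA benchmarks may score answers.

```python
import math

# Hypothetical, tiny answer vocabulary; the article mentions roughly 3,000 patterns.
ANSWERS = ["yes", "no", "black", "red", "1", "2"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_answer(scores, answers=ANSWERS):
    """Return the highest-probability answer and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return answers[best], probs[best]


def accuracy(predictions, references):
    """Fraction of predictions that exactly match the reference answers."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Made-up scores standing in for a model's output on
# "Is the person on a black mat?"
answer, confidence = pick_answer([2.0, 0.5, 0.1, -1.0, -2.0, -2.0])
print(answer, round(confidence, 2))
```

A reported figure such as 74.57% corresponds to this kind of ratio: the share of test questions for which the selected answer is judged correct.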

Victor Dey
Victor is an aspiring data scientist and holds a Master of Science in Data Science & Big Data Analytics. He is a researcher, a data science influencer and a former university football player. A keen learner of new developments in data science and artificial intelligence, he is committed to growing the data science community.
