Behind Google’s AIY Kits


Image courtesy: Target

With almost every company in today’s market focusing strongly on artificial intelligence (AI) and machine learning (ML), interest in building products on these technologies has grown sharply. Innovative ML products are the talk of the town. Google’s Home and Amazon’s Echo are among the popular standalone devices whose underlying technology relies on advances in ML and AI.

Now, the search engine giant Google has come up with another range of products that allow users to build their own AI and ML projects using features such as voice, face and image recognition. This division at Google is dubbed AIY, a tongue-in-cheek play on “do-it-yourself [with AI]”. The products under AIY are called ‘kits’. As of now, the kits are available in two variants: the Voice Kit and the Vision Kit. The former allows users to build voice recognition projects, while the latter helps them with image recognition and ML. This article explores what goes into these AIY kits.



The Configuration Build

Launched in 2017, the AIY kits are aimed at beginners who wish to learn and experiment with AI. A kit generally comprises a cardboard shell, an add-on circuit board called a bonnet (in the Vision Kit, the Vision Bonnet handles the AI functions), a speaker (with the Voice Kit), a lens (with the Vision Kit), a tripod stand and other components required for setup. These parts are assembled to form the functioning device, which can be connected to other peripherals. The kits are designed to work with Raspberry Pi computers, as these small single-board computers offer a practical and flexible environment for learning programming. They also meet the computing requirements of small-scale practice projects without burning a hole in one’s pocket.

AIY Voice Kit For Your Voice Recognition Project Needs

Google introduced the Voice Kit first as part of its AIY projects. Its commercial success paved the way for further development of the product and for improvements over time. The Voice Kit is primarily powered by Google Assistant for voice detection, along with word and language detection. Moreover, TensorFlow modules can be integrated with the kit’s functionality for better natural language processing.

The kit typically consists of cardboard frames, a Voice Bonnet (the kit’s add-on circuit board), a speaker unit, a push button and other connecting parts such as USB cables and a button harness. The latest version of the Voice Kit is bundled with a Raspberry Pi Zero WH computer and a microSD card, which spares users the hassle of buying a separate computer for their project.

In the box: Voice Kit (v2.0) components

The components are to be assembled by users themselves, and detailed step-by-step instructions are given on the official page. Once the kit is assembled, it is ready to connect to other devices. Google provides two options for connecting to the kit:

  1. Mobile platforms (Android and iOS), using the official AIY Projects app
  2. Regular desktop environment (requires mouse, keyboard, monitor and other I/O components)

After this, Google Assistant is configured for the kit over WiFi, and the setup is ready for a new voice-based project.
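As a rough illustration of what a voice-based project might do once speech has been turned into text, the sketch below maps recognised phrases to actions. It is plain Python with no hardware, Google Assistant or AIY library calls. The names `COMMANDS` and `handle_command`, and the phrases themselves, are hypothetical and not part of Google's software.

```python
# Hypothetical sketch: dispatching recognised speech to actions.
# No AIY or Google Assistant calls are made here; `text` stands in for
# whatever transcript the kit's speech recognition would return.

def lights_on():
    return "Turning the lights on"

def lights_off():
    return "Turning the lights off"

def goodbye():
    return "Goodbye"

# Phrase -> handler table (the phrases are illustrative assumptions).
COMMANDS = {
    "turn on the light": lights_on,
    "turn off the light": lights_off,
    "goodbye": goodbye,
}

def handle_command(text):
    """Return the response for the first known phrase found in `text`."""
    normalized = text.lower().strip()
    for phrase, action in COMMANDS.items():
        if phrase in normalized:
            return action()
    return "Sorry, I did not understand that"

if __name__ == "__main__":
    for transcript in ["Please turn on the light", "Goodbye", "What time is it"]:
        print(transcript, "->", handle_command(transcript))
```

In a real project, the handlers would drive the speaker, the button LED or other peripherals instead of returning strings.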

AIY Vision Kit For Image Recognition And ML

The AIY Vision Kit was developed as a result of the Voice Kit’s success. Google came up with this project to foster ML development and awareness among users. The Vision Kit helps customers build ML projects with the power of computer vision. Its configuration is similar to that of the Voice Kit, except that it comes with image-capturing components such as a camera, along with LEDs for signalling.

In the box: Vision Kit components (v1.1)

Billy Rutledge, director of AIY Projects at Google, describes the ML workings behind the Vision Kit in a blog post. He says, “The provided software includes three TensorFlow-based neural network models for different vision applications. One based on MobileNets can recognise a thousand common objects, a second can recognise faces and their expressions and the third is a person, cat and dog detector. We’ve also included a tool to compile models for Vision Kit, so you can train and retrain models with TensorFlow on your workstation or any cloud service. We also provide a Python API that gives you the ability to change the RGB button colors, adjust the piezo element sounds and access the four GPIO pins.”
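To give a feel for how a project might combine a model's output with the RGB button Rutledge mentions, here is a minimal sketch in plain Python. It assumes a face model that reports a joy score between 0 and 1 for each detected face. The function names, the two colours and the scoring scheme are illustrative assumptions, not the kit's actual API.

```python
# Hypothetical sketch: turning face "joy" scores into an RGB button color.
# The scores would come from a face-detection model; here they are plain floats.

SAD_COLOR = (0, 0, 64)      # dim blue (assumed)
HAPPY_COLOR = (255, 70, 0)  # orange (assumed)

def blend(color_a, color_b, alpha):
    """Linear mix of two RGB tuples; alpha=1.0 gives color_a, 0.0 gives color_b."""
    alpha = max(0.0, min(1.0, alpha))  # clamp out-of-range scores
    return tuple(round(alpha * a + (1.0 - alpha) * b)
                 for a, b in zip(color_a, color_b))

def average_joy(joy_scores):
    """Mean joy score across detected faces, or None when no face is visible."""
    if not joy_scores:
        return None
    return sum(joy_scores) / len(joy_scores)

def button_color(joy_scores):
    """Pick the LED color for one frame; (0, 0, 0) means the LED is off."""
    joy = average_joy(joy_scores)
    if joy is None:
        return (0, 0, 0)
    return blend(HAPPY_COLOR, SAD_COLOR, joy)
```

On the kit itself, the resulting tuple would be passed to whatever call sets the button LED. That step is left out here because it depends on the installed AIY software.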

The setup process is also identical to that of the Voice Kit. Once it is set up, users can work on creative image-recognition projects with the Vision Kit.


AI and ML are progressing at a rapid pace. People interested in this field have a lot to learn before they can implement these technologies in their own projects. Tools like the AIY kits provide a deeper, practical understanding of AI and ML concepts and give professionals an insight into these fields. They also improve analytical skills and help users assess real-time problems with machine learning and artificial intelligence.

Abhishek Sharma
I research and cover latest happenings in data science. My fervent interests are in latest technology and humor/comedy (an odd combination!). When I'm not busy reading on these subjects, you'll find me watching movies or playing badminton.
