Researchers at Facebook AI recently announced that they have built and deployed GrokNet, a universal computer vision system designed for shopping, and described their approach to building an accurate image-based product recognition system. GrokNet is a unified computer vision model that incorporates a diverse set of 83 loss functions, optimising jointly for accurate product recognition and various classification tasks over 7 commerce datasets.
The AI-powered shopping system is said to leverage state-of-the-art image recognition models to improve the way people buy, sell, and discover items. In a blog post, the researchers said, “Our long-term vision is to build and develop an all-in-one AI real-time lifestyle assistant that can search and rank billions of products accurately while personalising to individuals’ tastes. That same system would make online shopping just as social as shopping with friends in real life.”
They added, “Going one step further, it would advance visual search to make your real-world environment shoppable. If you see something you like (clothing, furniture, electronics, etc.), you could snap a photo of it and the system would find that exact item, as well as several similar ones to purchase the product immediately.”
Why This New System?
Shopping is one of the more challenging tasks for AI systems because personal taste is subjective. To build a truly intelligent assistant, the researchers need to teach systems to understand each individual’s taste and style, including the context that matters when searching for a product to fit a specific need or situation.
Facebook Marketplace gives people across the globe a platform for buying and selling products among Facebook members and businesses. Currently, the social media giant uses image classifiers to augment product descriptions with predicted attributes, categories, and search queries, which makes products easier to find and browse. These image classifiers were trained on large-scale, weakly supervised data drawn from search log interactions.
According to the researchers, the new system brings Facebook AI a step closer to its vision of a robust AI lifestyle assistant. GrokNet is a deployed image recognition system for commerce applications that uses a multi-task learning approach to train a single computer vision trunk. The system is said to achieve a 2.1x improvement in exact product match accuracy over Facebook's previous state-of-the-art product recognition system.
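To make the idea of exact product match accuracy concrete, here is a minimal sketch in plain Python. It assumes the metric is computed by embedding a query photo, retrieving the nearest catalogue item by cosine similarity, and checking whether it is the same product. The article does not define the metric, so this definition, the function names, and the toy two-dimensional embeddings are all illustrative assumptions rather than Facebook's evaluation code.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_product(query: Sequence[float],
                    catalog: Dict[str, Sequence[float]]) -> str:
    """Return the id of the catalogue item most similar to the query embedding."""
    return max(catalog, key=lambda pid: cosine(query, catalog[pid]))


def exact_match_accuracy(queries: List[Tuple[Sequence[float], str]],
                         catalog: Dict[str, Sequence[float]]) -> float:
    """Fraction of queries whose top-1 retrieved item is the labelled product."""
    if not queries:
        raise ValueError("need at least one query")
    hits = sum(1 for emb, pid in queries if nearest_product(emb, catalog) == pid)
    return hits / len(queries)


# Hypothetical toy data: two products and three labelled query photos.
catalog = {"chair": [1.0, 0.0], "lamp": [0.0, 1.0]}
queries = [([0.9, 0.1], "chair"), ([0.2, 0.8], "lamp"), ([0.7, 0.3], "lamp")]
accuracy = exact_match_accuracy(queries, catalog)  # 2 of 3 queries match
```

A production system searching billions of products would replace the linear scan with an approximate nearest-neighbour index; the scan is kept here only for readability.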
At its core, GrokNet has an underlying Convolutional Neural Network (CNN) that forms the “trunk” of the model. The system is built as a distributed PyTorch workflow on the FBLearner framework, and the trunk uses the ResNeXt-101 32×4d architecture. After the weights are initialised, the system is fine-tuned on the datasets using Distributed Data-Parallel GPU training on 12 hosts with 8 GPUs each (96 GPUs in total).
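The hardware layout described above can be sketched in a few lines of plain Python. The host and GPU counts come from the article; the `Worker` type, the rank numbering, and the strided sharding scheme are assumptions chosen to illustrate how data-parallel jobs are commonly organised, not a description of the FBLearner workflow itself.

```python
from dataclasses import dataclass
from typing import List

GPUS_PER_HOST = 8  # from the article
NUM_HOSTS = 12     # from the article


@dataclass(frozen=True)
class Worker:
    host: int         # index of the machine, 0..NUM_HOSTS-1
    local_rank: int   # index of the GPU on that machine
    global_rank: int  # unique index across the whole job


def layout(num_hosts: int = NUM_HOSTS,
           gpus_per_host: int = GPUS_PER_HOST) -> List[Worker]:
    """Enumerate one worker per GPU, numbering them host by host."""
    return [Worker(h, g, h * gpus_per_host + g)
            for h in range(num_hosts)
            for g in range(gpus_per_host)]


def shard(indices: List[int], rank: int, world_size: int) -> List[int]:
    """Give each worker a disjoint, strided slice of the sample indices."""
    return indices[rank::world_size]


workers = layout()
world_size = len(workers)
print(world_size)  # 96
```

In data-parallel training each worker holds a full copy of the model, processes its own shard of every batch, and the gradients are averaged across all workers before the weights are updated.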
The model is trained on human annotations, user-generated tags, and noisy search engine interaction data. According to the researchers, the final trained system analyses images to predict object category, home attributes, fashion attributes, vehicle attributes, and search queries, and to produce an image embedding.
For this model, the researchers combined 7 datasets with wide-ranging label semantics and image statistics. The combined dataset contains 89 million public images from Facebook Marketplace, including user photos of Marketplace products for sale. The model is trained on both human-provided annotations and weak signals.
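One way to picture a single trunk feeding several prediction tasks is a weighted sum of per-task losses. The sketch below is a simplified illustration in plain Python: the article says GrokNet optimises 83 loss functions jointly but does not say how they are combined, so the weighted sum, the use of softmax cross-entropy for every task, the rule of skipping tasks an image has no label for, and all task names and weights are assumptions.

```python
import math
from typing import Dict, List


def softmax_cross_entropy(logits: List[float], target: int) -> float:
    """Negative log-probability of the target class under a softmax."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def multi_task_loss(head_logits: Dict[str, List[float]],
                    targets: Dict[str, int],
                    weights: Dict[str, float]) -> float:
    """Weighted sum of per-task losses for one image.

    Tasks with no label for this image are skipped, which lets datasets
    with different label sets share one model (an assumption here).
    """
    total = 0.0
    for task, logits in head_logits.items():
        if task not in targets:
            continue
        total += weights.get(task, 1.0) * softmax_cross_entropy(logits, targets[task])
    return total


# Hypothetical outputs of three heads on top of a shared trunk.
head_logits = {
    "object_category": [2.0, 0.5, -1.0],
    "fashion_attribute": [0.1, 0.3],
    "vehicle_attribute": [0.0, 0.0, 0.0],
}
targets = {"object_category": 0, "fashion_attribute": 1}  # no vehicle label
weights = {"object_category": 1.0, "fashion_attribute": 0.5}
loss = multi_task_loss(head_logits, targets, weights)
```

Because every head's loss flows back into the same trunk, the shared features are pushed to be useful for all tasks at once, which is the core idea of multi-task learning.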
According to the researchers, one of the goals of this project was to solve a large number of computer vision tasks with a single model by training on both existing datasets and many new ones.
The researchers have already used this system to launch automatically populated listing details in Marketplace seller listings. According to them, GrokNet will play an important role in making virtually any photo shoppable across apps on Facebook and could be used to help customers find what they are looking for, receive personalised suggestions, and much more.