Google has released Model Search, an open-source platform for automatically developing efficient, high-performing machine learning models. Built on TensorFlow, Model Search finds the best-fit architecture for a given dataset and problem while minimising coding time and compute resources.
Why Model Search
A neural network is only as good as its ability to generalise across different tasks. While popular generalisation techniques have cropped up in recent years, the research community's understanding of why they work is still limited. The problem compounds further in machine learning domains that require more in-depth expert knowledge.
The introduction of several AutoML algorithms and techniques such as neural architecture search (NAS) has helped researchers find the ‘right’ neural network automatically, without the need for manual intervention.
However, these techniques require training thousands of models before converging and are compute-heavy. They also fare poorly in domains they have not been trained on before. This is where domain-agnostic, flexible platforms like Model Search come in.
The Model Search system from Google runs both training and evaluation experiments for various ML models with different architectures and training techniques. The system contains:
- Multiple trainers.
- Algorithms for search and transfer learning.
- A database to store the evaluated models.
The Model Search system is highly collaborative: while each trainer conducts experiments independently, all trainers share the observations and learnings from their individual experiments. At the beginning of every cycle, the search algorithm checks all the completed trials and decides what to try next using an approximate search strategy called beam search. The search algorithm then mutates the best architectures found and assigns the resulting models back to the trainers.
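The cycle described above can be sketched in plain Python. This is an illustrative toy, not the library's actual API; the block names, `propose_next`, and the trial dictionaries are all hypothetical:

```python
import random

# Toy sketch of Model Search's outer loop: trainers report results to a shared
# database; each cycle, beam search keeps the best trials and mutates one.

BLOCKS = ["lstm", "transformer", "resnet", "dnn"]  # predefined micro-architectures
BEAM_WIDTH = 3                                     # candidates kept by beam search

def mutate(architecture):
    """Randomly swap one block, imitating the mutation step."""
    arch = list(architecture)
    arch[random.randrange(len(arch))] = random.choice(BLOCKS)
    return arch

def propose_next(completed_trials):
    """Inspect all completed trials (the shared database), keep the best few
    (beam search), and mutate one of them to get the next candidate."""
    beam = sorted(completed_trials, key=lambda t: t["accuracy"], reverse=True)[:BEAM_WIDTH]
    parent = random.choice(beam)
    return mutate(parent["architecture"])

# Example: three trials already finished by independent trainers.
database = [
    {"architecture": ["dnn", "dnn"], "accuracy": 0.71},
    {"architecture": ["resnet", "dnn"], "accuracy": 0.83},
    {"architecture": ["lstm", "lstm"], "accuracy": 0.65},
]
candidate = propose_next(database)
print(candidate)  # a mutated copy of one of the better architectures
```

The key design point the sketch captures is that coordination happens only through the shared trial database, so trainers never need to communicate directly.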
How Does It Work?
The Model Search system builds a neural network from a set of predefined blocks representing micro-architectures such as LSTM, Transformer layers or ResNet. By reusing pre-existing architectural components, Model Search can leverage the ‘best knowledge’ from NAS research across domains, improving efficiency and shrinking the search space.
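Conceptually, assembling a candidate from predefined blocks is just function composition. The following is a minimal sketch with made-up placeholder blocks; the real library implements blocks such as LSTM, Transformer and ResNet layers in TensorFlow:

```python
# Placeholder "blocks" standing in for real micro-architectures (hypothetical).

def dense_block(x):
    # stand-in for a dense layer: scale the inputs
    return [2.0 * v for v in x]

def residual_block(x):
    # ResNet-style skip connection: transform(x) + x
    return [(0.5 * v) + v for v in x]

BLOCK_REGISTRY = {"dense": dense_block, "residual": residual_block}

def build_model(block_names):
    """Compose a model by chaining blocks from the registry, mirroring how
    Model Search assembles candidates from its predefined block set."""
    blocks = [BLOCK_REGISTRY[name] for name in block_names]
    def model(x):
        for block in blocks:
            x = block(x)
        return x
    return model

model = build_model(["dense", "residual"])
print(model([1.0, 2.0]))  # [3.0, 6.0]
```

Because each candidate is just a sequence of block names, the search only has to explore combinations of known-good components rather than arbitrary graphs.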
Credit: Google AI Blog
The Model Search framework is built on TensorFlow, so a block can implement any function that takes a tensor as input. The search algorithms implemented in Model Search are adaptive, greedy and incremental, which makes them converge faster than reinforcement learning (RL) algorithms.
The search algorithm in the Model Search system imitates the ‘explore and exploit’ principle of RL algorithms: it separates the search for a good candidate (explore) from boosting accuracy by ensembling the good candidates so discovered (exploit). The main search algorithm applies random changes to the architecture or the training techniques of the best-performing experiments.
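The exploit step, ensembling the good candidates, can be illustrated with a toy averaging of class probabilities. The values and function names below are hypothetical, not the library's code:

```python
# Toy exploit step: ensemble the best candidates found during exploration
# by averaging their per-class probabilities.

def ensemble(prediction_lists):
    """Average per-class probabilities across candidate models."""
    n = len(prediction_lists)
    return [sum(p) / n for p in zip(*prediction_lists)]

# Three good candidates' class probabilities for a single example:
candidates = [
    [0.7, 0.3],
    [0.6, 0.4],
    [0.8, 0.2],
]
print(ensemble(candidates))  # approximately [0.7, 0.3]
```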
Credit: Google AI Blog
Model Search uses one of two methods, knowledge distillation or weight sharing, to further improve the efficiency and accuracy of transfer learning between internal experiments. Knowledge distillation transfers knowledge from a larger model to a smaller one; it improves a candidate's accuracy by adding a loss term that matches the candidate's predictions to those of the high-performing models.
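A toy version of that loss term might look as follows. The weighting scheme and the `alpha` value are assumptions for illustration, not the library's actual formulation:

```python
import math

# Toy distillation loss: the candidate's usual label loss gains a term that
# pulls its predictions toward those of a high-performing "teacher" model.

def cross_entropy(p, q):
    """Cross-entropy between target distribution p and predicted distribution q."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

def distillation_loss(student_probs, teacher_probs, labels, alpha=0.5):
    hard = cross_entropy(labels, student_probs)          # usual label loss
    soft = cross_entropy(teacher_probs, student_probs)   # match the teacher
    return (1 - alpha) * hard + alpha * soft

labels = [1.0, 0.0]    # one-hot ground truth
teacher = [0.9, 0.1]   # high-performing model's prediction
student = [0.6, 0.4]   # candidate's prediction
loss = distillation_loss(student, teacher, labels)
print(loss)
```

As the candidate's predictions drift from the teacher's, the `soft` term grows, nudging training toward behaviour that already works well.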
Weight sharing, on the other hand, allows robust feature detection while reducing the number of parameters. In the Model Search system, weight sharing bootstraps some of a candidate's parameters from previously trained candidates: suitable weights are copied from previously trained models and the rest are randomly initialised. This makes training faster and allows for better discovery of suitable architectures.
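A minimal sketch of that bootstrapping step, with hypothetical layer names and a plain dict standing in for a model's parameters:

```python
import random

# Toy weight sharing: a new candidate copies the parameters of layers it
# shares with a previously trained model and randomly initialises the rest.

def bootstrap_weights(new_layers, trained_model):
    """trained_model maps layer name -> weights (here, a single float)."""
    weights = {}
    for layer in new_layers:
        if layer in trained_model:
            weights[layer] = trained_model[layer]     # copy suitable weights
        else:
            weights[layer] = random.gauss(0.0, 0.1)   # random init for the rest
    return weights

parent = {"conv1": 0.42, "conv2": -0.17}
child = bootstrap_weights(["conv1", "conv2", "head"], parent)
print(child["conv1"])  # 0.42, inherited from the parent
```

Only the new `head` layer starts from scratch, which is why a bootstrapped candidate trains faster than one initialised entirely at random.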
Google said the Model Search system improved on production models within very few iterations. It was found particularly useful for keyword spotting and language identification. During testing, the system was also able to find a suitable architecture for image classification on the open-source CIFAR-10 dataset.
“By building upon previous knowledge for a given domain, we believe that this framework is powerful enough to build models with the state-of-the-art performance on well-studied problems when provided with a search space composed of standard building blocks,” Hanna Mazzawi and Xavi Gonzalvo from Google Research said in a blog post.