Now that hyperparameter optimization and feature selection can be done in a more automated and robust way, making it easier to build better-performing deep learning models, the next big question is “What’s next?”. Are we aware of the limitations of the “state-of-the-art” model that we use? Does it really fit our needs, or are we just adjusting it to match our needs? How much do we know about its architecture? One answer to all these questions is Neural Architecture Search (NAS).
NAS is an algorithmic approach to finding a neural network design that can match or outperform hand-designed models. It follows the principle “the better the design, the better the performance”, and it helps to reduce the time and cost involved in design experimentation.
The high-level process includes four steps, as illustrated in the picture below:
NAS has three dimensions: search space, search strategy and performance estimation.
The search space is the set of architecture patterns that the NAS approach can design, be it a sequential stack of ‘n’ layers or a complex multi-branch structure.
The search strategy depends on the search method used to explore that space, for example Bayesian optimization or reinforcement learning. It largely determines the time taken to arrive at a model.
Performance estimation is the measurement of the performance metrics that we expect from a NAS-produced architecture. In some approaches, the results are fed into the next iteration to produce a better model; in others, each candidate is built and evaluated from scratch every time.
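To make the three dimensions concrete, here is a minimal Python sketch that wires them together. Everything in it is an assumption made for illustration: the search space (a depth and one width per layer), the random-sampling strategy, and the scoring formula in `estimate_performance`, which is a made-up proxy and not real model training. A real NAS system would train each candidate, at least briefly, and use a smarter strategy than random sampling.

```python
import random

# Search space (hypothetical): a network is a list of layer widths.
SEARCH_SPACE = {
    "depth": [1, 2, 3, 4],
    "width": [16, 32, 64, 128],
}
INPUT_WIDTH = 8  # assumed input feature count, used only for the parameter estimate


def sample_architecture(rng):
    """Search strategy: plain random sampling from the search space."""
    depth = rng.choice(SEARCH_SPACE["depth"])
    return [rng.choice(SEARCH_SPACE["width"]) for _ in range(depth)]


def estimate_performance(arch):
    """Performance estimation: a made-up proxy score, NOT real training.

    It rewards capacity with diminishing returns and penalizes the number of
    weights, standing in for "validation accuracy after a short training run".
    """
    capacity = sum(width ** 0.5 for width in arch)
    params = sum(a * b for a, b in zip([INPUT_WIDTH] + arch, arch))
    return capacity - 0.001 * params


def random_search(num_trials, seed=0):
    """Sample candidates, score each one, and keep the best seen so far."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_trials):
        arch = sample_architecture(rng)
        score = estimate_performance(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score


if __name__ == "__main__":
    arch, score = random_search(num_trials=50)
    print("best architecture:", arch, "proxy score:", round(score, 3))
```

Swapping `sample_architecture` for a Bayesian or reinforcement-learning strategy, or `estimate_performance` for a real training run, changes one dimension while leaving the other two untouched.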
NAS has two major practical advantages. First, rather than experimenting on a large dataset, the search can be run on a small dataset and the resulting design then improved on, which usually works well with deep learning. Second, a limited search space helps to arrive at a better model quickly.
Advances in NAS have produced several enhanced approaches built on different search methods, including reinforcement learning, Bayesian optimization, gradient-based optimization, sequential model-based optimization and evolutionary algorithms. Each method has its own process, pros and cons. Two approaches that appear often in the research literature are Progressive NAS and Efficient NAS.
Progressive NAS (PNAS)
Progressive NAS (PNAS) uses Sequential Model-Based Optimization (SMBO): it starts with the simplest model and becomes more complex only when required. Rather than picking random customizations, it starts with simple, basic blocks/layers and builds on them only when really needed. This approach is sometimes seen as time-consuming, but it avoids ending up with an architecture that is more complex than the requirement calls for.
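The "start simple, grow only when it helps" idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not the PNAS algorithm itself: the operation names, the `OP_GAIN` and `OP_COST` tables, the `evaluate` formula and the `min_gain` stopping threshold are all hypothetical, and candidates are scored directly here rather than ranked by a learned predictor before training.

```python
# Hypothetical building blocks with made-up benefit and cost values.
OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]
OP_GAIN = {"conv3x3": 1.0, "conv5x5": 1.2, "maxpool": 0.4, "identity": 0.1}
OP_COST = {"conv3x3": 0.3, "conv5x5": 0.6, "maxpool": 0.1, "identity": 0.05}


def evaluate(arch):
    """Toy score: later blocks add less benefit, every block adds cost."""
    gain = sum(OP_GAIN[op] / (i + 1) for i, op in enumerate(arch))
    cost = sum(OP_COST[op] for op in arch)
    return gain - cost


def progressive_search(beam_width=2, max_blocks=5, min_gain=1e-3):
    """Grow architectures one block at a time, keeping only the best few."""
    beam = [(op,) for op in OPS]  # start with the simplest one-block models
    best_arch, best_score = None, float("-inf")
    for _ in range(max_blocks):
        ranked = sorted(beam, key=evaluate, reverse=True)[:beam_width]
        top_score = evaluate(ranked[0])
        if top_score - best_score <= min_gain:
            break  # the extra block did not help enough, so stop growing
        best_arch, best_score = ranked[0], top_score
        # Expand every surviving candidate by one more block.
        beam = [arch + (op,) for arch in ranked for op in OPS]
    return best_arch, best_score


if __name__ == "__main__":
    arch, score = progressive_search()
    print("chosen architecture:", arch, "score:", round(score, 3))
```

With these made-up numbers the search stops before reaching `max_blocks`, because a further block no longer improves the score by more than `min_gain`.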
The other one is Efficient NAS (ENAS), which contrasts with PNAS: it reuses what has already been learned, in a way similar to transfer learning. Candidate models share weights, so the weights updated while training one candidate are carried over and used in the next iteration instead of being learned from scratch. As a result, ENAS produces models faster than standard NAS and PNAS.
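The reuse of weights between iterations can be illustrated with a small Python sketch. It is only a toy: the `SharedWeights` class, the single float standing in for each weight tensor, the pretend update rule in `train_child` and the random sampling of candidates are all assumptions made for this example (ENAS itself uses a learned controller to pick candidates).

```python
import random

OPS = ["conv3x3", "conv5x5", "maxpool"]  # hypothetical operation choices
NUM_LAYERS = 3


class SharedWeights:
    """One weight per (layer, op) pair, shared by every sampled child model."""

    def __init__(self):
        self.store = {}
        self.created = 0

    def get(self, layer, op):
        key = (layer, op)
        if key not in self.store:
            self.store[key] = 0.0  # stand-in for a freshly initialized tensor
            self.created += 1
        return self.store[key]

    def update(self, layer, op, delta):
        self.store[(layer, op)] = self.get(layer, op) + delta


def train_child(arch, shared, lr=0.1):
    """Pretend training step: nudge each weight the child uses towards 1.0."""
    for layer, op in enumerate(arch):
        weight = shared.get(layer, op)
        shared.update(layer, op, lr * (1.0 - weight))


def sample_child(rng):
    """Pick one operation per layer at random (no learned controller here)."""
    return tuple(rng.choice(OPS) for _ in range(NUM_LAYERS))


def run(num_children=20, seed=0):
    rng = random.Random(seed)
    shared = SharedWeights()
    for _ in range(num_children):
        train_child(sample_child(rng), shared)
    return shared


if __name__ == "__main__":
    shared = run()
    print("children trained: 20, distinct weights created:", shared.created)
```

However many children are trained, at most `len(OPS) * NUM_LAYERS` weights are ever created, and each new child starts from values that earlier children already moved. That is the source of the speed-up in this sketch.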
In practice, NAS is used effectively in the Google Cloud offering “AutoML”. In brief, if you feed the service labeled data, it takes care of everything else, from model building, training and testing to deployment on the cloud platform. NAS plays a key role in identifying the right model for the given data, which reduces the dependency on having an ML expert in house and gives anyone with good data the freedom to enjoy the benefits of ML. The service is predominantly used for models involving computer vision and NLP and is expected to expand into other areas in the near future. So, we can conclude that NAS is a subfield of AutoML.
Every approach has limitations, and NAS is no exception. Even though we say it designs customized neural architectures by itself, it still depends on predefined, human-made inputs to improve and build more robust, high-performing models. After several iterations we may end up with an architecture that meets our expectations, yet still be unsure why it performs better than other models. Most research in the NAS space is focused on imaging-related use cases and not much beyond that, which is somewhat concerning when we approach use cases in other technical and domain areas.
The future scope lies in a fully automated way of designing neural architectures that also gives more insight into why the final customized model outperformed the others. More research is needed to make NAS a more generic approach suited to different use cases and domains, rather than restricting it to deep learning and, more specifically, to image classification/segmentation.
Ravichander is a passionate data professional with close to a decade of experience in DWBI, Data Management and Data Science. He currently leads a team of data analysts and data scientists at AstraZeneca, and in the past he worked with Capgemini, HCL and Cognizant on projects involving leading players in Banking & Financial Services and Leisure & Travel. He loves to learn new tools and technologies, and in his leisure time he likes experimenting in different technical areas with proofs of concept.