Is Depth In Neural Networks Always Preferable? This Research Says the Contrary

Non-deep networks, rather than deep ones, could be used to build low-latency recognition systems.

Deep neural networks are defined by their depth. However, more depth implies more sequential processing and higher latency. This raises the question of whether it is possible to build high-performance “non-deep” neural networks. Researchers from Princeton University and Intel Labs demonstrate that it is.

Characteristics Of Depth 

The fields of machine learning, computer vision, and natural language processing have been transformed by deep neural networks (DNNs). As their name implies, one of the primary characteristics of DNNs is their depth, which can be defined as the length of the longest path from an input neuron to an output neuron. Often, a neural network can be described as a linear sequence of layers, that is, groups of neurons with no intra-group connections. In such cases, the depth of a network is simply its layer count.
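The longest-path definition can be made concrete with a small sketch. The code below is an illustration only, not taken from the paper: the graphs, node names, and the helper `longest_path_length` are hypothetical. Each node stands for a layer, and depth is counted as the number of layer-to-layer steps on the longest route from input to output.

```python
from functools import lru_cache


def longest_path_length(graph, source, sink):
    """Return the number of edges on the longest source-to-sink path.

    `graph` maps each node to the nodes it feeds into and must be acyclic.
    """

    @lru_cache(maxsize=None)
    def longest_from(node):
        if node == sink:
            return 0
        best = float("-inf")  # stays -inf if the sink is unreachable
        for nxt in graph.get(node, ()):
            best = max(best, 1 + longest_from(nxt))
        return best

    return longest_from(source)


# Hypothetical network A: five layers stacked one after another.
chain = {
    "input": ["l1"],
    "l1": ["l2"],
    "l2": ["l3"],
    "l3": ["l4"],
    "l4": ["l5"],
}

# Hypothetical network B: also five layers, but four of them sit in
# two parallel branches that are merged by the fifth.
parallel = {
    "input": ["a1", "b1"],
    "a1": ["a2"],
    "b1": ["b2"],
    "a2": ["merge"],
    "b2": ["merge"],
}

chain_depth = longest_path_length(chain, "input", "l5")
parallel_depth = longest_path_length(parallel, "input", "merge")
print(chain_depth, parallel_depth)  # 5 3
```

Both toy networks contain five layers, yet the second has a depth of 3 because depth follows the longest path, not the total layer count.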

It is widely believed that significant depth is required for high-performance networks, as depth boosts a network’s representational capability and aids in learning increasingly abstract features. Indeed, one of the key reasons for the success of ResNets is that they enable extremely deep networks with up to 1000 layers. As a result, state-of-the-art performance is increasingly attained by training very deep models, and the definition of “deep” has moved from “two or more layers” in the early days of deep learning to “tens or hundreds of layers” in today’s models.


Is Deeper Necessary?

However, is great depth always necessary? The question is worth asking because great depth does not come without downsides. For example, a deeper network results in more sequential processing and higher latency; it is also more difficult to parallelise and is, therefore, less suitable for applications that require rapid response times.

Contrary to popular belief, the researchers found that high performance without great depth is indeed possible. They describe a non-deep network design that performs competitively with its deep counterparts. The researchers call the design ParNet (Parallel Networks). They demonstrate for the first time that a classification network with a depth of just 12 can achieve higher than 80% accuracy on ImageNet, 96% on CIFAR10, and 81% on CIFAR100. Additionally, the researchers demonstrate that a detection network with a low-depth (12) backbone can obtain an AP of 48% on MS-COCO. ParNet helps answer a scientific question about the necessity of great depth and also provides practical benefits: because of its parallel substructures, it can be efficiently parallelised across several processors.

Research Contributions

To summarise, there are three contributions:

• For the first time, the researchers demonstrate that a neural network with a depth of just 12 can perform well on highly competitive benchmarks (80.7% on ImageNet, 96% on CIFAR10, 81% on CIFAR100).

• The researchers demonstrate how ParNet’s parallel structures can be used for fast, low-latency inference.

• The researchers examine ParNet’s scaling rules and demonstrate that the network can be scaled effectively while keeping its depth constant and low.

Code is available at Non-Deep Networks

The researchers achieve this by arranging subnetworks in parallel rather than stacking one layer after another. This reduces depth effectively while keeping performance high. The researchers analyse the design’s scaling rules and demonstrate how to improve performance without altering the network’s depth. Finally, the researchers demonstrate the feasibility of using non-deep networks to construct low-latency recognition systems.
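A rough back-of-the-envelope model shows why parallel subnetworks can help latency. The sketch below is an idealised illustration with made-up numbers, not a measurement from the paper: it assumes every layer takes one time unit, each branch runs on its own processor, and the cost of combining branch outputs is a single hypothetical `fusion_time`.

```python
def sequential_latency(layer_times):
    """Layers run one after another, so their times add up."""
    return sum(layer_times)


def parallel_latency(branch_layer_times, fusion_time=0.0):
    """Branches run at the same time on separate processors (assumption),
    so the slowest branch sets the pace; fusing the outputs costs extra."""
    slowest_branch = max(sum(branch) for branch in branch_layer_times)
    return slowest_branch + fusion_time


# Hypothetical deep network: 9 layers in a single chain, 1 time unit each.
deep = [1.0] * 9

# The same 9 layers rearranged into 3 parallel branches of 3 layers each.
shallow = [[1.0] * 3, [1.0] * 3, [1.0] * 3]

deep_latency = sequential_latency(deep)
shallow_latency = parallel_latency(shallow, fusion_time=0.5)
print(deep_latency, shallow_latency)  # 9.0 3.5
```

Under these assumptions the amount of computation is the same in both cases, but the parallel arrangement finishes sooner because only the longest chain of dependent layers has to be waited on. Real speed-ups depend on hardware and on communication costs between processors, which this toy model reduces to a single constant.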


The researchers provide, for the first time, empirical evidence that non-deep networks can compete with deep networks on large-scale visual recognition benchmarks. They show that parallel substructures can be used to build remarkably performant non-deep networks. Additionally, the researchers demonstrate methods for scaling up and improving the performance of such networks without increasing their depth. The work shows that alternative designs exist for highly accurate neural networks that do not require great depth. Such designs may be better suited to future multi-chip processors. The researchers anticipate that the work will aid the construction of neural networks that are both highly accurate and fast.

Dr. Nivash Jeevanandam
Nivash holds a doctorate in information technology and has been a research associate at a university and a development engineer in the IT industry. Data science and machine learning excite him.
