A Beginner’s Guide to Deep Regression Forest

Deep regression forest can be considered as a method similar to the traditional regression forest but the decisions at each split node are softer than the traditional regression forest.

Random forest and decision trees are some of the most popular predictive models in the machine learning field. When using random forests, we can find different variants of it that can be used in classification and regression analysis. In this article, we are going to discuss a variant of the random forest named as Deep Regression Forest, which integrates the features of deep neural networks and can outperform in various applications of regression analysis. The major points to be discussed in this article are listed below.

Table of Contents

  1. What is Deep Regression Forest? 
  2. When to apply the Deep Regression Forest?
  3. How Does It Work?
  4. Application of Deep Regression Forest

What is Deep Regression Forest?

In traditional machine learning, we find that random forest and decision trees are some of the popular algorithms for predictive modelling which use the structure of a tree. These algorithms perform data partition at split nodes and data abstraction at leaf nodes. These models can be used for both classification and regression problems. 

When a traditional forest is used for regression problems, it makes hard data partitions because these traditional forests are based on a kind of greedy approach where for data partitions at each split node, it makes locally optimal hard decisions. These locally optimal hard decisions made by the greedy algorithms are well-performing but when it comes to handling the heterogeneous data like image data for age estimation, we are required to apply the soft decisions.

Deep regression forest can be considered as a method similar to the traditional regression forest but the decisions at each split node are softer than the traditional regression forest. Using the deep regression forest, we can also make the algorithm eligible to learn the input feature space and data abstraction at leaf nodes jointly. This eligibility can ensure that the input and output from leaf nodes are correlated and the correlation is homogeneous. Also, it helps in handling heterogeneous data. 

We can also consider deep regression forests as deferential regression forests which can also be integrated with any deep neural networks. Integrating them with deep neural networks has shown great potential in computer vision regression tasks.  We can summarize the procedure of deep regression forest with the following steps:-

  1. Fix the leaf nodes and optimize the data partitions (at split nodes).
  2. Convolutional neural networks can be used for feature learning by backpropagation. 
  3. Fix the split nodes and optimize the data abstraction (at leaf nodes).

Considering the above steps, we can say that the first step is for feature learning and the second step is to perform local regression. We can call this forest a combination of deep neural decision forest and label distribution learning forest. But, the objective of this forest is to perform regression tasks. 

When to Apply the Deep Regression Forest?

As we could understand what deep regression forest is in the last section, we also got to know that it has the capability of dealing with heterogeneous data. The basic idea behind the model is to learn from low-dimensional embeddings. The reason behind the capability of dealing with heterogeneous properties is, learning data partition and data abstraction jointly. So we can say if in any process the data is heterogeneous like image data, we can apply the Deep Regression Forest for regression analysis. With this, it also has the capability of dealing with nonlinear data. Unlike other methods, we can integrate this model to any deep neural network and perform regression analysis using the homogenous partitions in the joint input-output space, and learning a local regressor for each partition. 

The problem of locally optimal hard decision data partition at split nodes can be solved by defining a function for soft decision at each split node. Because of the production of a soft decision, we can apply it for depth estimation.

How Does It Work?

The main motive of the process is based on the working of a normal regression forest. So  we can understand that the process can be completed in two normal steps:

  • Learn from a differential regression tree.
  • Ensemble trees to form a forest. 

 Let’s understand how to make an algorithm work in the above-given steps. Let’s say there are two spaces  X (input space) and Y (output space). Now we know that, in regression analysis for each input information, there is an output target and the objective of regression procedures is to find a mapping function.  

The mapping can be given as the following:

yˆ = g(x) = Z yp(y|x)dy


X = Input sample

Y = Target sample

g(X) = Mapping function 

p(Y|X) = Conditional probability function 

When we perform this regression using a decision tree, there is a set of split nodes and a set of leaf nodes. In these sets, each split node defines a split function and the leaf node contains a density distribution. Till now we could see that this process is similar to the traditional regression tree which exhibits hard decisions. To make it soft, we can introduce a split function which can include the sigmoid function and index function as:

sn(x; Θ) = σ(fϕ(n)(x; Θ))


  • X is parameterized by Θ
  • sn(x; Θ) = split function

 In this function, σ(·) is the sigmoid function and ϕ(·) is an index function. Let’s take a look at the below image:

Image source

The above image is a representation of the deep regression forest where the output unit (red circles) of the function is parameterized by Θ and a fully connected layer of the convolutional neural network. Index functions assigned to the tree are two in quantity and split nodes are in blue and leaf nodes are in green circles. 

To indicate the correspondence between split nodes of trees and the FC layer, black dashed lines are used. Each tree has independent leaf node distribution and the output from the forts is a mixture of tree predictions, function f(.) and leaf node distribution are learned jointly in an end-to-end manner. Based on the above scenario the mapping between x and y is given by 

yˆ = g(x; T ) = Z yp(y|x; T )dy.

Where T is a tree-based structure.

By ensembling these trees after optimization, we can form a regression tree, where it is a possibility that all the trees will share the same parameters and they can have different sets of split functions and independent leaf distributions. We can also define the loss function of the forest as the average loss function of the trees.

Application of Deep Regression Forest

As of now, we have seen what, when, and how deep regression forest can be used. When we talk about the application of the deep regression forest, we can say that it can be used in regression problems that can be dealt with normal random forest regression models or any neural network. Some of the normal fields of applications are:

  • Price optimization in e-commerce
  • Disease level prediction in medical science
  • Price prediction in stock markets.
  • To measure the effects of water and fertilizer on crops.

Also, we have seen that the major capabilities of deep regression forest are to deal with heterogeneous data. So we can find the application of it majorly in the fields with heterogeneous data. Some of the applications related to heterogeneous data  are as follows:

  • Age estimation using the images
  • Crowd counting 
  • Pose estimation
  • Demographic analysis 
  • Visual surveillance 
  • Access control   

Here in the above applications, we can understand that the required results from the process are going through data that is heterogeneous.

Final Words

In this article, we have seen a variant of random forest in the regression analysis that is deep regression forest and we found that it has the capability of making soft decisions so that it can be used with data that is heterogeneous and requires regression analysis. Along with that we have discussed the procedure on which it works and some of the normal and crucial applications where it can be used for better results.


More Great AIM Stories

Yugesh Verma
Yugesh is a graduate in automobile engineering and worked as a data analyst intern. He completed several Data Science projects. He has a strong interest in Deep Learning and writing blogs on data science and machine learning.
Yugesh Verma
How is Boolean algebra used in Machine learning?

Machine learning model with Boolean algebra starts with the data with a target variable and input or learner variables and using the set of rules it generates output value by considering a given configuration of input samples.

3 Ways to Join our Community

Discord Server

Stay Connected with a larger ecosystem of data science and ML Professionals

Telegram Channel

Discover special offers, top stories, upcoming events, and more.

Subscribe to our newsletter

Get the latest updates from AIM