An Illustrative Guide to Extrapolation in Machine Learning

Extrapolation is the estimation of a variable's value beyond the original observation range, based on its relationship with another variable.

Humans excel at extrapolating in a variety of situations. For example, we can use arithmetic to solve problems with arbitrarily large numbers. One can ask whether machine learning can do the same and generalize to cases that are arbitrarily far from the training data. Extrapolation is a statistical technique for estimating values that extend beyond a particular collection of data or observations. In this article, we shall contrast it with interpolation, explain its primary aspects, and attempt to connect it to machine learning. The following are the main points to be discussed in this article.

Table of Contents

  1. What is Extrapolation?
  2. Interpolation Vs Extrapolation 
  3. Problems of Extrapolation 
  4. Where Does Extrapolation Fail?
  5. Methods of Extrapolation
  6. Implementing Linear Extrapolation in Python

Let’s start the discussion by understanding extrapolation. 

What is Extrapolation?

Extrapolation is the estimation of a variable’s value beyond the original observation range, based on its relationship with another variable. It is similar to interpolation, which generates estimates between known observations, but extrapolation is more uncertain and has a higher risk of giving meaningless results.

Extrapolation can also refer to a method’s expansion, presuming that similar methods are applicable. Extrapolation is a term that refers to the process of projecting, extending, or expanding known experience into an unknown or previously unexperienced area in order to arrive at a (typically speculative) understanding of the unknown.

Extrapolation is a method of estimating a value outside of a defined range. Let’s take a general example. If you’re a parent, you may recall your young child calling any small four-legged animal a cat, because their first classifier relied on only a few traits. After learning to factor in additional attributes, they were able to correctly identify dogs as well.

Even for humans, extrapolation is challenging. Our models are interpolation machines, no matter how clever they are. Even the most complicated neural networks may fail when asked to extrapolate beyond the limitations of their training data.

Machine learning has traditionally only been able to interpolate data, that is, generate predictions about a scenario that is “between” other, known situations. Because machine learning only learns to model existing data locally as accurately as possible, it cannot reliably extrapolate; that is, it cannot make dependable predictions about scenarios outside of the known conditions. Collecting enough data for good interpolation takes time and resources, and covering the whole range of interest may require data from extreme or dangerous settings.

Interpolation Vs Extrapolation 

In regression problems, we use data to generalize a function that maps a set of input variables X to a set of output variables y. Using this function mapping, a y value can be predicted for any combination of input variables. When the input variables lie within the region covered by the training data, this procedure is referred to as interpolation; if the point of estimation lies outside of this region, it is referred to as extrapolation.
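As a rough illustration of this distinction, the sketch below labels a query point by checking whether every feature falls inside the range seen during training. The function name `regime` and the sample data are hypothetical, and the per-feature range check is a simplifying assumption: it is looser than a true "inside the training region" test, since a point can sit inside the bounding box and still be far from any training sample.

```python
def regime(train_rows, query):
    """Label a query as interpolation or extrapolation.

    Assumption: the training region is approximated by the per-feature
    bounding box (min and max of each column of the training inputs).
    """
    lows = [min(col) for col in zip(*train_rows)]
    highs = [max(col) for col in zip(*train_rows)]
    inside = all(lo <= q <= hi for lo, q, hi in zip(lows, query, highs))
    return "interpolation" if inside else "extrapolation"


# Hypothetical training inputs with two features each
X_train = [(0.0, 0.0), (1.0, 2.0), (3.0, 1.0)]

print(regime(X_train, (2.0, 1.0)))  # every feature is within the training range
print(regime(X_train, (4.0, 1.0)))  # first feature exceeds the training maximum
```

In one dimension this reduces to checking whether the query lies between the smallest and largest training input.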

The grey and white sections of the univariate example in the accompanying figure show the extrapolation and interpolation regimes, respectively. The black lines show a selection of polynomial models that were used to make predictions within and outside of the training data set.

The models are well constrained in the interpolation regime, so their predictions collapse into a narrow band. Outside of the domain, however, the models diverge, producing radically different predictions. The cause of this large divergence (despite these being the same model with slightly different hyperparameters, trained on the same set of data) is that the model received no information during training that would confine its predictions to a smaller variance there.

This is the risk of extrapolation: model predictions outside of the training domain are particularly sensitive to training data and model parameters, resulting in unpredictable behaviour unless the model formulation contains implicit or explicit assumptions.
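The divergence described above can be reproduced with a small experiment. The sketch below is only an illustration under stated assumptions: the training data (samples of sin(x) on [0, 3]), the polynomial degrees 2, 3 and 4, and the helper names `fit_polynomial`, `predict` and `spread` are all hypothetical choices, and the least-squares fit is done with plain normal equations to keep the example dependency-free.

```python
import math


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (pure Python)."""
    n = degree + 1
    # Augmented system [A | b] with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i)
    aug = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution; coeffs[i] multiplies x^i
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    return coeffs


def predict(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))


# Training data: 13 samples of sin(x) on [0, 3]
xs = [0.25 * i for i in range(13)]
ys = [math.sin(x) for x in xs]

# Same model family, slightly different hyperparameter (the degree)
models = [fit_polynomial(xs, ys, degree) for degree in (2, 3, 4)]


def spread(x):
    """Gap between the largest and smallest prediction across the models."""
    preds = [predict(m, x) for m in models]
    return max(preds) - min(preds)


spread_inside = spread(1.5)   # inside the training range
spread_outside = spread(6.0)  # well outside the training range
print("spread at x = 1.5:", spread_inside)
print("spread at x = 6.0:", spread_outside)
```

Inside the training range the three fits nearly coincide, while at x = 6.0 their predictions differ by a large margin even though all of them were trained on the same points.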

Problems of Extrapolation

Most learners do not specify how their fitted functions behave where there is no training data. They are usually designed to be universal approximators, or as close to that as possible, with few modelling constraints. As a result, in regions where there is little or no data, there is very little prior control over the function. In most machine learning scenarios we therefore cannot regulate the behaviour of the prediction function at extrapolation points, and we cannot tell when this is a problem.

Extrapolation should not be a problem in theory; in a static system with a representative training sample, the chances of having to predict at a point of extrapolation are essentially zero. However, most training sets are not representative, and they are not derived from static systems, so extrapolation may be required.

Even empirical data derived from a product distribution can appear to have a strong correlation pattern when scaled up to high dimensions. Because functions are learned based on an empirical sample, they may be able to extrapolate effectively even in theoretically dense locations.

Where Does Extrapolation Fail?

Extrapolation works to some extent with linear and other types of regression, but not with decision trees or random forests. In a decision tree or random forest, the input is sorted and filtered down into leaf nodes that have no direct relationship to other leaf nodes in the tree or forest. Each leaf returns a value derived from the training samples that reached it, so an input outside the training domain simply falls into one of the outermost leaves. This means that, while the random forest is great at sorting data, its results can’t be extrapolated: it has no way of extending a trend beyond the domain it was trained on.
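To make the contrast concrete, here is a minimal sketch that compares a tiny regression tree with a straight-line fit on the same data. Everything in it is an assumption made for illustration: the data follow y = 2x + 1, the helpers `fit_tree`, `predict_tree` and `fit_line` are hypothetical, and the tree splits at the median rather than searching for the best split as real decision tree libraries do. The point it shares with real trees is that every leaf predicts a constant.

```python
def fit_tree(xs, ys, depth):
    """Tiny 1-D regression tree: split at the median, predict the leaf mean."""
    if depth == 0 or len(xs) < 2:
        return sum(ys) / len(ys)  # leaf: mean of the targets that reached it
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    mid = len(order) // 2
    left, right = order[:mid], order[mid:]
    threshold = (xs[left[-1]] + xs[right[0]]) / 2
    return (
        threshold,
        fit_tree([xs[i] for i in left], [ys[i] for i in left], depth - 1),
        fit_tree([xs[i] for i in right], [ys[i] for i in right], depth - 1),
    )


def predict_tree(node, x):
    # Walk down until a leaf (a plain number) is reached
    while isinstance(node, tuple):
        threshold, left, right = node
        node = left if x <= threshold else right
    return node


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Training data on x = 0..9, generated from y = 2x + 1
xs = [float(x) for x in range(10)]
ys = [2 * x + 1 for x in xs]

tree = fit_tree(xs, ys, depth=3)
slope, intercept = fit_line(xs, ys)

for x_new in (9.0, 100.0, 1000.0):
    print(x_new, predict_tree(tree, x_new), slope * x_new + intercept)
```

The tree returns the same value for 100.0 and 1000.0 as it does at the edge of the training data, because all three inputs land in the rightmost leaf, while the line keeps following the learned trend.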

Methods of Extrapolation 

A good decision on which extrapolation method to use is based on a prior understanding of the process that produced the existing data points. Some experts have recommended using causal factors to assess extrapolation approaches. We will look at a few methods below. These are purely mathematical methods, so you should check carefully how well each one matches your problem.

Linear Extrapolation

Linear extrapolation is the process of drawing a tangent line at the end of the known data and extending it beyond that point. It gives good results only when used to extend the graph of an approximately linear function, or when the target point is not too far beyond the existing data. If the two data points closest to the point $x_*$ to be extrapolated are $(x_{k-1}, y_{k-1})$ and $(x_k, y_k)$, linear extrapolation produces the function

$$y(x_*) = y_{k-1} + \frac{x_* - x_{k-1}}{x_k - x_{k-1}} \, (y_k - y_{k-1}).$$

Polynomial Extrapolation 

A polynomial curve can be built using all of the known data or just a small portion of it (two points for linear extrapolation, three points for quadratic extrapolation, etc.). The resulting curve can then be extended beyond the available data. Polynomial extrapolation is typically done by using Lagrange interpolation, or Newton’s method of finite differences to generate a Newton series that matches the data. The data can then be extrapolated using the obtained polynomial.
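A minimal sketch of the Lagrange approach is shown below. The function name `lagrange_extrapolate` and the three sample points (taken from y = x^2) are assumptions chosen for illustration. The function evaluates the unique polynomial that passes through all the given points, so calling it with an x outside the data range performs polynomial extrapolation.

```python
def lagrange_extrapolate(points, x):
    """Evaluate the Lagrange polynomial through `points` at x.

    `points` is a list of (x_i, y_i) pairs with distinct x_i values.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if j != i:
                # Basis polynomial factor: equals 1 at xi and 0 at xj
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Three hypothetical points taken from y = x^2, so the fit is quadratic
points = [(1.0, 1.0), (2.0, 4.0), (3.0, 9.0)]

# x = 5.0 lies beyond the largest known x; the result should be close to 25
print(lagrange_extrapolate(points, 5.0))
```

Because the sample points really do come from a quadratic, the extrapolated value matches the true function here. With noisy data or a higher degree, the polynomial can swing sharply once it leaves the data range.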

Conic Extrapolation

Five points near the end of the given data can be used to construct a conic section. If the conic section is an ellipse or a circle, it will loop back and rejoin itself when extrapolated. A parabola or hyperbola that has been extrapolated will not rejoin itself, but it may curve back relative to the X-axis. This form of extrapolation can be done with a conic sections template (on paper) or with a computer.

Next, we will look at a simple Python implementation of linear extrapolation.

Implementing Linear Extrapolation in Python

The technique is useful when the underlying relationship is known to be approximately linear. It’s done by drawing a tangent and extending it beyond the limit of the data. When the projected point is close to the rest of the points, linear extrapolation delivers a decent result.

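# Code is taken from GeeksforGeeks
# Linear extrapolation from two known points
def extrapolation_(q, r):
    # q holds two known points [[x0, y0], [x1, y1]]; r is the x value to estimate at
    result = (q[0][1] + (r - q[0][0]) /
        (q[1][0] - q[0][0]) *
        (q[1][1] - q[0][1]))

    return result

# dataset: two known (x, y) points
q = [[5.2, 8.7], [2.4, 4.1]]
# Sample value, which lies below the smallest known x (2.4)
r = 2.1

# Finding the extrapolation
print("Value of y at x = 2.1 :", extrapolation_(q, r))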
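The two-point function above can be extended to a dataset with more than two points by first picking the pair of points nearest to the query. The sketch below is one possible way to do that, not part of the original GeeksforGeeks code: the name `linear_extrapolate`, the choice to raise an error for queries inside the data range, and the sample points (taken from y = 2x + 1) are all assumptions.

```python
def linear_extrapolate(points, x):
    """Linearly extrapolate from the two known points nearest to x.

    `points` is a list of (x_i, y_i) pairs with distinct x_i values.
    """
    pts = sorted(points)
    if len(pts) < 2:
        raise ValueError("need at least two points")
    if x <= pts[0][0]:
        (x0, y0), (x1, y1) = pts[0], pts[1]    # extend below the data
    elif x >= pts[-1][0]:
        (x0, y0), (x1, y1) = pts[-2], pts[-1]  # extend above the data
    else:
        raise ValueError("x lies inside the data range; interpolate instead")
    return y0 + (x - x0) / (x1 - x0) * (y1 - y0)


# Hypothetical, unordered samples taken from y = 2x + 1
data = [(2.0, 5.0), (0.0, 1.0), (1.0, 3.0)]

print(linear_extrapolate(data, 4.0))   # beyond the largest known x
print(linear_extrapolate(data, -1.0))  # below the smallest known x
```

Sorting first means the caller does not need to pass the points in any particular order, and using only the end pair keeps the behaviour identical to the two-point formula.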

Final Words 

Extrapolation is a helpful technique, but it must be used in conjunction with an appropriate model for describing the data, and it has limitations once you leave the training region. Its applications include prediction in situations where you have continuous data, such as time, speed, and so on. Such prediction is notoriously imprecise, and the accuracy falls as the distance from the learned region grows. In situations where extrapolation is required, the model should be updated and retrained to lower the margin of error. Through this article, we have understood extrapolation and interpolation mathematically, related them to ML, and seen their effect on an ML system. We have also seen where extrapolation fails in particular, and the methods that can be used.

Vijaysinh Lendave
Vijaysinh is an enthusiast in machine learning and deep learning. He is skilled in ML algorithms, data manipulation, handling and visualization, model building.