# Curse Of Dimensionality And What Beginners Should Do To Overcome It

The Curse of Dimensionality is termed by mathematician R. Bellman in his book “Dynamic Programming” in 1957. According to him, the curse of dimensionality is the problem caused by the exponential increase in volume associated with adding extra dimensions to Euclidean space.

The curse of dimensionality basically means that the error increases with the increase in the number of features. It refers to the fact that algorithms are harder to design in high dimensions and often have a running time exponential in the dimensions. A higher number of dimensions theoretically allow more information to be stored, but practically it rarely helps due to the higher possibility of noise and redundancy in the real-world data.

Gathering a huge number of data may lead to the dimensionality problem where highly noisy dimensions with fewer pieces of information and without significant benefit can be obtained due to the large data. The exploding nature of spatial volume is at the forefront is the reason for the curse of dimensionality.

The difficulty in analysing high-dimensional data results from the conjunction of two effects.

• High-dimensional spaces have geometrical properties which are counter-intuitive, and far from the properties which can be observed in two-or three-dimensional spaces.
• Data analysis tools are most often designed having in mind intuitive properties and examples in low-dimensional spaces and usually, data analysis tools are best illustrated in 2-or 3-dimensional spaces.

The difficulty is that those tools are also used when data are high-dimensional and more complex and hence, there is a probability to lose the intuition of the behavior of the tool and thus drawing incorrect conclusions.

### Domains of CoD

Some of the domains of the curse of dimensionality are discussed below.

#### Anomaly Detection

Anomalies in high-dimensional data may show a significant number of attributes which are irrelevant in nature, different subspaces produce incomparable scores, given the large search space, for every desired significance a hypothesis can be found, certain objects occur more frequently in neighbor lists than others, etc.

#### Combinatorics

Here the curse of dimensionality occurs when the complexity increases rapidly which is caused by the increasing number of possible combinations of inputs.

#### Machine Learning

One of the key ingredients in the successful development of learning algorithms is to have enough data for learning so that they fill the space or part of the space where the model must be valid. An increase in the dimensionality in data results in the sparsification of data and this exponential increase is the first consequence of what is called the curse of dimensionality.

In machine learning, a small increase in the dimensionality would require a large increase in the volume of the data in order to maintain a similar level of performance. Thus the curse of dimensionality is the expression of all phenomena that appear with high-dimensional data, and that have most often unfortunate consequences on the behavior and performances of learning algorithms.

### How To Combat The CoD

#### Dimensionality Reduction

Dimensionality reduction is a method of converting the high dimensional variables into lower dimensional variables without changing the specific information of the variables. To overcome the issue of the curse of dimensionality, Dimensionality Reduction is used to reduce the feature space with consideration by a set of principal features. Dimensionality Reduction contains no extra variables that make the data analyzing easier and simple for machine learning algorithms and resulting in a faster outcome from the algorithms.

#### Regularisation

The problem of curse of dimensionality comes from that parameter estimates which are unstable, hence, regularising these estimates will help that the parameters to make correct estimation.

#### PCA

PCA (Principal Component Analysis) is one of the most traditional tools used for dimension reduction. It transforms the data into the most informative space, thereby allowing the use of lesser dimensions which are almost as informative as the original data. But since it is a linear tool, nonlinear relations between the components of the initial data may be lost in the preprocessing.

References

A Technical Journalist who loves writing about Machine Learning and Artificial Intelligence. A lover of music, writing and learning something out of the box.

## Oct 11-13, 2023 | Bangalore

### Telegram group

Discover special offers, top stories, upcoming events, and more.

### Discord Server

Stay Connected with a larger ecosystem of data science and ML Professionals

### Harnessing Human Emotions in Generative AI

“AI can still be very literal in its interactions, while humans don’t tend to express emotions in simple and straightforward ways.”

### Get Started With Data Streaming

Kafka Connect enables developers to easily bring in data from their data sources and turn data-at-rest into data-in-motion.

### AI Cloud Wars: Azure AI vs Vertex AI

Had you taken a sip everytime Nadella uttered “Copilot” at Microsoft Guild, you would have had woken up with a very bad hangover the next day.

### Is the GPT Breakthrough in Robotics Approaching Soon?

On the heels of AI and LLM advancements, will Robotics get its GPT moment anytime soon?

### You Are to be Blamed for ChatGPT’s Flaws

It is not just ChatGPT’s fault that it generates misinformation, it’s also, and mostly the fault of content creators and media firms

### What is AWS’ Generative AI Strategy?

AWS wants to empower customers with the capabilities and resources to build customised Generative AI solutions that cater to their specific needs and requirements.

### Microsoft To Ape Google’s Success With Windows AI DevTools

The announcements at the Build conference paint a picture of Microsoft’s goals to use the dev ecosystem as rocket fuel to reach for the stars.

### How This Bengaluru Boy Cracked Netflix, Twitter and DoorDash Interviews

“No matter how smart you are, how much knowledge you have or what GPA you score, cracking tech interviews is a different ball game altogether”

### Generative AI is Having An Edison Moment

Companies making generative models accessible are thriving more than a more impactful research

### Unleashing the Power of LLM for Enterprise Applications with LangChain

Listen to this story LangChain has become one of the most talked about topics