# Here’s Why Data Scientists Shouldn’t Rely Too Much On P-Values In Machine Learning Experiments

Statistical concepts go hand-in-hand with machine learning, but they do not always serve it well. At times, applying a statistical procedure to a machine learning model does not make the model perform any better. Then again, this is open to interpretation and depends on the problem the ML algorithm aims to solve. In this article, we consider one specific statistical concept, the p-value, and discuss how it affects machine learning in general.

### What Do P-Values Signify

In statistics, hypothesis tests are conducted to check whether an inference about a population holds up against the data observed in an experiment. Every test involves two competing statements, the null hypothesis and the alternate (sometimes called the alternative) hypothesis. The null hypothesis, which is the starting point for the test, states that there is no statistical relationship or effect in the observations collected from the sample or the population. It is retained unless the data provide sufficient evidence against it. The alternate hypothesis is the statement adopted when the null hypothesis is rejected. In simple words, it is the alternative statement to the null hypothesis.

The p-value is used to assess the strength of the evidence against the null hypothesis. It is the probability, computed on the assumption that the null hypothesis is true, of observing data at least as extreme as the data actually observed. P-values are numbers between 0 and 1 and are sometimes expressed as percentages. Typically, a small p-value (less than 0.05) suggests that the null hypothesis should be rejected, while a large p-value (greater than 0.05) means the data do not provide enough evidence to reject it, which is not the same as showing that it is true. Values equal or close to 0.05 are borderline, and experimenters have to make their own call.
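To make the 0.05 threshold concrete, here is a minimal sketch of how a p-value could be computed for a simple ML comparison. The scenario is hypothetical and not taken from any real experiment: a new model beats a baseline on 16 of 20 independent benchmark datasets, and the null hypothesis is that either model is equally likely to win on any dataset. The function name `binomial_p_value` and all the numbers are assumptions made purely for illustration.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided exact p-value: P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: the new model wins on 16 of 20 benchmark datasets.
p_value = binomial_p_value(successes=16, trials=20)

ALPHA = 0.05  # the conventional threshold discussed above
decision = "reject the null" if p_value < ALPHA else "fail to reject the null"
print(f"p-value = {p_value:.4f}: {decision}")
```

A p-value below 0.05 here only says that such a result would be unusual if the two models were truly equivalent. It is not the probability that the null hypothesis is true.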

P-values are often misinterpreted. They are sometimes read as the probability that a hypothesis is true, or as a probability about the experiment itself, without taking the hypothesis test into account. This leads to incorrect conclusions in the statistical context. Another instance arises when several variables are studied in one project, for example in regression analysis. If the variables are not correctly selected, the analysis would be void.

When analysing with p-values, experimenters should decide ahead of time what is to be tested, because once the p-values have been computed, it is difficult to retain the same statistical meaning if the hypotheses or the tests are changed later. In fact, the assumptions made in the theory for deriving p-values are sometimes partly misleading in practice, which can make p-values unsuitable for machine learning applications because the statistical significance they report is diminished.

In a journal paper titled P Values: What They Are And What They Are Not, Mark Schervish, a professor at Carnegie Mellon University, argues that p-values are logically flawed when they are used informally, without giving much thought to statistical considerations. Schervish presents an argument that a p-value should be viewed as a continuous function of the hypothesis being tested rather than as a single definite value (as described earlier). The study examines point-null and one-sided hypotheses to show this continuity. He asserts, “Just as the point-null and one-sided hypotheses are limits of interval hypotheses, so too are their P values limits of the P values of the interval hypotheses for every data value. This observation allows us to think of point-null hypotheses as approximations to interval hypotheses”.

Also, p-values do not act as measures of statistical support because they rarely satisfy the same criteria across multiple statistical comparisons. Therefore, p-values are often not viable for machine learning models, because the data in these settings keeps changing, which can alter the statistical inferences made from the models.
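One way to see why multiple comparisons are troublesome is to look at how quickly false positives accumulate. The sketch below assumes the tests are independent and that every null hypothesis is true, which is a simplification. The function names are hypothetical, and the Bonferroni correction shown is a standard textbook remedy that the discussion above does not itself mention.

```python
def family_wise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across independent tests of true nulls."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / num_tests


ALPHA = 0.05
for num_tests in (1, 5, 20, 100):
    fwer = family_wise_error_rate(ALPHA, num_tests)
    per_test = bonferroni_threshold(ALPHA, num_tests)
    print(
        f"{num_tests:>3} tests: P(at least one false positive) = {fwer:.3f}, "
        f"Bonferroni per-test threshold = {per_test:.4f}"
    )
```

With many features or hyperparameter settings tested at once, as is common in machine learning, the uncorrected 0.05 threshold stops meaning what it means for a single test.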

### Using the Bayesian Approach

For a machine learning environment, the Bayesian approach works well because it deals with probability distributions rather than with devising a hypothesis and subsequently testing it. Unlike p-values, the Bayesian approach has a subjective perspective: the experimenter states the reason for choosing a specific prior probability distribution and updates that distribution as data from the experiment come in. On top of that, the approach provides an easier way to depict data values visually, which can bring more information into the context.
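As a rough illustration of working with a distribution instead of a single test, here is a minimal sketch of a Beta-Binomial update. It assumes the quantity of interest is the rate at which a new model beats a baseline, that the prior is a uniform Beta(1, 1), and that the data are 16 wins and 4 losses. All of these, along with the function names, are hypothetical choices made for the example.

```python
import random


def update_beta(prior_a: float, prior_b: float, wins: int, losses: int) -> tuple[float, float]:
    """Conjugate update: Beta(a, b) prior plus win/loss counts gives a Beta posterior."""
    return prior_a + wins, prior_b + losses


def beta_mean(a: float, b: float) -> float:
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def prob_rate_above(a: float, b: float, threshold: float,
                    draws: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(rate > threshold) under Beta(a, b)."""
    rng = random.Random(seed)
    hits = sum(rng.betavariate(a, b) > threshold for _ in range(draws))
    return hits / draws


# Hypothetical prior and data: uniform prior, then 16 wins and 4 losses.
post_a, post_b = update_beta(1.0, 1.0, wins=16, losses=4)
print(f"Posterior: Beta({post_a:g}, {post_b:g})")
print(f"Posterior mean win rate: {beta_mean(post_a, post_b):.3f}")
print(f"Estimated P(win rate > 0.5): {prob_rate_above(post_a, post_b, 0.5):.3f}")
```

The posterior can be updated again whenever new results arrive by feeding it back in as the prior, and the whole distribution can be plotted rather than reduced to a single accept-or-reject decision.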

### Conclusion

Machine learning projects generally take care of the statistics before they are deployed into practice. A host of techniques, including dimensionality reduction methods such as principal component analysis (PCA), take care of assumptions in the data for machine learning. Incorporating statistical concepts such as hypothesis testing and p-values into an already well-set machine learning model might make it more complex to draw conclusions from the data. Ultimately, p-values show their significance only when fewer parameters or variables are involved in the experiments or projects.

