Starting on a lighter note, let’s dig into hypothesis testing. In statistics, hypothesis testing is used to decide, from an available sample, whether the data provide enough evidence against a claim made about a population. Take the image above: a man cannot be pregnant (biologically), which is a claim about pregnancy in humans. In hypothesis testing, this default statement forms the ‘null hypothesis’, which is then tested against the sample data.
Generally, the aim in hypothesis testing is to see whether the sample gives enough evidence to reject the null hypothesis. This is because the null hypothesis is the baseline against which we test. The opposing statement, called the ‘alternative hypothesis’, contests the null hypothesis. Altogether, hypothesis testing weighs these two statements against each other using the sample data. The null hypothesis is either rejected or not rejected; it is never proven true.
Type I Error
In formal terms, if a null hypothesis is rejected in favour of the alternative hypothesis even though it is true, the error is known as a Type I error (a false positive). With respect to the cover image, ‘man cannot be pregnant’ is the null hypothesis and ‘man is pregnant’ is the alternative hypothesis. If the doctor rejects the null hypothesis even though it is true, he commits a Type I error. In other words, the man is considered pregnant!
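To make the definition concrete, here is a minimal simulation sketch. It assumes a two-sided one-sample z-test with a known standard deviation, which is a choice made only for illustration; the names `z_test_p_value` and `type_i_error_rate`, the sample size and the number of trials are all hypothetical. Every sample is drawn with the null hypothesis true, so each rejection is a Type I error.

```python
import random
from statistics import NormalDist, fmean

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: population mean == mu0, with known sigma."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def type_i_error_rate(alpha=0.05, n=30, trials=20000, seed=0):
    """Fraction of trials in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The data really do have mean 0, so H0 (mean == 0) is true.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials

print(f"Estimated Type I error rate: {type_i_error_rate():.3f}")
```

Under these assumptions the estimated rate should land close to the chosen significance level, here 0.05.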
Type II Error
If the null hypothesis is not rejected even though it is false, so that the alternative hypothesis is wrongly dismissed, the result is known as a Type II error (a false negative). Again, back to the image: ‘woman is not pregnant’ is the null hypothesis and ‘woman is pregnant’ is the alternative hypothesis. If the doctor says the lady is not pregnant even when she is, he commits a Type II error: the null hypothesis is deemed right and the alternative hypothesis is said to be wrong.
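The mirror-image case can be sketched the same way. The example below again assumes a two-sided z-test with known standard deviation, and it assumes a true mean of 0.5 while the null hypothesis claims 0; the function names and all numbers are hypothetical. Here the null hypothesis is false in every trial, so each failure to reject it is a Type II error.

```python
import random
from statistics import NormalDist, fmean

def rejects_null(sample, mu0, sigma, alpha):
    """Two-sided z-test decision: True means H0 (mean == mu0) is rejected."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit

def type_ii_error_rate(true_mean=0.5, alpha=0.05, n=30, trials=20000, seed=1):
    """Fraction of trials in which a false null hypothesis is not rejected."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        # The data have mean `true_mean`, so H0 (mean == 0) is false.
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if not rejects_null(sample, mu0=0.0, sigma=1.0, alpha=alpha):
            misses += 1  # failing to reject a false H0 is a Type II error
    return misses / trials

print(f"Estimated Type II error rate: {type_ii_error_rate():.3f}")
```

Unlike the Type I rate, this number is not fixed by the significance level alone: it also depends on how far the truth is from the null hypothesis and on the sample size.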
How to Reduce These Errors
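In the case of Type I error, a smaller level of significance will generally help. The significance level is chosen before the test begins, and it is the probability of rejecting the null hypothesis when it is in fact true. For Type II error, another concept called power comes into play: the probability of correctly rejecting a null hypothesis that is false. Power depends on the significance level, the sample size and the size of the effect being tested. Lowering the significance level on its own reduces power, so a larger sample is the usual way to keep both errors low.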
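The trade-off between the two errors can be tabulated directly. The sketch below computes power in closed form for a two-sided z-test with known standard deviation; that test, the helper name `power_two_sided_z`, and the effect size of 0.5 are assumptions made for illustration.

```python
from statistics import NormalDist

def power_two_sided_z(effect, sigma, n, alpha):
    """Probability of rejecting H0 when the true mean is `effect` away from mu0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma  # how far the z statistic is shifted under H1
    # H0 is rejected when the z statistic lands in either tail.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

for alpha in (0.10, 0.05, 0.01):
    for n in (10, 30, 100):
        power = power_two_sided_z(effect=0.5, sigma=1.0, n=n, alpha=alpha)
        print(f"alpha={alpha:.2f}  n={n:3d}  power={power:.3f}  Type II rate={1 - power:.3f}")
```

Reading down the printed rows, a stricter significance level lowers power for a fixed sample size, while a larger sample raises it again.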
Overall, before running a statistical experiment, it is advisable to collect a sufficiently large and representative sample, since more data increases power at a given significance level.