What Is A Naive Bayes Classifier And What Significance Does It Have In ML

Classifiers are most popular in spam filtering for emails, collaborative filtering for recommendation engines and sentiment analysis. AI is good at demarcating groups based on patterns found across large sets of data.

The naive Bayes classifier is based on Bayes’ theorem and is one of the oldest approaches to classification problems.

Bayes’ theorem can be put in simple terms as:

P(A|B) = P(B|A) × P(A) / P(B)

The objective here is to determine the likelihood of an event A happening given that B has happened: P(A|B) is the posterior probability of A given B, P(B|A) is the likelihood of observing B when A holds, and P(A) and P(B) are the probabilities of A and B on their own.
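As a toy illustration (all numbers below are made up for the example), suppose 40 percent of incoming mail is spam, and the word ‘offer’ appears in 50 percent of spam messages but in only 5 percent of legitimate ones. A quick Python sketch of Bayes’ theorem then gives the chance that a message containing ‘offer’ is spam:

# Assumed, made-up probabilities for the spam example
p_spam = 0.4                 # P(A): prior probability of spam
p_offer_given_spam = 0.5     # P(B|A): "offer" appears in spam
p_offer_given_ham = 0.05     # "offer" appears in legitimate mail

# P(B): total probability of seeing "offer" in any message
p_offer = p_offer_given_spam * p_spam + p_offer_given_ham * (1 - p_spam)

# P(A|B): posterior probability that the message is spam
print(p_offer_given_spam * p_spam / p_offer)  # ~0.87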

The naive Bayes classifier combines this probability model with a decision rule; the common choice is to pick the hypothesis that is most probable, known as the maximum a posteriori (MAP) rule.

Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable.

It was initially introduced for text categorisation tasks and is still used as a benchmark for them.

Over the years there have been many innovations, such as support vector machines (SVMs) and k-nearest neighbours (KNN), that solve the classification problem more flexibly and cleverly. But the naive Bayes classifier can still be competitive given enough pre-processed data, and it has shown great results in medical applications, where classification is crucial to diagnosis.

How Good Is NB Classifier For ML

The first assumption of a naive Bayes classifier is that the value of a particular feature is independent of the value of any other feature, given the class. This means that the interdependencies within the data are comfortably neglected, hence the name ‘naive.’

A naive Bayes classifier considers every feature to contribute independently to the class probability, irrespective of any correlations between the features.

In many practical scenarios, parameter estimation for naive Bayes models uses the method of maximum likelihood; in other words, one can work with the naive Bayes model without accepting Bayesian probability or using any Bayesian methods.

In the Gaussian naive Bayes classifier, the feature values are assumed to be distributed in accordance with a Gaussian distribution; that is, the likelihood of the features given a class is assumed to be Gaussian.

Calling the Gaussian NB classifier in Python using scikit-learn:

from sklearn.naive_bayes import GaussianNB
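
A minimal usage sketch, assuming scikit-learn’s bundled Iris dataset purely as stand-in data with continuous features:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Split stand-in data with continuous features into train and test sets
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# GaussianNB estimates a per-class mean and variance for every feature,
# then applies Bayes' theorem with Gaussian likelihoods at prediction time
clf = GaussianNB()
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # accuracy on held-out data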

The multinomial naive Bayes classifier considers feature vectors that represent the frequencies with which certain events have been generated by a multinomial distribution.

In the Bernoulli naive Bayes approach, by contrast, the features are independent booleans, which suits binary responses.

For example, in document classification tasks, multinomial NB can be used with the number of times a word appears in a document (its frequency), and Bernoulli NB for whether a word appears in it at all (a binary yes or no).
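
A small sketch of this contrast on a made-up two-class corpus (documents and labels are invented for illustration):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Made-up corpus; labels: 1 = sports, 0 = finance
docs = ["we won the game", "the game was close",
        "stocks fell sharply", "markets closed lower"]
labels = [1, 1, 0, 0]

# Multinomial NB: features are word counts (frequencies)
counts = CountVectorizer().fit_transform(docs)
print(MultinomialNB().fit(counts, labels).predict(counts))

# Bernoulli NB: features are binary word presence/absence
binary = CountVectorizer(binary=True).fit_transform(docs)
print(BernoulliNB().fit(binary, labels).predict(binary))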

NB classifiers are usually pitted against support vector machines (SVMs), and in many cases SVMs are better: when a non-linear kernel such as the Gaussian or radial basis function (RBF) kernel is used, an SVM can pick up dependencies between features.

Even though naive Bayes is criticised for the inaccuracy of its independence assumption across features, it does fairly well when the class-conditional features are decoupled. This decoupling allows it to treat each feature distribution as a one-dimensional distribution, avoiding challenges of dimensionality such as the need for data sets that grow exponentially with the number of features.

NB classifiers can be tweaked for better results, especially in document classification or word identification, using the following techniques:

  • By removing stop words in a sentence, as they are not significant to the classification task. ‘We won the game’ is as good as ‘we won the game by a very close margin’ for this purpose; filler words such as ‘by’, ‘a’ and ‘very’ are stop words, and removing them wouldn’t change the result.
  • By lemmatising words, so that inflected forms of the same word are grouped together and do not inflate the word-frequency counter. ‘Game’ and ‘games’ will be grouped.
  • By checking the significance of a word with term frequency-inverse document frequency (TF-IDF). This technique weighs the importance of a word in text-mining tasks and can also be used for stop-word filtering: using the TF-IDF value as a threshold, words can be penalised for occurring with high frequency across documents, as sketched below.
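
A minimal sketch of the TF-IDF idea, again with made-up documents, assuming scikit-learn’s TfidfVectorizer feeding a multinomial NB model:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = ["we won the game", "the game was close",
        "stocks fell sharply", "markets closed lower"]
labels = [1, 1, 0, 0]

# TF-IDF down-weights words that occur across many documents;
# stop_words='english' removes common English stop words first
model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(docs, labels)
print(model.predict(["we won"]))  # expected to lean towards the sports class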
