# What Is A Naive Bayes Classifier And What Significance Does It Have In ML

Classifiers are most popular in spam filtering for email, collaborative filtering for recommendation engines, and sentiment analysis. Machine learning is good at demarcating groups based on patterns across large datasets.

The naive Bayes classifier is based on Bayes’ theorem and is one of the oldest approaches to classification problems.


Bayes’ theorem can be put in simple terms as:

P(A|B) = P(B|A) × P(A) / P(B)

The objective here is to determine the likelihood of an event A happening given that B happens.
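A quick worked example makes the theorem concrete. The numbers below (a test’s sensitivity, false-positive rate, and the condition’s prevalence) are made up purely for illustration:

```python
# Illustrative numbers (assumptions, not real medical data):
# a test for a condition affecting 1% of the population.
p_b_given_a = 0.99       # P(positive test | condition)
p_a = 0.01               # P(condition), the prior
p_b_given_not_a = 0.05   # P(positive test | no condition)
p_not_a = 1 - p_a

# Total probability of a positive test, by the law of total probability.
p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a

# Bayes' theorem: posterior P(condition | positive test).
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))
```

Even with a highly accurate test, the posterior works out to only about 1 in 6, because the prior probability of the condition is so low.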

The naive Bayes classifier combines Bayes’ model with a decision rule, typically picking the hypothesis with the highest posterior probability (the maximum a posteriori, or MAP, rule).

Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes’ theorem with the “naive” assumption of conditional independence between every pair of features given the value of the class variable.

It was initially introduced for text categorisation tasks and is still used as a benchmark.

Over the years, innovations like support vector machines (SVMs) and k-nearest neighbours (KNN) have tackled the classification problem with more flexibility and sophistication. But the naive Bayes classifier can still be competitive given well pre-processed data, and it has shown great results in medical applications, where classification is crucial to diagnosis.

### How Good Is NB Classifier For ML

The first assumption of a naive Bayes classifier is that the value of a particular feature is independent of the value of any other feature, which means that interdependencies within the data are comfortably neglected. Hence the name ‘naive.’

A naive Bayes classifier considers every feature to contribute independently to the probability, irrespective of any correlations between features.
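The independence assumption can be seen directly in code: the posterior for a class is scored as the prior multiplied by each feature’s likelihood on its own. Below is a minimal from-scratch sketch for word features, with a tiny made-up corpus used purely for illustration:

```python
from collections import Counter

# Toy labelled corpus (illustrative data, not from the article).
train = [(["free", "money"], "spam"),
         (["free", "prize"], "spam"),
         (["meeting", "agenda"], "ham"),
         (["team", "meeting"], "ham")]

vocab = {w for words, _ in train for w in words}
labels = [y for _, y in train]

def predict(words):
    scores = {}
    for y in set(labels):
        counts = Counter(w for x, lab in train if lab == y for w in x)
        total = sum(counts.values())
        score = labels.count(y) / len(labels)  # prior P(y)
        for w in words:
            # Laplace-smoothed likelihood P(w | y). The 'naive' part:
            # per-word probabilities are simply multiplied, as if each
            # word occurred independently of the others.
            score *= (counts[w] + 1) / (total + len(vocab))
        scores[y] = score
    return max(scores, key=scores.get)

print(predict(["free", "prize"]))
```

Correlations between words (e.g. ‘free’ and ‘prize’ tending to co-occur) are ignored entirely; each word contributes its own factor to the product.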

In many practical scenarios, the parameters of a naive Bayes model are estimated by maximum likelihood; in other words, one can work with the model without adopting fully Bayesian methods, which are better suited to a supervised setting with informative priors.

In the Gaussian naive Bayes classifier, the feature values are assumed to be distributed in accordance with a Gaussian (normal) distribution; that is, the likelihood of the features given a class is assumed to be Gaussian.

Calling the Gaussian NB classifier in Python using scikit-learn:

`from sklearn.naive_bayes import GaussianNB`
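A minimal usage sketch, with two small made-up feature clusters standing in for real data:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Illustrative data: two features per sample, two well-separated classes.
X = np.array([[1.0, 2.0], [1.2, 1.8], [0.9, 2.2],   # class 0
              [5.0, 6.0], [5.2, 5.8], [4.9, 6.1]])  # class 1
y = np.array([0, 0, 0, 1, 1, 1])

# Fit estimates a per-class Gaussian (mean and variance) for each feature.
model = GaussianNB()
model.fit(X, y)

# Points are assigned to the class whose Gaussians make them most likely.
print(model.predict([[1.1, 2.1], [5.1, 6.0]]))
```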

The multinomial naive Bayes classifier considers feature vectors that represent the frequencies with which certain events have been generated by a multinomial distribution.

In the Bernoulli naive Bayes approach, by contrast, features are independent Booleans, which makes it suitable for binary responses.

For example, in document classification tasks, multinomial NB can use the number of times a word appears in a document (its frequency), while Bernoulli NB can classify on whether a word appears at all (a binary yes or no).
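The two variants can be sketched side by side in scikit-learn; the four-document spam/ham corpus below is made up for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

# Illustrative corpus: 1 = spam, 0 = not spam.
docs = ["free prize money now",
        "claim your free prize",
        "meeting agenda for today",
        "agenda for the team meeting"]
labels = [1, 1, 0, 0]

# Multinomial NB: features are word counts (frequencies).
count_vec = CountVectorizer()
mnb = MultinomialNB().fit(count_vec.fit_transform(docs), labels)

# Bernoulli NB: features are binary word presence/absence.
bool_vec = CountVectorizer(binary=True)
bnb = BernoulliNB().fit(bool_vec.fit_transform(docs), labels)

print(mnb.predict(count_vec.transform(["free prize money"])))
print(bnb.predict(bool_vec.transform(["team meeting agenda"])))
```

Note that the same vectoriser fitted on the training documents must be reused to transform any new text before prediction.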

NB classifiers are usually pitted against support vector machines (SVMs). In many cases, SVMs perform better than naive Bayes: when a non-linear kernel such as the Gaussian or radial basis function (RBF) kernel is used, an SVM can capture dependencies between features.

Even though naive Bayes is criticised for the inaccuracy of its independence assumption, it does fairly well when the class-conditional feature distributions are decoupled. This decoupling lets it treat each feature’s distribution as a one-dimensional distribution and avoid the challenges of dimensionality, such as the need for datasets that grow exponentially with the number of features.

NB classifiers can be tweaked for better results, especially in document classification or word identification, using the following techniques:

• By removing stop words in a sentence, as they are not significant to the classification task. ‘We won the game’ is as good as ‘We won the game by a very close margin’; here ‘by’, ‘a’ and ‘very’ are stop words, and removing them wouldn’t change the result.
• By lemmatizing words, so that inflected forms of the same word are grouped under a common base form and don’t tick the word-frequency counter separately. ‘Game’ and ‘games’ will be grouped.
• By checking the significance of a word with term frequency–inverse document frequency (TF-IDF). This technique weighs a word’s importance in text-mining tasks and can also be used for stop-word filtering: using the TF-IDF value as a threshold, words that appear in nearly every document can be penalised for their high document frequency.
