Building A Machine Learning Model With WEKA With ‘No Coding’

In this article, we will learn how to use WEKA to pre-process data and build a machine learning model without writing code.

No-code environments in machine learning have become increasingly popular because almost anybody who needs machine learning, whatever their field, can use these tools to build models for themselves. WEKA is one of the earliest no-code tools, yet it remains efficient and powerful. It can be used to implement state-of-the-art machine learning and deep learning models and supports numerous file formats.

Installing and setting up WEKA

WEKA runs on Linux, Windows and Mac operating systems and can be downloaded from the official website. Select your operating system and click on download; the download will begin automatically. Once it finishes, follow the installer steps to complete the installation, and WEKA is ready to be used.

Dashboard

When you launch the application on the local system you are presented with a dashboard. 

Explorer: This environment is WEKA’s graphical user interface. You can find datasets and many machine learning models here along with visualization and pre-processing tools.

Experimenter: This environment is used for conducting experiments on the data or for performing certain statistical operations on the learning dataset. 

KnowledgeFlow: This environment provides essentially the same functionality as the Explorer but through a drag-and-drop interface.

Workbench: This is an all-in-one application that combines the other environments as user-selectable perspectives.

Simple CLI: This is a command-line interface for working with WEKA more directly, and it uses less memory than the other environments. It is preferred for larger deep learning models.

For this article, we will make use of the explorer environment to build a machine learning model.

Selecting this environment gives a dashboard that looks like this. 

The dataset

You can load a dataset of your own, but here I have selected one of the built-in datasets. The built-in datasets are in the .arff format. WEKA also supports CSV, JSON, Excel, bsi etc.
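For readers curious about what an .arff file contains, here is a minimal Python sketch. It is not needed for the no-code workflow. The sample text is a hypothetical excerpt written in the style of an ARFF file, not the real vote.arff, and the `parse_arff` helper is an illustration that handles only unquoted, comma-separated nominal data.

```python
# Minimal ARFF reader sketch (illustrative only, not WEKA's own loader).
# SAMPLE_ARFF is a hypothetical excerpt, not the contents of vote.arff.
SAMPLE_ARFF = """\
% lines starting with % are comments
@relation vote-sample
@attribute immigration {n,y}
@attribute Class {democrat,republican}
@data
y,republican
?,democrat
n,democrat
"""


def parse_arff(text):
    """Return (attribute_names, rows). A '?' entry is missing and becomes None."""
    names, rows, in_data = [], [], False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if in_data:
            values = [v.strip() for v in line.split(",")]
            rows.append([None if v == "?" else v for v in values])
        elif line.lower().startswith("@attribute"):
            names.append(line.split()[1])  # second token is the attribute name
        elif line.lower().startswith("@data"):
            in_data = True  # every following line is one instance
    return names, rows


names, rows = parse_arff(SAMPLE_ARFF)
print(names)
print(rows)
```

The header declares each attribute and its allowed values, and every line after `@data` is one instance, with `?` marking a missing value.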

To select the dataset, click on the ‘Open file...’ button in the Preprocess tab and navigate to the folder where you have installed WEKA. Open the folder named data and you can see the following datasets.

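I have selected the dataset called vote.arff. This dataset contains the 1984 United States Congressional voting records: each instance is a member of the House of Representatives, labelled as democrat or republican, and the features are that member's votes on 16 key issues.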

Once you select the dataset you can see the different features of the dataset. 

Pre-processing

After the dataset is loaded, we see that it contains some missing values. To handle them, click the ‘Choose’ button under Filter at the top left, which shows a list of filters that can be used for data manipulation.

Under this, select the unsupervised -> attribute -> ReplaceMissingValues option.

Once this is selected, click ‘Apply’. The missing values are replaced with the mode of each nominal attribute (or the mean of each numeric attribute), and the dataset is then free of missing entries.
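To make the idea behind this filter concrete, here is a small Python sketch of mode imputation for nominal columns. It is a simplified illustration, not WEKA's implementation; the function name `replace_missing_with_mode` and the toy rows are assumptions made for this example, and `None` stands in for a missing value.

```python
from collections import Counter


def replace_missing_with_mode(rows):
    """Fill None entries in each column with that column's most frequent value."""
    n_cols = len(rows[0])
    modes = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        # A column with no observed values has no mode, so it is left as None.
        modes.append(Counter(observed).most_common(1)[0][0] if observed else None)
    return [[modes[c] if v is None else v for c, v in enumerate(row)] for row in rows]


# Hypothetical toy data: one attribute column followed by the class column.
rows = [
    ["y", "republican"],
    [None, "democrat"],
    ["n", "democrat"],
    ["n", "democrat"],
]
filled = replace_missing_with_mode(rows)
print(filled)
```

In the toy data the attribute takes the value `n` twice and `y` once, so the missing entry in the second row is filled with `n`.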

Visualization

To visualize all features against the label you can select the ‘visualize all’ button on the screen and you can see bar charts of all feature distributions against the target. Blue represents the democratic party and red represents the republican. 
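Each of these bar charts amounts to counting class labels for every value of a feature. The Python sketch below shows that counting step on hypothetical toy rows; the helper name `class_counts_by_value` is an assumption for illustration, and the text bars only stand in for WEKA's coloured charts.

```python
from collections import Counter


def class_counts_by_value(rows, attr_index, class_index=-1):
    """Count class labels separately for each value of one attribute."""
    counts = {}
    for row in rows:
        counts.setdefault(row[attr_index], Counter())[row[class_index]] += 1
    return counts


# Hypothetical toy data: an 'immigration' vote followed by the class label.
rows = [
    ["y", "republican"],
    ["y", "republican"],
    ["y", "democrat"],
    ["n", "democrat"],
    ["n", "democrat"],
]
counts = class_counts_by_value(rows, attr_index=0)
for value, by_class in sorted(counts.items()):
    for label, n in sorted(by_class.items()):
        print(f"immigration={value} {label:<10} {'#' * n}")
```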

But, for other forms of visualization, you can select the ‘visualize’ tab on the top.

You can click on each box and identify the distributions according to each instance of the data. 

As shown above, the x-axis contains immigration, y contains the classes and to the right, you can see each instance of the dataset. You can explore different types of visualization here. 

Model building

The final step is to build a classification model that predicts the class of each instance, that is, whether it belongs to the democrats or the republicans. To build a classification model, select the ‘Classify’ tab at the top of the Explorer dashboard.

As you can see, multiple algorithms are available here. I have decided to use a decision tree classification model. Choose a decision tree from the trees group (for example J48, WEKA's implementation of C4.5), click ‘Start’, and the results appear in the output panel.
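For intuition about how a decision tree picks the feature to split on, the following Python sketch computes information gain on hypothetical toy rows. This is an illustration under stated assumptions, not what WEKA runs internally: C4.5-style learners use the related gain ratio plus pruning, and the function names here are made up for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr_index, class_index=-1):
    """Entropy of the class minus the weighted entropy after splitting on one attribute."""
    labels = [row[class_index] for row in rows]
    groups = {}
    for row in rows:
        groups.setdefault(row[attr_index], []).append(row[class_index])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: two yes/no votes followed by the class label.
rows = [
    ["y", "y", "republican"],
    ["y", "n", "republican"],
    ["n", "y", "democrat"],
    ["n", "n", "democrat"],
]
gain_first = information_gain(rows, 0)   # first vote separates the classes perfectly
gain_second = information_gain(rows, 1)  # second vote says nothing about the class
print(gain_first, gain_second)
```

A tree learner would place the first attribute at the root here, since splitting on it removes all uncertainty about the class while the second attribute removes none.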

The output shows how many instances were classified correctly, how many were classified incorrectly, and the different error measures that were calculated. It ends with the confusion matrix.
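To help read that output, here is a short Python sketch of how accuracy and a confusion matrix are computed from actual and predicted labels. The label lists are hypothetical, not results from vote.arff, and the helper names are assumptions for this example. Rows are the actual classes and columns are the predicted classes, which matches the layout of WEKA's matrix.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of instances whose predicted label equals the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def confusion_matrix(actual, predicted, labels):
    """matrix[i][j] = number of instances of class labels[i] predicted as labels[j]."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical labels for five instances.
actual = ["democrat", "democrat", "democrat", "republican", "republican"]
predicted = ["democrat", "democrat", "republican", "republican", "democrat"]
labels = ["democrat", "republican"]

acc = accuracy(actual, predicted)
matrix = confusion_matrix(actual, predicted, labels)
print(acc)
for label, row in zip(labels, matrix):
    print(label, row)
```

The diagonal of the matrix holds the correctly classified instances, so accuracy is the diagonal sum divided by the total number of instances.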

You can also train multiple models here by making other alterations like dropping a few columns, applying PCA etc. All these results can be logged and saved in the workspace that is created. 

Thus, we create a classification model just by using a few clicks of the mouse without writing any code. 

Conclusion

We saw in this article how we can build a machine learning model with WEKA and understood the different environments present in this tool. Weka finds wide applications among researchers and business analysts for faster model building and data analysis. 

Bhoomika Madhukar

I am an aspiring data scientist with a passion for teaching. I am a computer science graduate from Dayananda Sagar Institute. I have experience in building models in deep learning and reinforcement learning. My goal is to use AI in the field of education to make learning meaningful for everyone.