
Meet Weka, The Wonderkid Of Machine Learning Software


The weka is also the name of a flightless bird native to New Zealand

While Python and R have long dominated data science programming, one tool that rose in parallel is Weka. Developed at the University of Waikato, New Zealand, Weka stands for Waikato Environment for Knowledge Analysis. The software was built specifically for machine learning and data mining, and comprises tools for data preparation, classification, regression, clustering, association rule mining, and visualisation. This article gives the lowdown on what Weka is and how it is proving useful for data science researchers.

Weka And Data Mining

Java is the primary programming language used in the development of Weka. The first version, however, was not based on Java; it relied on Tool Command Language (Tcl) in its environment, and Weka was initially used for data analysis in agricultural domains. With the later move to Java, Weka’s applications spread to data mining tasks in other areas such as education and research.

Here is a short list of standard ML algorithms in Weka. The name after the colon is how each algorithm is addressed in Weka.

  • Linear Regression: functions.LinearRegression
  • Logistic Regression: functions.Logistic
  • Naive Bayes: bayes.NaiveBayes
  • Decision Tree (specifically the C4.5 variety): trees.J48
  • k-Nearest Neighbors (also called KNN): lazy.IBk
  • Support Vector Machines (also called SVM): functions.SMO
  • Neural Network: functions.MultilayerPerceptron
  • Random Forest: trees.RandomForest
  • Bootstrap Aggregation (also called Bagging): meta.Bagging
  • Stacked Aggregation (also called Stacking or Blending): meta.Stacking
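
The short names in this list are relative to Weka's `weka.classifiers` package, so the full class name of the decision tree, for example, is `weka.classifiers.trees.J48` (and the regression classes live in the plural `functions` package). As a minimal sketch, the mapping can be written out in plain Java. The `WekaClassifierNames` class and its `qualifiedName` method are hypothetical helpers made up for illustration; they are not part of Weka.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Weka: resolves the algorithm names from
// the list above to fully qualified class names under weka.classifiers.
class WekaClassifierNames {
    static final String ROOT = "weka.classifiers.";

    // Insertion order is kept so the output follows the order of the list.
    static final Map<String, String> SHORT_NAMES = new LinkedHashMap<>();
    static {
        SHORT_NAMES.put("Linear Regression", "functions.LinearRegression");
        SHORT_NAMES.put("Logistic Regression", "functions.Logistic");
        SHORT_NAMES.put("Naive Bayes", "bayes.NaiveBayes");
        SHORT_NAMES.put("Decision Tree", "trees.J48");
        SHORT_NAMES.put("k-Nearest Neighbors", "lazy.IBk");
        SHORT_NAMES.put("Support Vector Machines", "functions.SMO");
        SHORT_NAMES.put("Neural Network", "functions.MultilayerPerceptron");
        SHORT_NAMES.put("Random Forest", "trees.RandomForest");
        SHORT_NAMES.put("Bagging", "meta.Bagging");
        SHORT_NAMES.put("Stacking", "meta.Stacking");
    }

    // Returns the fully qualified class name for an algorithm in the table.
    static String qualifiedName(String algorithm) {
        String shortName = SHORT_NAMES.get(algorithm);
        if (shortName == null) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm);
        }
        return ROOT + shortName;
    }

    public static void main(String[] args) {
        for (String algorithm : SHORT_NAMES.keySet()) {
            System.out.println(algorithm + " -> " + qualifiedName(algorithm));
        }
    }
}
```

The fully qualified form is the one that matters outside the GUI, for instance when a classifier is named on the command line.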

All In A Single Interface

An open source tool released under the GNU General Public License, Weka offers several interfaces for getting started with data mining: the Explorer, Experimenter and KnowledgeFlow GUIs, alongside a Simple CLI. It also has a ‘Workbench’ feature, which does not appear in the GUI window of some earlier versions.

Weka can be downloaded here.

  1. Explorer lets a user tinker with the data and see how it can be transformed for analysis. It also lets the user choose which algorithms to apply to the data.
  2. Experimenter runs the algorithms chosen by the user and provides a detailed analysis of the results for the data mining project.
  3. KnowledgeFlow helps design how data and algorithms are connected in the project, as a flow of processing steps.
  4. Simple CLI is a command-line interface for Weka. Users can type commands to work with the project instead of relying on the Explorer or the other GUIs. The advantage here is that it tends to ease the memory constraints on Weka.
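
To give a feel for what such a command looks like, Weka classifiers can be run by naming the class and passing options such as `-t` (the training file) and `-x` (the number of cross-validation folds). The sketch below only assembles such a command as a list of arguments and prints it; it does not launch Weka. The `WekaCommandSketch` class is a hypothetical helper, and the `weka.jar` path and the `weather.arff` file name are placeholder assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Weka: builds (but does not run) a
// command line that trains and cross-validates a Weka classifier.
class WekaCommandSketch {
    static List<String> buildCommand(String jarPath, String classifier,
                                     String arffPath, int folds) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(jarPath);        // location of the Weka jar (assumed)
        cmd.add(classifier);     // fully qualified classifier class name
        cmd.add("-t");
        cmd.add(arffPath);       // training data file (assumed name)
        cmd.add("-x");
        cmd.add(Integer.toString(folds)); // cross-validation folds
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand(
                "weka.jar", "weka.classifiers.trees.J48", "weather.arff", 10);
        // prints: java -cp weka.jar weka.classifiers.trees.J48 -t weather.arff -x 10
        System.out.println(String.join(" ", cmd));
    }
}
```

Keeping the command as a list of arguments rather than one string means it could later be handed to `ProcessBuilder` without worrying about quoting, should a reader want to run it for real.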

Weka has hundreds of algorithms for classification, data preprocessing, clustering and more. All of these can be applied easily since the implementations are already in place, and all from a single interface. Ian Witten, one of the creators of Weka, describes how the software's features can be applied to data mining problems.

“One way of using WEKA is to apply a learning method to a dataset and analyze its output to learn more about the data. Another is to use learned models to generate predictions on new instances. A third is to apply several different learners and compare their performance in order to choose one for prediction. In the interactive WEKA interface, you select the learning method you want from a menu.”

This is why Weka requires no code for machine learning: its built-in software environment makes this possible. Even if an ML project is based on Java, there is no need to write the code again in Weka. In fact, using Weka does not require knowing Java at all. The GUI or CLI takes care of this part.
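
None of this needs code in Weka itself, but for curious readers, the third usage Witten describes (apply several learners, compare their performance, choose one) can be sketched in a few lines of plain Java. This is an illustration of the idea only, not Weka's implementation: the `Learner` and `Model` interfaces, the two toy learners and the tiny dataset are all hypothetical and made up for this example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not Weka code: two toy learners are trained on the
// same data, scored on held-out data, and the more accurate one is chosen.
class LearnerComparisonSketch {
    interface Model { String predict(double x); }
    interface Learner { Model train(double[] xs, String[] ys); }

    // Baseline: always predicts the most frequent training label.
    static final Learner MAJORITY = (xs, ys) -> {
        Map<String, Integer> counts = new HashMap<>();
        String best = ys[0];
        for (String y : ys) {
            int c = counts.merge(y, 1, Integer::sum);
            if (c > counts.get(best)) best = y;
        }
        final String winner = best;
        return x -> winner;
    };

    // 1-nearest neighbour: predicts the label of the closest training value.
    static final Learner NEAREST = (xs, ys) -> x -> {
        int nearest = 0;
        for (int i = 1; i < xs.length; i++) {
            if (Math.abs(xs[i] - x) < Math.abs(xs[nearest] - x)) nearest = i;
        }
        return ys[nearest];
    };

    // Made-up toy data: one numeric feature, two classes.
    static final double[] TRAIN_X = {0.8, 1.0, 1.2, 3.0, 3.2};
    static final String[] TRAIN_Y = {"low", "low", "low", "high", "high"};
    static final double[] TEST_X = {0.9, 1.1, 2.9, 3.1};
    static final String[] TEST_Y = {"low", "low", "high", "high"};

    // Fraction of test examples the model labels correctly.
    static double accuracy(Model model, double[] xs, String[] ys) {
        int correct = 0;
        for (int i = 0; i < xs.length; i++) {
            if (model.predict(xs[i]).equals(ys[i])) correct++;
        }
        return (double) correct / xs.length;
    }

    // Trains every learner and returns the name of the most accurate one.
    static String bestLearner(Map<String, Learner> learners) {
        String bestName = null;
        double bestAccuracy = -1.0;
        for (Map.Entry<String, Learner> entry : learners.entrySet()) {
            Model model = entry.getValue().train(TRAIN_X, TRAIN_Y);
            double acc = accuracy(model, TEST_X, TEST_Y);
            if (acc > bestAccuracy) {
                bestAccuracy = acc;
                bestName = entry.getKey();
            }
        }
        return bestName;
    }

    static Map<String, Learner> defaultLearners() {
        Map<String, Learner> learners = new LinkedHashMap<>();
        learners.put("Majority", MAJORITY);
        learners.put("1-NN", NEAREST);
        return learners;
    }

    public static void main(String[] args) {
        System.out.println("Selected learner: " + bestLearner(defaultLearners()));
    }
}
```

In Weka, this kind of side-by-side comparison is what the Experimenter interface is for, with real algorithms and proper evaluation in place of the toy pieces above.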

Other Avenues That Weka Can Explore

One of the reasons this software fell behind with ML users was its exclusive focus on data mining. Had it advanced further into other areas of data science such as data visualisation, Weka might have been as popular as Python or R. Another factor is Weka's Java environment: not every user is comfortable with Java, and some may avoid Weka for this reason. Users also feel that the interface is old-fashioned and could be improved with more visual features.

Despite the criticism, an interesting development from Weka is a deep learning package called WekaDeeplearning4j, developed to bring deep learning into Weka. Its backend is provided by the Deeplearning4j Java library. If Weka keeps adding facets like this to its platform, it could well grow to tackle ML problems in general.


Abhishek Sharma

I research and cover latest happenings in data science. My fervent interests are in latest technology and humor/comedy (an odd combination!). When I'm not busy reading on these subjects, you'll find me watching movies or playing badminton.