Machine Learning: The Easiest Way To Detect Malware In Android OS

With handheld technology growing at an exponential rate, almost every advancement in the digital world gets more attention than ever. This can largely be credited to the ever-growing mobile phone ecosystem. As of now, there are over two billion mobile phone devices across the world (including feature phones and tablet devices).

When it comes to the operating system (OS) that powers these smartphones, Google’s Android has clearly emerged as the winner over Apple’s iOS and Microsoft’s Windows Phone, with a lion’s share of 80 percent. This success and popularity can be attributed to the ease with which developers can build applications for Android and to its open-source availability.

With popularity comes a darker side: the vulnerabilities in Android. Because the OS is an open-source platform, it has become a breeding ground for miscreants who develop malware applications that exploit security and other flaws, leaving critical information such as user data and privacy out in the open. Although Google has made a stringent effort to curb malware applications in recent years, it cannot be said to have eliminated the threat completely.

Discerning Malware Through Machine Learning

Studies and methods to detect malware in Android date back to its release in 2008. A plethora of software tools such as sandboxes and debuggers have been developed and used for malware analysis since then. However, with the staggering rise of malware outbreaks in recent times, it is difficult to curtail them with these tools alone. This issue urged computer scientists and researchers to come up with machine learning (ML) methods.

The earliest studies explored ML techniques such as classification to differentiate harmful applications from genuine ones. Android Package Kits (generally known as ‘.apk’ files), which are Android’s application files, were extensively tested with ML algorithms to look for malicious code. These studies also analysed the code for discrepancies that may leave applications vulnerable to attacks.

The challenge, however, lies in capturing the right features of an application for ML. To tackle this, some studies used support vector machines (SVMs) to identify different classes of malware. Alongside this, these studies used tools such as control flow graphs (CFGs) to represent the flow of an application and boost detection. What was evident from these studies was that feature extraction became easier than in previous work. Subsequent research on identifying Android malware through ML turned to Bayesian classification approaches, which achieved significantly higher detection accuracy.
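To make the idea of CFG-based features concrete, here is a minimal sketch in Python. The graph, the block names (such as `send_sms`) and the chosen features are hypothetical illustrations, not taken from any of the studies mentioned. It only shows how a control flow graph can be reduced to a few numbers that a classifier such as an SVM could consume.

```python
from collections import deque

# Hypothetical control flow graph: each key is a basic block, and each value
# lists the blocks that control can pass to next.
cfg = {
    "entry": ["check"],
    "check": ["send_sms", "exit"],
    "send_sms": ["check"],
    "exit": [],
}


def cfg_features(graph, entry="entry"):
    """Return simple numeric features describing a control flow graph."""
    nodes = len(graph)
    edges = sum(len(successors) for successors in graph.values())

    # Breadth-first search to count the blocks reachable from the entry block.
    seen = {entry}
    queue = deque([entry])
    while queue:
        block = queue.popleft()
        for nxt in graph[block]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    return {
        "nodes": nodes,
        "edges": edges,
        "reachable": len(seen),
        # Cyclomatic complexity of a single connected graph: E - N + 2.
        "cyclomatic": edges - nodes + 2,
    }


features = cfg_features(cfg)
print(features)
```

For this toy graph the loop between `check` and `send_sms` gives a cyclomatic complexity of 2. A real pipeline would first have to recover the graph from the application's bytecode, which this sketch does not attempt.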

Researchers have also looked into app permissions to see whether ML methods are effective there. The results were fairly good, with detection accuracies of up to 90 percent. Some developers even started integrating ML into sandboxes, which has the potential to deal with vulnerabilities in applications’ online services as well.
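As a rough illustration of permission-based detection, the sketch below trains a small Bernoulli naive Bayes classifier on the sets of permissions that apps request. The choice of naive Bayes is an assumption made for illustration, since the article does not say which classifier the permission studies used. The training samples and labels are invented toy data, and the functions `train` and `predict` are hypothetical helpers.

```python
import math

# Invented toy data: the permissions each app requests, plus a label.
TRAINING = [
    ({"SEND_SMS", "READ_CONTACTS", "INTERNET"}, "malware"),
    ({"SEND_SMS", "RECEIVE_BOOT_COMPLETED", "INTERNET"}, "malware"),
    ({"READ_CONTACTS", "SEND_SMS"}, "malware"),
    ({"INTERNET", "CAMERA"}, "benign"),
    ({"INTERNET", "ACCESS_FINE_LOCATION"}, "benign"),
    ({"CAMERA"}, "benign"),
]


def train(samples):
    """Estimate class priors and per-permission probabilities."""
    vocab = sorted(set().union(*(perms for perms, _ in samples)))
    labels = sorted({label for _, label in samples})
    model = {"vocab": vocab, "prior": {}, "prob": {}}
    for label in labels:
        group = [perms for perms, lab in samples if lab == label]
        model["prior"][label] = len(group) / len(samples)
        # Laplace smoothing keeps unseen permissions from zeroing out a class.
        model["prob"][label] = {
            p: (sum(p in perms for perms in group) + 1) / (len(group) + 2)
            for p in vocab
        }
    return model


def predict(model, perms):
    """Return the label with the highest log posterior score."""
    scores = {}
    for label, prior in model["prior"].items():
        score = math.log(prior)
        for p in model["vocab"]:
            prob = model["prob"][label][p]
            score += math.log(prob if p in perms else 1 - prob)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(predict(model, {"SEND_SMS", "INTERNET"}))  # prints malware
print(predict(model, {"CAMERA", "INTERNET"}))    # prints benign
```

With only six samples, the output says nothing about real-world accuracy. The sketch only shows how requested permissions can act as binary features.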

Types Of Malware Detection

Malware detection in computers is generally divided into two types: static analysis and dynamic analysis. The same applies to Android OS. Static analysis examines the functionality of an application or file without executing it, whereas dynamic analysis examines the file by running it on a computer (or in a sandbox tool) to investigate the behaviour of the malware in depth.
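A very small example of the static side is sketched below. An APK is a ZIP archive, so its contents can be listed without running any of its code. The function `static_features` and the three features it returns are hypothetical choices for illustration, and the archive is a stand-in built in memory rather than a real application.

```python
import io
import zipfile


def static_features(apk_bytes):
    """Extract simple features from an APK archive without running it."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        names = apk.namelist()
    return {
        # Number of compiled bytecode files in the archive.
        "dex_files": sum(n.endswith(".dex") for n in names),
        # Number of bundled native libraries.
        "native_libs": sum(
            n.startswith("lib/") and n.endswith(".so") for n in names
        ),
        "has_manifest": "AndroidManifest.xml" in names,
    }


# Build a stand-in archive in memory; a real APK would be read from disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as fake_apk:
    fake_apk.writestr("AndroidManifest.xml", b"placeholder")
    fake_apk.writestr("classes.dex", b"placeholder")
    fake_apk.writestr("lib/arm64-v8a/libnative.so", b"placeholder")

apk_features = static_features(buffer.getvalue())
print(apk_features)
```

Dynamic analysis has no equally short equivalent, because it requires actually running the application in an instrumented device or sandbox and recording what it does.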

Since modern malware in Android is loaded with extra elements such as evasion techniques, recent studies largely focus on dynamic analysis in ML. Static analysis can account for techniques such as packing and encryption and can mitigate some effects of malware, but it falls short because it may miss ‘traces’ that leave applications susceptible to attack.

Botnets

Another form of malware that has become common lately is the botnet. What was once omnipresent in computers has now spread to mobile devices such as smartphones. Networks of devices infected by botnets can act independently and thus pose a peril to the mobile ecosystem. Botnet attacks are generally carried out without the user’s knowledge, and are now being deployed on smartphones powered by Android. Compromised devices can have critical data stolen and can be used to launch distributed denial of service (DDoS) attacks such as HTTP flood and ping flood, regardless of the mobile platform. Although ML detection has emerged to tackle DDoS, it has yet to make a mark in the mobile space.

Conclusion

ML models bring a proactive approach to eliminating malware. But as mentioned earlier, feature extraction still needs significant improvement. For example, part of a malware sample may go undetected if its features do not closely match those seen in training. Therefore, a robust ML framework is recommended to counter modern shapeshifting malware and the damage it can inflict on sensitive information.

Abhishek Sharma

I research and cover the latest happenings in data science. My fervent interests are in the latest technology and humor/comedy (an odd combination!). When I'm not busy reading on these subjects, you'll find me watching movies or playing badminton.