Next Frontier of Cybersecurity: Guarding against Generative AI

Using LLMs for tasks like analysis, reporting, and rule generation expands the potential scope of attacks and vulnerabilities

Share

Published on July 4, 2023

by Pritam Bordoloi

Listen to this story

ChatGPT, introduced in November last year, experienced a rapid surge in adoption among users and enterprises alike. However, no one can deny the floodgate of risks that this new technology has unleashed on everyone. A recent report by cybersecurity firm Group-IB revealed that over 100,000 ChatGPT accounts have been compromised and their data is being illicitly traded on the dark web, with India alone accounting for 12,632 stolen credentials.

( Source: Group-IB)

Similarly, in March, a bug in an open-source library gave some ChatGPT users the ability to see titles from another active user’s chat history. Companies such as Google, Samsung and Apple have also forbidden their employees from using any generative AI-powered bots.

Venkatesh Sundar, founder and president, Americas, at Indusface, believes there is rapid adoption of generative AI without too much consideration of risk. “In most cases, the adopted LLM models are built by someone else, so they carry the security risk of a compromised LLM affecting all apps using the LLM model. This is very similar to the risk of using open source / third-party code and plug-ins,” he told AIM.

Generative AI API risk

API risks aren’t not new. As anticipated by Gartner in their 2019 report, API hacks have indeed become a prevalent form of cyberattack. According to a survey conducted by Salt Security, a leading API security company, among 200 enterprise security officials, a staggering 91% of companies reported experiencing API-related security issues in the past year.

Now, as more and more enterprises are looking to leverage LLM APIs, the biggest concern remains the leaking or exposure of sensitive data from these tools. While certain applications of natural language interfaces, such as search functionality, may pose lower security risks, the use of LLMs for tasks like analysis, reporting, and rule generation expands the potential scope of attacks and vulnerabilities.

There is a risk of data breaches or unauthorised access to this information, potentially resulting in privacy violations and data leaks. “While there’s so much attention being placed on the use and availability of generative AI, ransomware groups continue to wreak havoc and find success at breaching organisations around the world,” Satnam Narang, senior staff research engineer at Tenable, told AIM.

Adding further to the discussion, Sundar stresses that organisations should anticipate attacks or attempts to corrupt the data set. Hackers may attempt to inject malicious or biassed data into the dataset, which can influence the LLM’s responses and outputs. “Important business decisions may rely on this data, without good understanding of how the AI model works or the validity of data points used in the process,” Kiran Vangaveti, founder & CEO, BluSapphire Cyber Systems told AIM.

Earlier this year, researchers from Saarland University have presented a paper on prompt engineering attacks in chatbots. They discovered a method to inject prompts indirectly, using ‘application-integrated LLMs’ like Bing Chat and GitHub Copilot, expanding the attack surface for hackers. Injected prompts can collect user information and enable social engineering attacks.

Is Building GenAI capabilities in-house the key?

OpenAI and other organisations recognise the importance of addressing API risks and have implemented precautionary measures. OpenAI, for instance, has undergone third-party security audits, maintains SOC 2 Type 2 compliance, and conducts annual penetration testing to identify and address potential security vulnerabilities before they can be exploited by malicious individuals.

However, Sundar believes security is complex and securing natural language queries is way more complex. “While controls like access are being built, many attacks leverage different prompts or series of prompts to leak information. For example, when ChatGPT blocked the prompt to generate malware, people have found a way around it and now are asking ChatGPT to give a script for penetration testing,” he said.

Vangaveti concurs that understanding security frameworks required to protect against malicious use or protect data is a complex task. However, as this area matures, more frameworks or best practices will evolve. Furthermore, enterprises today are also exploring many open-source LLMs as alternatives. Open source LLMs can potentially be more vulnerable to cyber attacks due to their availability and open nature. Since the source code and architecture are openly accessible, it becomes easier for attackers to identify and exploit vulnerabilities.

Nonetheless, to tackle this, Narang believes the solution could be building generative AI capabilities in-house. “As long as there is a reliance upon outside tooling to provide the generative AI functionality, there will always be some inherent risk involved in entrusting data to a third-party, unless there are plans to develop and maintain one in-house”. Interestingly, Samsung announced that they will be building their own generative AI capabilities after sensitive data were accidentally shared with ChatGPT by some of its employees.

ChatGPT is writing malware

ChatGPT’s coding capabilities, which include writing code and fixing bugs, have unfortunately been exploited by malicious actors to develop malware. “Attackers are able to profile targets relatively quickly and create attack code on the fly with little expertise. They are able to build custom malware rapidly,” Vangaveti said.

Some experts believe ChatGPT and DALL-E pose even a greater risk to non-API users. “Information stealing malware, such as Raccoon, Vidar and Redline are capable of stealing sensitive information stored in web browsers, which includes user credentials (username/email and password), session cookies and browser history,” Narang said.

Besides, researchers from threat detection company HYAS have demonstrated a proof of concept (PoC) called BlackMamba, and demonstrated how LLM APIs can be used in malware in order to evade detection. “To demonstrate what AI-based malware is capable of, we have built a simple PoC exploiting a large language model to synthesise polymorphic keylogger functionality on-the-fly, dynamically modifying the benign code at runtime — all without any command-and-control infrastructure to deliver or verify the malicious keylogger functionality,” they said in a blog post.

Hence, without doubt, the widespread adoption of generative AI has raised concerns about security risks, including API vulnerabilities and data exposure, so organisations must implement robust security measures and remain vigilant to mitigate these risks effectively.

Access all our open Survey & Awards Nomination forms in one place

Pritam Bordoloi

I have a keen interest in creative writing and artificial intelligence. As a journalist, I deep dive into the world of technology and analyse how it’s restructuring business models and reshaping society.