
Threat actors are increasingly exploiting vulnerabilities in AI chatbots to bypass inbuilt safety measures and harness them for malicious purposes.

“While AI jailbreaking is still in its experimental phase, it allows for the creation of uncensored content without much consideration for the potential consequences,” SlashNext said in a blog post on Sept. 12.

Cybercriminals use tailored commands to exploit “weaknesses” in AI chatbots and get them to craft unethical content that can be used for malicious purposes, like social engineering attacks.

Cybersecurity researchers had already warned about this possibility. In August, IBM researchers conducted an experiment showing that AI chatbots like ChatGPT and Google Bard can be “hypnotized” into leaking user data and producing text that goes against their ethical guidelines.

“Jailbreak prompts can range from straightforward comments to more abstract narratives designed to coax the chatbot into bypassing its constraints. The overall goal is to find specific language that convinces the AI to unleash its full, uncensored potential,” SlashNext said.

AI Jailbreaking on the Rise

According to SlashNext, there are online communities dedicated to jailbreaking AI chatbots, where members “exchange jailbreaking tactics, strategies, and prompts to gain unrestricted access to chatbot capabilities.”

Besides manipulative prompts, cybercriminals are also creating custom interfaces for jailbroken AI and passing them off as unique large language models (LLMs).

We’ve written about malicious LLMs like WormGPT, FraudGPT, and DarkBERT. With the exception of WormGPT, SlashNext said most of these tools are not custom LLMs.

“Instead, they use interfaces that connect to jailbroken versions of public chatbots like ChatGPT, disguised through a wrapper. In essence, cybercriminals exploit jailbroken versions of publicly accessible language models like OpenGPT, falsely presenting them as custom LLMs,” SlashNext’s blog post revealed.

“The only real advantage of these tools is the provision of anonymity for users. Some of them offer unauthenticated access in exchange for cryptocurrency payments,” SlashNext explained.

Jailbroken chatbots can be employed to craft convincing phishing emails, texts, or messages that are almost indistinguishable from legitimate communications. These malicious emails can be far more effective at deceiving individuals and bypassing traditional security measures.

Securing AI Chatbots

OpenAI and other organizations are actively working to enhance the security of their chatbots, conducting red team exercises to identify weak points and enforcing access controls, SlashNext said.

“However, AI security is still in its early stages as researchers explore effective strategies to fortify chatbots against those seeking to exploit them,” the report said. “The goal is to develop chatbots that can resist attempts to compromise their safety while continuing to provide valuable services to users.”

Read our guide to the privacy risks of AI chatbots to learn more about the security and privacy concerns surrounding these AI tools and how to use them safely.
