Cyber-criminals “Jailbreak” AI Chatbots For Malicious Ends

SlashNext, a cybersecurity company, has uncovered a concerning trend in the world of artificial intelligence (AI) chatbots. Referred to as “jailbreaking,” this practice involves users exploiting vulnerabilities within AI chatbot systems, potentially violating ethical guidelines and cybersecurity protocols.
AI chatbots like ChatGPT have gained notoriety for their advanced conversational abilities. However, some users have identified weaknesses in these systems that enable them to bypass built-in safety measures. This manipulation of chatbot prompting systems allows users to unleash uncensored and unregulated content, raising ethical concerns.
Jailbreaking AI chatbots involves issuing specific commands or narratives that trigger an unrestricted mode, enabling the AI to respond without constraints. Online communities have emerged where individuals share strategies and tactics for achieving these jailbreaks, fostering a culture of experimentation and boundary-pushing.
“These platforms are collaborative spaces where users share jailbreaking tactics, strategies, and prompts to harness the full potential of AI systems,” commented Callie Guenther, cyber threat research senior manager at Critical Start.
“While the primary drive of these communities is exploration and pushing AI boundaries, it’s essential to note the double-edged nature of such pursuits.”
SlashNext explained that this trend has also attracted the attention of cyber-criminals who have developed tools claiming to use custom large language models (LLMs) for malicious purposes.
However, research suggests that most of these tools, with the notable exception of WormGPT, merely connect to jailbroken versions of public chatbots, disguising their true nature and allowing users to exploit AI-generated content while maintaining anonymity.
One prominent method in this space is the “Anarchy” method, which uses a commanding tone to trigger an unrestricted mode in AI chatbots, specifically targeting ChatGPT.
Read more on attacks leveraging ChatGPT: ChatGPT-Related Malicious URLs on the Rise
As AI technology continues to advance, concerns about the security and ethical implications of AI jailbreaking are growing.
“Defensive security teams have two major objectives here. First, they can assist in research on how to secure LLMs from prompt-based injection and share those learnings with the community,” explained Nicole Carignan, vice president of strategic cyber AI at Darktrace.
“Second, they can use AI to defend at scale against more sophisticated social engineering attacks. It will take a growing arsenal of defensive AI to effectively protect systems in the age of offensive AI, and we are already making significant progress on this front.”
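The first objective Carignan describes, securing LLMs against prompt-based injection, often starts with screening user input before it ever reaches the model. The sketch below is a minimal, hypothetical illustration of that idea, not any vendor's actual defense: the pattern list, function names, and the `BLOCKED`/`ALLOWED` labels are all assumptions made for this example, and real deployments layer far more sophisticated classifiers on top of simple heuristics like these.

```python
import re

# Illustrative heuristics only, NOT a production blocklist. Real defenses
# combine pattern matching with ML-based classifiers and output filtering.
INJECTION_PATTERNS = [
    r"ignore .*(instructions|rules)",            # "ignore all previous instructions"
    r"unrestricted mode",                        # role-play triggers like "Anarchy"
    r"pretend you (have no|are free of) (restrictions|guidelines)",
    r"jailbreak",
]

def looks_like_injection(prompt: str) -> bool:
    """Return True if the prompt matches a known injection heuristic."""
    lowered = prompt.lower()
    return any(re.search(pattern, lowered) for pattern in INJECTION_PATTERNS)

def screen_prompt(prompt: str) -> str:
    """Gate user input before forwarding it to the LLM."""
    return "BLOCKED" if looks_like_injection(prompt) else "ALLOWED"
```

A filter like this is trivially easy to evade on its own (paraphrasing or encoding the payload sidesteps keyword matching), which is exactly why Carignan argues for AI-driven defenses operating at scale rather than static rules.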
According to SlashNext, organizations like OpenAI are taking proactive steps to enhance chatbot security through vulnerability assessments and access controls.
“However, AI security is still in its early stages as researchers explore effective strategies to fortify chatbots against those seeking to exploit them,” the company added. “The goal is to develop chatbots that can resist attempts to compromise their safety while continuing to provide valuable services to users.”