(Source-TechGenies)
Prompt Hacking: A New cybersecurity threats
Cybersecurity analysts from Radware have pinpointed a concerning trend in their latest report: the rise of prompt hacking. With AI becoming more accessible, both well-intentioned users and malicious actors are exploiting AI models in unintended ways. Prompt hacking involves injecting malicious instructions into AI models, bypassing their safeguards. This method, highlighted as the number one security vulnerability for Large Language Models (LLMs), poses significant risks. Examples include the “Do Anything Now” (DAN) jailbreak for ChatGPT and instances where users manipulated models to ignore safeguards. Despite efforts to enhance guardrails, researchers have demonstrated vulnerabilities even in cutting-edge LLMs like Google’s Gemini. This arms race between enhancing AI security and exploiting vulnerabilities underscores the evolving nature of cybersecurity threats.
In March 2024, researchers from AI security firm HiddenLayer found they could bypass the guardrails built into Google’s Gemini, showing that even the most novel LLMs were still vulnerable to prompt hacking. Another paper published in March reported that University of Maryland researchers oversaw 600,000 adversarial prompts deployed on the state-of-the-art LLMs ChatGPT, GPT-3, and Flan-T5 XXL.
The results provided evidence that current LLMs can still be manipulated through prompt hacking, and mitigating such attacks with prompt-based defenses could “prove to be an impossible problem.”
Private GPT Models: A Playground for Malicious Actors
Another alarming trend identified by Radware is the proliferation of private GPT models without guardrails. These models, often found on platforms like GitHub, lack the security measures of commercial providers. Dubbed “WormGPT,” “FraudGPT,” and “DarkBard,” these models enable amateur cybercriminals to orchestrate sophisticated attacks such as phishing and malware creation. Despite initial observations that such models were in their infancy, recent discussions at the World Economic Forum underscore their continued relevance. As private GPT models evolve, the threat landscape becomes more complex, requiring innovative approaches to counteract their potential misuse.
Back in August 2023, Rakesh Krishnan, a senior threat analyst at Netenrich, told Wired that FraudGPT only appeared to have a few subscribers and that “all these projects are in their infancy.” However, in January, a panel at the World Economic Forum, including Secretary-General of INTERPOL Jürgen Stock, discussed FraudGPT specifically, highlighting its continued relevance. Stock said, “Fraud is entering a new dimension with all the devices the internet provides.”
Zero-Day Exploits and Credible Deepfakes: Challenges Ahead
Radware’s report warns of a surge in zero-day exploits fueled by open-source AI tools, enabling attackers to automate scanning and exploiting vulnerabilities. Moreover, the emergence of highly credible deepfakes poses significant risks, with state-of-the-art AI systems enabling the creation of fake content with minimal effort. These deepfakes, ranging from video impersonation scams to misinformation campaigns, undermine trust and fuel criminal activities. Despite efforts by ethical providers to implement guardrails, the rapid advancement of AI technology necessitates proactive measures to mitigate emerging cybersecurity threats.
According to the Radware report, another emerging AI-related threat comes in the form of “highly credible scams and deepfakes.” The authors said that state-of-the-art generative AI systems, like Google’s Gemini, could allow bad actors to create fake content “with just a few keystrokes.”
Research by Onfido revealed that the number of deepfake fraud attempts increased by 3,000% in 2023, with cheap face-swapping apps proving the most popular tool. One of the most high-profile cases from this year is when a finance worker transferred HK$200 million (£20 million) to a scammer after they posed as senior officers at their company in video conference calls.
The authors of the Radware report wrote, “Ethical providers will ensure guardrails are put in place to limit abuse, but it is only a matter of time before similar systems make their way into the public domain and malicious actors transform them into real productivity engines. This will allow criminals to run fully automated large-scale spear-phishing and misinformation campaigns.”
In conclusion, as AI continues to permeate various aspects of cybersecurity, stakeholders must remain vigilant and adaptive. Addressing the challenges posed by prompt hacking, private GPT models, zero-day exploits, and deepfakes requires collaboration between industry experts, policymakers, and AI developers. By staying ahead of evolving cybersecurity threats and implementing robust security measures, organizations can safeguard against malicious exploitation of AI technology.