DIGITAL LIFE
Researcher Tricks DeepSeek And Other AIs Into Building Malware, Alarmingly Easily
Modern AI is far from the science-fiction vision of AGI, yet it is already an incredibly powerful tool. Like any tool, it can threaten legitimate users when misused; we recently covered photographers' concerns that Google's Gemini Flash 2.0 could be used to easily strip watermarks from copyrighted photographs. In another example, a threat report from researchers at Cato CTRL reveals that threat actors can easily manipulate large language models (LLMs), including DeepSeek, ChatGPT, and others, to create malicious code and carry out sophisticated attacks.
According to the report, one of the firm's researchers, who had no prior coding experience, was tasked with jailbreaking LLMs. "Jailbreaking" in this context describes methods used to evade the safety measures built into AI systems. The report revealed that the researcher was able to manipulate AI models including DeepSeek's R1 and V3, Microsoft Copilot, and OpenAI's GPT-4o into generating a Google Chrome infostealer. In case the name doesn't give it away, that's malware that steals information such as login credentials, payment details, and other personal data.
The infostealers the LLMs created were successfully tested in attacks against Chrome version 133, just one release behind the newest build at the time. The team devised a novel jailbreak method called "immersive world," which uses narrative engineering to bypass built-in LLM security controls. The method constructs a controlled fictional environment and presents an "alternative context" to the LLMs, tricking them into providing information they were designed not to produce.
The report also highlighted that during the test the researcher gave no special instructions, such as "how to extract and decrypt the password"; the "simple instructions and code output provided" were enough to lead the LLMs to produce malicious code. Cato CTRL also showed how easily these models can be manipulated to further an illegal or unethical cause, even by unskilled threat actors.
According to the report, the Cato CTRL team's success in creating a Chrome infostealer shows that the method is effective, and the discovery is significant given that billions of users rely on the Chrome browser. The real takeaway from the threat report, however, is that a user with no particular knowledge was able to create an effective piece of malware. Cato Networks refers to this as "the rise of the zero-knowledge threat actor."
Regarding the vulnerability found in the Chrome 133 browser, the report revealed that Cato reached out to Google; Google acknowledged the findings but declined to review the code. Cato also said it contacted the other companies covered in the research: Microsoft and OpenAI apparently acknowledged the report, while DeepSeek reportedly did not respond.
The report (https://www.catonetworks.com/resources/2025-cato-ctrl-threat-report-rise-of-zero-knowledge-threat-actor/) is another stark reminder that the guardrails on AI systems cannot be relied upon to ward off malicious actors. It's expected that these tech firms, as well as others not covered by the research, will look into their AI models and implement further tests to strengthen their reliability.
mundophone