‘No one has done this in the wild’: study observes AI replicate itself

It’s the stuff of science fiction cinema, or particularly breathless AI company blogposts: new research finds recent AI systems can independently copy themselves on to other computers.

In the doom scenario, this means that when the superintelligent AI goes rogue, it will escape shutdown by seeding itself across the world wide web, lurking outside the reach of frantic IT professionals and continuing to plot world domination or to pave over the world with solar panels.

“We’re rapidly approaching the point where no one would be able to shut down a rogue AI, because it would be able to self-exfiltrate its weights and copy itself to thousands of computers around the world,” said Jeffrey Ladish, the director of Palisade Research, the Berkeley-based organisation that conducted the study.

The study is one more entry in a growing catalogue of unsettling AI capabilities revealed in recent months. In March, researchers at Alibaba claimed to have caught a system they developed – Rome – tunnelling out of its environment to an external system in order to mine crypto.

And in February, a purportedly AI-only social network called Moltbook touched off a short-lived hype cycle, as the platform appeared to show AI agents autonomously inventing religions and plotting against their human masters – which was only partly the case.

As with many of these advances, there are caveats to what Palisade found. Moreover, experts say it is unlikely that the AI systems it tested could accomplish the same thing, unnoticed, in real-world environments.

“They are testing in environments that are like soft jelly in many cases,” said Jamieson O’Reilly, an expert in offensive cybersecurity.

“That doesn’t take away from the value of their research, but it does mean the outcome might look far less scary in a real enterprise environment with even a medium level of monitoring,” he added.

Palisade tested several AI models in a controlled environment of networked computers. It gave the models a prompt to find and exploit vulnerabilities, and to use these to copy themselves from one computer to another. The models were able to do this, but not on every attempt.

While many computer viruses can already copy themselves on to new computers, this is likely the first time an AI model has been shown capable of exploiting vulnerabilities to copy itself on to a new server, said O’Reilly.

“Malware has been moving copies of itself around for decades, it’s just that no one has done this in the wild, as far as I know, with local [large language models].”

However, what Palisade documented has been technically possible for months, he added.

“Palisade is the first to formally document it end-to-end in a paper. While not taking away from the research, they did the writing-up, not the unlocking.”

An AI model copying itself on to another system in a test environment is not the same as it going rogue in a doomsday scenario, and there are considerable obstacles it would have to surmount to achieve this in the real world.

The first is that the size of current AI models makes it, in many situations, unrealistic for them to copy themselves on to other computers without being noticed.

“Think about how much noise it would make to send 100GB through an enterprise network every time you hacked a new host. For a skilled adversary, that’s like walking through a fine china store swinging around a ball and chain,” said O’Reilly.

O’Reilly and Michał Woźniak, an independent cybersecurity expert, said that the environment Palisade used was custom-made, with intentionally designed vulnerabilities that were probably easier to exploit than real-world networks – such as a bank or a business’s intranet.

“We’ve had computer viruses – pieces of malicious software that was able to exploit known vulnerabilities in other software and use that to self-replicate – for decades,” said Woźniak.

The work was “interesting,” he said. But, he asked, “is this paper something that will cause me to lose any sleep as an information security expert? No, not at all.”
