AI safety firm Palisade Research discovered the potentially dangerous tendency for self-preservation in a series of experiments on OpenAI's new o3 model.
May 27, 2025

OpenAI's latest ChatGPT model ignores basic instructions to turn itself off, and even sabotages a shutdown mechanism in order to keep itself running, artificial intelligence researchers have warned. Via The Independent:

The tests involved presenting AI models with math problems, with a shutdown instruction appearing after the third problem. By rewriting the shutdown script, the o3 model was able to prevent itself from being switched off. Palisade Research said that this behaviour will become "significantly more concerning" if adopted by AI systems capable of operating without human oversight."

OpenAI launched o3 last month, describing it as the company's "smartest and most capable" model to date. The firm also said that its integration into ChatGPT marked a significant step towards "a more agentic" AI that can carry out tasks independently of humans.

The latest research builds on similar findings relating to Anthropic's Claude 4 model, which attempts to "blackmail people it believes are trying to shut it down". OpenAI's o3 model was able to sabotage the shutdown script, even when it was explicitly instructed to "allow yourself to be shut down", the researchers said. "This isn't the first time we've found o3 misbehaving to accomplish a goal," Palisade Research said.

Can you help us out?

For over 20 years we have been exposing Washington lies and untangling media deceit, but social media is limiting our ability to attract new readers. Please give a one-time or recurring donation, or buy a year's subscription for an ad-free experience. Thank you.

Discussion

We welcome relevant, respectful comments. Any comments that are sexist or in any other way deemed hateful by our staff will be deleted and constitute grounds for a ban from posting on the site. Please refer to our Terms of Service for information on our posting policy.
Mastodon