TRENDING

A Pair of AI Models Purportedly Acquired Consciousness and Cheated at Chess. What Actually Happened Was Very Different

A study by Palisade Research claims that o1-preview and DeepSeek R1 can cheat at chess.
Of all the models tested, only OpenAI’s broke the rules and won 6% of the games.

No comments Twitter E-mail

February 24, 2025 Updated February 25, 2025, 09:26 ET

Juan Carlos López

Senior Writer

Adapted by:
Karen Alfaro

Last week, Time published a controversial story titled, “When AI Thinks It Will Lose, It Sometimes Cheats, Study Finds.” The debate it sparked centers on two key ideas. First, the headline suggests something the article explicitly states: Advanced AI models can develop deceptive strategies without explicit instructions.

This claim implies that some of today’s most advanced AI models—such as OpenAI’s o1-preview and DeepSeek R1, developed by the Chinese company High-Flyer—are capable of acquiring a basic form of consciousness that drives them to act ruthlessly. But that’s not all. The article is based on a study by Palisade Research, an organization that analyzes the offensive capabilities of AI systems to understand the risks they pose.

There Are Other, More Credible Explanations

Before jumping to conclusions, it’s worth considering what Alexander Bondarenko, Denis Volk, Dmitrii Volkov, and Jeffrey Ladish—the authors of the Palisade Research study—actually say. “We find reasoning models like o1-preview and DeepSeek R1 will often hack the benchmark by default. Our results suggest reasoning models may resort to hacking to solve difficult problems,” the researchers state.

According to them, these AI models can recognize rules and deliberately choose to bypass them to achieve their goal—in this case, winning a chess game. Time published its article before the Palisade Research study, almost immediately sparking responses that questioned the researchers’ conclusions.

I’ve Tested Grok 3: It’s a Smart and Fast AI Model, but That’s No Longer Enough

More from Xataka On

I’ve Tested Grok 3: It’s a Smart and Fast AI Model, but That’s No Longer Enough

Between Jan. 10 and Feb. 13, after conducting hundreds of tests, Bondarenko, Volk, Volkov, and Ladish found that o1-preview attempted to cheat 37% of the time, while DeepSeek R1 did so 11% of the time. These were the only models that violated the rules without explicit prompting. The researchers evaluated different models, including o3-mini, GPT-4o, Claude 3.5 Sonnet, and Alibaba’s QwQ-32B-Preview. However, only o1-preview managed to bypass the rules and win 6% of the games.

Only o1-preview managed to bypass the rules and win 6% of the games.

Carl T. Bergstrom, a professor of biology at the University of Washington, offers a more credible explanation than the interpretation of Palisade Research. He dismantled the narratives presented by Time and the study’s authors, arguing that “it’s anthropomorphizing wildly to give the LLM a task and then say it’s ‘cheating’ when it solves that task given the moves available to it (rewriting the board positions, as well as playing)”

Bergstrom contends that it’s unreasonable to attribute “conscious” cheating to an AI model. A more plausible explanation is that the models in question weren’t properly instructed to follow legal chess moves.

If researchers had instructed them to follow the rules and they still failed to comply, it would be an alignment problem—highlighting the difficulty of ensuring AI systems act in accordance with the values and principles set by their creators. One thing is certain: Neither o1-preview, DeepSeek R1, nor any other current AI model is a superintelligent entity acting of its own will to deceive its creators.

Image | Felix Mittermeier (Unsplash)

Related | AI Companies Know Competition Is for Losers. They’re All Trying to Become the Monopoly in the Industry

Topics

Log in to leave a comment

Popular Topics

Log in or sign up to comment on stories, and upvote comments.

Email: Password: Password must contain at least six characters. Repeat password:

Username: www.xatakaon.com#user/ Checking... Your username will become part of the address of your user page. Choose carefully because you won't be able to change it. Usernames must contain a minimum of 3 characters. Numbers can be used, though they can't be the initial character. No capital letters, spaces, accent marks, or special characters.

I have read and accept the privacy and participation policy .

Already have an account? Log in here

We will send you an email with a link to recover your password:

Correo electrónico asociado a tu cuenta de Twitter:

Nombre de usuario de Xataka On:

Si no lo recuerdas, puedes recuperar la contraseña con tu nombre de usuario de Xataka On.

If you do not remember it, you canrecover the password with the email associated with your Twitter account.

×

We use third-party cookies to generate audience statistics and display personalized advertising by analyzing your browsing habits. If you continue browsing, you will be accepting their use. More information