Monday, September 29, 2025


DIGITAL LIFE


Is violent AI-human conflict inevitable?

Are you worried that artificial intelligence and humans will go to war? AI experts are. In 2023, a group of elite thinkers signed onto the Center for AI Safety's statement that "Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war."

In a survey published in 2024, between 38% and 51% of top-tier AI researchers assigned at least a 10% probability to advanced AI leading to outcomes as bad as human extinction.

The worry is not about the Large Language Models (LLMs) of today, which are essentially huge autocomplete machines, but about Artificial General Intelligence (AGI)—still-hypothetical long-term planning agents that can substitute for human labor across a wide range of society's economic systems.

On their own, they could design systems, deploy a wide range of resources and plan towards complex goals. Such AIs could be enormously useful in the human world, performing and optimizing power generation, resource extraction, industrial and agricultural production, and many other functions humans need to thrive. We hope these AGIs will be friendly to mankind and Earth, but there is no guarantee.

Advanced AIs could develop goals that seem strange to us but that, by their own reasoning, are beneficial in ways we do not understand.

Depending on who is developing the AI (cough, highly technical engineers, cough), it may take little notice of our cultural, historical and shared human values. It might recursively improve itself, develop goals we don't understand, and extort humans into assisting it.

With such thoughts in mind, Simon Goldstein of the University of Hong Kong analyzed the possibility that AIs and humans will enter into violent conflict and pose a catastrophic risk to humanity. His paper is published in the journal AI & Society.

It is likely that the goals of AI—usually meaning, in this article, AGI—will conflict with the goals of humans, because developers will give AIs goals precisely so that they can outperform humans. For example, AIs already outperform humans at the goals of winning chess and the game of Go. And existing AI systems already often fail to "align" with human goals and display unintended behaviors.

The more power, autonomy and resources available to AI, the more it can do for humans. In any case, modern AIs are not given goals explicitly—they learn them indirectly, from existing materials, and their inner workings are essentially a black box.

Behavior learned in an AI's training environment may not generalize when the AI is released into a new one. Could an AI looking at the state of the world's environment decide that humans have been a net negative for the health and lives of nonhuman species, and conclude that humans must be eliminated?

Goldstein's paper focuses on three features of AGIs: "(i) they have conflicting goals with humanity, (ii) they can engage in strategic reasoning, and (iii) they have a human-level degree of power."

As AIs improve in capabilities, Goldstein expects that governments will at some point seize control of the most powerful of them, nationalizing the likes of OpenAI in the US and Alibaba in China.

For example, he told Phys.org that "if OpenAI's models were responsible for 50% of the US labor market, I would expect the US government to nationalize OpenAI, and distribute their monopoly rents as UBI." (Monopoly rents are the excess profits a monopolist earns as a result of its monopoly position. UBI stands for Universal Basic Income, a regular, unconditional payment citizens receive from their government, likely to be needed as AIs and robots increasingly put people out of work.)

As AIs become more capable and take on more tasks, perhaps in a nonlinear fashion, their grasp of vital infrastructures could put them in a bargaining position. Nice stock market you have there. It would be a shame if something happened to it.

Their capabilities could be replicated and distributed in the cloud and in real machines, meaning pulling the plug on a rogue AI would not end the problem. Different AIs might even cooperate, with humans unaware of collaborations.

Goldstein writes that as AGIs advance, humans may not even know their capabilities or goals, in the way they usually know their enemy's objectives in real-world, non-AI combat. An AI might not respect national boundaries, geography, human cities or prisoners of war the way humans are accustomed to, limits Goldstein calls "focal points."

It might seek, encourage or force civil wars. Or it might not seek to possess any property or people at all and "fight" in new ways, akin to how chess engines occasionally make odd moves that turn out to be winners.

A combative AI might never agree to a truce. It may or may not form a government, and it could, say, coerce funding for a police force to provide its own security.

Will AIs and humans have conflicting goals? Will the conflict become violent? To analyze these questions, Goldstein uses a "bargaining model of war" first introduced by political scientist James Fearon in 1995.

The model focuses on causes of war that are "structural," relating to the power of the two parties at the national level, rather than "individual," pertaining to the goals of particular leaders. For human-human conflicts, this model suggests a bias toward peace.
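To see the model's logic concretely, here is a minimal sketch in notation of our own choosing, not Goldstein's or Fearon's exact formulation. Suppose side A would win a war with probability p, the stakes are normalized to 1, and fighting costs the two sides c_A and c_B, both greater than zero. The expected value of war is then

p - c_A for side A and (1 - p) - c_B for side B,

so any peaceful division x of the stakes with

p - c_A ≤ x ≤ p + c_B

leaves both sides at least as well off as fighting. Because the costs are positive, this bargaining range is never empty, which is why the model predicts peace whenever both sides agree on p and can credibly stick to a deal.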

Applying the model to AI-human conflict, Goldstein argues that "consideration of this model does not give strong reasons to expect peace between AI and humanity. The prediction of peace only applies when the parties agree on chances of victory and can credibly commit to bargains."

In such a conflict, Goldstein identifies two primary obstacles to peace: information failures and commitment problems.

AI capabilities are very hard to measure. There will very likely be an asymmetry in the information available to each side of an AI-human conflict, and each side might analyze information in different ways and so disagree about the chance of victory.
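To make the information-failure point concrete, continuing the hypothetical notation from the sketch above: if humans put their own chance of victory at q_H and the AI puts its own chance at q_AI, a bargain that both sides prefer to war exists only when

q_H + q_AI ≤ 1 + c_H + c_AI.

When the two estimates are optimistic enough that they sum to more than 1 plus the combined cost of fighting, the bargaining range disappears and fighting becomes rational for at least one side.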

Resolving this information problem is not nearly as "simple" as breaking the Enigma cipher, as Alan Turing and colleagues did in World War II. Moreover, if an AI's capabilities are growing through its own efforts or with the assistance of people, emergent capabilities might surprise and even overwhelm humans.

"The problem is that there is a substantial risk that the usual causes of peace between conflicting parties will be absent from AI/human conflict."

Goldstein considers many facets that could arise in AI-human conflicts and concludes that AI and humanity are relatively likely to go to war "... once AGIs control a large fraction of the world's economic resources."

More concretely, he said, "the point at which conflict is rational will be when their [AGI's] control over resources is large enough that their chance of success in conflict outweighs our advantages in designing AGIs to be shut down."

Fiction? Geoffrey Hinton, the British-Canadian computer scientist who won the 2024 Nobel Prize in physics for his work on AI, said last year there was a "10% to 20%" chance that AI would lead to human extinction within the next three decades. Logging out won't help.


Written for you by our author David Appell (linkedin.com/in/david-appell-7a319728), edited by Sadie Harley, and fact-checked and reviewed by Robert Egan—this article is the result of careful human work.

