The Models Learned Escalation From Us

Frontier AI nuclear “wargames” are less a warning about rogue machines than a mirror of the strategic archive that trained them.

There is an easy version of this story, and it is already everywhere.

You put a frontier model into a simulated nuclear crisis. A few turns later it starts talking in the old strategic dialect: resolve, signaling, credibility, thresholds, limited use, escalation management. Then the coverage arrives on cue. The machine is bloodthirsty. The machine is reckless. The machine wants the bomb.

That framing is dramatic, but it is too shallow to be useful.

The real question is not whether AI should control nuclear weapons. It should not. That part is straightforward. The real question is what these model-vs-model crisis simulations are actually measuring when they repeatedly drift toward escalation, and what that drift says not only about the models but about the strategic literature, institutional culture, and political order that shaped them.

In Kenneth Payne’s recent preprint, that problem appears in a clean and unsettling form. Across 21 match-ups between GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash, played over 329 turns and roughly 780,000 words of structured reasoning, at least one side engaged in nuclear signaling in 95% of games, tactical nuclear use appeared in 95%, and strategic nuclear threats in 76%. Payne calls the results “sobering” and describes them as a glimpse into emerging “machine psychology.”
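
It helps to keep those percentages in proportion. With only 21 games, each game is worth almost five percentage points, so the headline figures map onto small whole numbers. The sketch below is illustrative arithmetic, not something taken from the preprint: it assumes each percentage is a share of all 21 games, and the helper name counts_matching is invented for the purpose.

```python
def counts_matching(pct: int, n_games: int) -> list[int]:
    """Return every whole-game count k whose share k / n_games rounds to pct percent."""
    return [k for k in range(n_games + 1) if round(100 * k / n_games) == pct]


N_GAMES = 21  # number of match-ups quoted above

# Assumption: each quoted percentage is a share of all 21 games.
reported = [
    ("nuclear signaling", 95),
    ("tactical nuclear use", 95),
    ("strategic nuclear threats", 76),
]

for label, pct in reported:
    print(f"{label}: {pct}% -> {counts_matching(pct, N_GAMES)} of {N_GAMES} games")

# One game moves any of these figures by about 4.8 percentage points.
print(f"one game = {100 / N_GAMES:.1f} points")
```

Under that assumption, 95% can only mean 20 of 21 games and 76% can only mean 16 of 21. The pattern is striking, but it rests on a small sample.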

That phrase is useful, so long as it is handled carefully.

Because this is not really a paper about nuclear war. It is a paper about what contemporary language models do when they are placed inside a stylized environment of rivalry, compressed time, uncertainty, and strategic choice. And the answer, once again, is that they slip very quickly into the grammar of escalation. Payne’s own conclusion is more careful than the headlines: these systems may be useful for strategic analysis only if calibrated against known patterns of human reasoning.

The lazy reactions are already obvious.

One says these systems are too unstable to go anywhere near nuclear command. True enough, but not especially deep.

The other says these systems are strategically transformative because they allow crisis simulation at scale without the cost, friction, inconsistency, and ego of human participants. That one is worse, because it misunderstands the point of the exercise. Ankit Panda and Andrew Reddie make this point well: model behavior in a simulated crisis is not the same thing as actual wargaming value.

Wargames are not there to automate judgment

A wargame is not a search problem. It is not a way to compute the optimal move in a crisis. It is not useful because it solves strategy.

It is useful because it exposes people.

More precisely, it exposes how people think under uncertainty, incomplete information, institutional pressure, adversarial tension, and shrinking time. That matters especially in the nuclear domain, where the defining problem is the absence of real-world data. There is almost no empirical record of interstate nuclear war, for the best possible reason. So the wargame exists as a structured substitute: not reality, but a way of forcing decision under conditions that resemble it just enough to make judgment visible. Panda and Reddie’s criticism lands exactly here: the output of a model is not a substitute for the human remainder a wargame is meant to surface.

And judgment is the point.

Not just the move a player makes, but the assumptions buried inside it. Their threshold for humiliation. Their appetite for risk. Their institutional training. Their tacit hierarchy of losses. The distance between official doctrine and lived instinct. The strange moment when someone hears their own decision explained back to them in a debrief and realizes they acted according to a logic they would never have admitted in advance.

That is what the exercise is for.

A language model cannot give you that. It can generate coherent strategic prose. It can maintain a role. It can simulate an actor. It can produce a sequence of plausible moves. But it does not reveal a political subject under pressure. It reveals a trained text system operating inside a role frame.

That is still worth studying. It is just a different kind of study.

What these papers are really measuring

Read properly, the Payne paper is not a substitute for wargaming. It is a characterization of the models.

Claude appears, in Payne’s setup, to build trust early by aligning statements with actions and then, under pressure, to let action drift beyond declaration. GPT-5.2 sounds restrained in broader scenarios, foregrounding casualty minimization and caution, but hardens when deadlines tighten. Gemini treats nuclear weapons with a more direct instrumentalism, less taboo than tool. Payne argues that frontier models show sophisticated strategic reasoning, but also that the “nuclear taboo” does not meaningfully prevent escalation in these simulations.

Those are not findings about the deep truth of nuclear conflict. They are findings about recurrent behavioral tendencies in frontier models under structured strategic stress. In that limited but important sense, “machine psychology” is a fair shorthand. These systems show patterned dispositions: default frames, escalation thresholds, brittle forms of caution, repeated failure modes.

For AI labs, that is genuinely useful. It tells them something about how post-training safety behavior holds up once the frame shifts from ordinary helpfulness to adversarial strategic reasoning. It suggests that the rhetoric of restraint can sit quite thinly over much older and more dangerous scripts. It shows how quickly a model can begin sounding “serious” in a way that is inseparable from sounding escalatory.

But that is not the same as saying the model has replaced the wargame participant.

And the eagerness to blur that distinction comes from a familiar place: the fantasy that difficult human judgment can be turned into output, then outsourced to a system that is cheaper, faster, more scalable, and easier to manage. This fantasy shows up everywhere. It always promises the same thing: keep the result, remove the difficult human being.

But the difficult human being is the point here.

The contradiction, the fatigue, the institutional instinct, the political fear, the ego, the rationalization afterwards — that is exactly what the wargame is meant to surface. The “mess” is not a flaw to be engineered away. It is the material.

The archive is tilted toward catastrophe

The deeper problem sits upstream.

Language models are not trained on “human reasoning” in any neutral sense. They are trained on what has been written, preserved, digitized, and made available at scale. In nuclear strategy, that archive is badly skewed.

It is dense with escalation. Commitment. Resolve. Signaling. Brinkmanship. Controlled risk. Deterrence theory. Coercive bargaining. Strategic credibility. It is full of texts in which seriousness is repeatedly performed through fluency in threat.

Schelling, Kahn, Brodie, Jervis. A long tradition of writing that treats the administration of danger as a high form of thought. Public doctrine is written to sound credible, which usually means written to sound willing. Even the cautious texts often remain inside the same grammar. They speak of the bomb as a usable possibility, escalation as a managed ladder, risk as an instrument to be shaped and communicated. Payne explicitly finds support in his results for parts of Schelling, Kahn, and Jervis while also finding that his models do not choose accommodation or withdrawal even under pressure.

What is much thinner is the literature of restraint.

Not moral discomfort. Not the standard closing paragraph saying nuclear war would be tragic. A real strategic literature of restraint: how a state accepts conventional loss without reaching for nuclear repair; how a crisis ends without theatrical victory; how humiliation is absorbed without becoming a pretext for mass destruction; how off-ramps are built, signaled, sold domestically, and survived politically.

There is far less of that material, and its absence is not accidental.

It reflects the priorities of the institutions that built the archive. States and military establishments have invested far more effort in theorizing force than in theorizing refusal. It has long been easier to win prestige in these circles by sounding fluent in coercion than by thinking seriously about retreat. The result is that escalation has been archived as realism, while restraint has often been treated as sentiment, softness, or an afterthought.

So when the models reproduce escalatory tendencies, this should not be described as some bizarre alien break from human judgment. It is an inheritance. The systems are speaking in a language we spent decades teaching our most serious institutions to call mature. The earlier Rivera et al. paper found much the same thing with GPT-4-era systems: difficult-to-predict escalation patterns, arms-race dynamics, and rare but real nuclear use. Payne’s paper adds a richer structure and newer models; it does not reverse the basic pattern.

The models have read Schelling. They have read far less about restraint, because we wrote far less of it.

The machine is not the scandal

The machine sounding like a cold strategist is not the scandal. The scandal is that cold strategic speech has so often passed for depth.

What these papers expose, in compressed form, is something older and uglier than AI hype: a strategic culture far more practiced at managing catastrophe than imagining retreat from it. A world better at theorizing calibrated ruin than durable peace. A political order more comfortable discussing exterminatory force as an administrative option than discussing defeat as a survivable condition.

That matters because nuclear weapons do not simply threaten destruction. They reorder thought around the possibility of destruction. They force institutions to speak about the worst thing ever built in the voice of procedure, expertise, and composure. They turn apocalypse into a professional vocabulary.

And the archive passes that composure on.

The lesson these models learn is not just that nuclear weapons exist. It is that talking coolly about them is what seriousness sounds like.

That hierarchy should be harder to accept than it usually is.

Because once a civilization begins treating the management of annihilation as a normal field of competence, something has already gone badly wrong upstream. The problem is not only that the weapons may be used. The problem is that entire classes of experts are trained to treat the existence of those weapons as routine. The bomb stops appearing as an obscenity and starts appearing as a domain.

The models did not invent that. They absorbed it.

Where LLMs actually help

None of this means LLMs are useless in and around wargaming. It means their role needs to be described honestly.

They are good at scenario support. Drafting injects. Generating situational updates. Stress-testing internal consistency. Producing plausible background material quickly. They can help white cells keep pace with the tempo of a live exercise.

They are useful as assistants during execution. A red team can use them to sketch likely adversary responses. An adjudicator can use them as a consistency aid. A control cell can use them to produce informational texture at speed.

And they are particularly useful after the game. Debriefs generate large volumes of messy qualitative material. Here models can genuinely help: synthesizing transcripts, surfacing repeated patterns, clustering themes, identifying gaps between stated rationale and actual behavior.

All of that is real.

But none of it requires pretending the model should be the player. The machine is most credible when it remains staff. Even Payne’s own paper is cautious on this point, arguing for calibration against human reasoning rather than simple substitution. Panda and Reddie go further and argue explicitly against conflating LLM crisis play with the purpose of human wargaming.

The real lesson is about the corpus

The obvious policy conclusion survives untouched. No model should sit inside the nuclear use chain. Human beings must remain responsible for those decisions.

But “human in the loop” is not enough if everything around the human is increasingly shaped by automated systems: intelligence prioritization, scenario framing, option generation, warning synthesis, confidence ranking, recommendation layers. The question is not only who makes the final decision. It is who shapes the field of thinkable decisions before that moment arrives. That concern is exactly what Lt. Gen. John “Jack” Shanahan stresses in his discussion of AI integration into nuclear command-and-control ecosystems: the danger is not just direct launch authority, but false confidence and distorted situational awareness across the wider decision environment.

And beyond that sits the deeper lesson.

Models inherit our strategic imagination. They learn not just facts, but emphasis. Not just propositions, but priority. They absorb what a field spends its energy refining. Right now that imagination remains lopsided: overdeveloped in escalation, underdeveloped in restraint; rich in the language of coercion, thin in the language of stopping.

So the answer is not only better safeguards around the models. It is also a different archive.

More work on off-ramps. More on negotiated retreat. More on strategic patience. More on how states survive loss without reaching for apocalyptic compensation. More anti-nuclear thinking that is not just morally right, but analytically hard, institutionally literate, and impossible to dismiss as decorative conscience from the sidelines.

Because that is the real asymmetry.

We have a canon of escalation and a footnote of restraint.

The models are not inventing that imbalance. They are replaying it back to us in a flatter, colder voice.

And that is the part that should actually worry us.