Scaling Intelligence: The Security Foundations Beneath America’s AI Ambitions Are Cracking

Artificial intelligence diffusion is stress-testing the assumptions that underpin U.S. cybersecurity. Inspecting those foundations isn’t a precaution against scaling AI—it’s the precondition for doing it with confidence.

Signage of artificial intelligence is seen during the World Audio Visual Entertainment Summit in Mumbai, India, on May 2, 2025. Indranil Aditya/Getty Images

By Vinh Nguyen, CFR Expert and Senior Fellow for Artificial Intelligence

Vinh X. Nguyen is senior fellow for artificial intelligence (AI) at the Council on Foreign Relations.

Two bolts of lightning have struck the cybersecurity landscape in six months. In November 2025, Anthropic disclosed that Chinese state-sponsored actors had used its Claude model to run a largely automated cyberespionage campaign; the AI performed 80 to 90 percent of the work across roughly thirty targets. In April 2026, the same company revealed that its Mythos model had autonomously discovered thousands of previously unknown vulnerabilities in every major operating system and browser—a capability the company deemed significant enough to withhold from general release.

Between strikes, the steady rain has begun. The rain is AI itself: diffuse, constant, soaking every institution. Much of it nourishes science, software, and defense. But the same rain falls on weakened ground. CrowdStrike documented an 89 percent year-over-year increase in AI-enabled adversary operations. Mozilla’s application of Mythos to Firefox surfaced, in a single evaluation, 271 vulnerabilities that would have taken elite researchers years to find.

Governments and boardrooms are patching, monitoring, and reinforcing. But the harder question is whether the United States can scale what it intends to build with confidence that the ground beneath it will bear the weight. Leadership in AI will not be decided by who trains the largest models. It will be decided by who can deploy them most deeply into the systems that generate advantage, and that depends on whether the foundations can withstand what is now being asked of them.

Built on assumptions

For thirty years, cybersecurity rested on three assumptions so embedded they were never written down: that sophisticated attacks would remain expensive, that identity systems built for humans could extend to whatever came next, and that human judgment would remain in the path of consequential decisions. These assumptions underlie not only security budgets but the country’s ability to deploy AI across the institutions that generate economic and military advantage. Today, they are cracking.

The first crack is the plummeting cost of attack. What once required an intelligence service can now be replicated by a motivated individual with a frontier model. CrowdStrike observed a 38 percent increase in China-nexus intrusions in 2025, a tempo impractical without AI assistance. Defenders are gaining too; Mythos-class capabilities are hardening critical software faster than human teams could. But attackers need only one path, while defenders must close all of them. Patching pipelines and procurement cycles were built for human tempo. It remains to be seen whether defenders can keep pace with capabilities arriving at machine speed.

The second crack is identity. Every credential and access control system was built on an unstated premise—that an identity belongs to a person. Nonhuman identities now substantially outnumber human ones, yet the identity layer was never built to verify that an agent’s actions match the intent of the human who dispatched it. In March 2026, an AI agent at Meta posted incorrect guidance to an internal forum without its operator’s approval. A second employee acted on that guidance and posted sensitive data that sat exposed for nearly two hours. The gap is not technical. It is structural in how institutions assign, scope, and verify trust, and it widens with every agent granted credentials designed for people. Governance frameworks for nonhuman identity exist on paper. Deployments are outrunning them.

The final crack is the most subtle: human judgment itself. When the identity layer can no longer verify who or what is acting, the remaining check is the individual: the analyst who pauses at an anomaly or the engineer who notices a system behaving oddly and asks why. This was never a designed control. It worked because damage spread only as fast as people could respond to it. That check is now the last line of defense, just as it is being engineered out. Organizations are deliberately automating reviews, approvals, and separation of duties to operate at machine speed, a trade-off most are making by default rather than by deliberation.

None of these assumptions lapsed because someone made a bad decision. They lapsed because no one was assigned to revisit them.

The audit that matters

Technology mandates and new regulatory bodies will not solve the problem of unexamined premises. What this moment demands is not louder alarms but an audit of assumptions. Where does the threat model still assume capability is scarce? Where does identity assume a human on the other end? Where have people served as informal structural reinforcement that no one designed in, and what replaces that reinforcement as it is automated away?

The work begins not with new controls but with a larger question: what did we assume, and does it still hold? The output is a written account of what was taken for granted, drawn from the people closest to the systems. An enterprise may find its change-approval process was automated two years ago without a governance decision; a federal agency may find that under continuous-authorization pilots, the security scrutiny its authority-to-operate assumed is now a dashboard no one is required to read.

In the private sector, the forum for this audit already exists: the board’s risk committee, where accountability for structural commitments resides. The deliverable should be specific: a written register of the security assumptions on which current AI deployments depend, reviewed annually and updated when any assumption is found to have lapsed. Companies that have undergone post-quantum readiness assessments already have a template. The exercise is the same, but applied more broadly.

In government, the Office of the National Cyber Director (ONCD) should claim this audit. ONCD’s statutory mandate is coherence across federal cybersecurity, precisely the vantage point from which lapsed assumptions become visible, as they do not from agency-by-agency efforts. A reasonable first deliverable: working with the Cybersecurity and Infrastructure Security Agency (CISA) and sector risk management agencies, produce within six months an inventory of the security assumptions underpinning federal civilian and critical infrastructure systems. This is not a new framework, but an accounting of the old ones. The inventory is the precondition for informed decisions about where AI can be deployed with confidence and where the assumptions beneath it need to be revisited first.

There is a competitive logic to doing this deliberately. China is embedding AI across its security and governance apparatus—from automated censorship to AI-recommended criminal sentencing—at a pace that does not pause for assumption audits. It is already exporting that infrastructure to willing buyers abroad. The U.S. advantage is not that its foundations are sounder, but that its institutions can choose to look. An open system can audit itself in ways a closed one structurally cannot, and that capacity is a strategic asset—if exercised before the next failure forces the question rather than after.

These are governance questions, not compliance ones. The United States holds real advantages—in model capability, in allied coordination, in the depth of its defender community—every one resting on assumptions that were reasonable when made and unexamined since. The instinct will be to treat that examination as overhead: one more review standing between leadership and the real work of deployment.

But inspecting the foundation is the precondition for scaling AI with confidence. You don’t wait for the storm to find the cracks—you find them first. That is the real work.

This work represents the views and opinions solely of the author. The Council on Foreign Relations is an independent, nonpartisan membership organization, think tank, and publisher, and takes no institutional positions on matters of policy.