What Breaks When AI Makes a Trillion Decisions

January 2026 · 5 min read · AI Infrastructure

An agent approved 50,000 micro-loans in three minutes, and every individual decision passed muster: credit checks cleared, risk scores within bounds, policy constraints satisfied. But the portfolio that emerged was toxic: correlation risk exceeded any reasonable tolerance, concentration in a single sector approached catastrophic levels, and loss estimates reached nine figures before anyone understood what had happened. No log caught it, no alert fired, and the CFO found out a quarter later, when the losses finally materialized on the books.

This kind of failure is new. Databases never needed to know whether queries related to each other, and message queues have always ignored correlations between the events they process. Decisioning systems cannot afford that luxury, because the interactions between decisions can matter more than the decisions themselves.

The Scale of What's Changing

Hundreds of billions of API calls flow through the internet every day, and each one carries a decision: where to route, what to return, how to handle failure. Engineers have historically hardcoded most of these choices into configuration files and compiled binaries, but that's changing as AI agents take on more of the decision-making burden. The infrastructure we've built assumes static, predictable behavior, and it will buckle under the weight of dynamic reasoning at scale.

The Precedent We Should Learn From

We've extracted infrastructure layers before when capabilities became too complex to leave embedded in application code. Web applications grew complex, so we pulled out databases and let them evolve independently. Scale demanded caches, and systems like Redis emerged to handle that abstraction. Real-time requirements brought message queues, and now we take all of these layers for granted as foundational infrastructure.

Another extraction is coming, and we might as well call it the decisioning layer: reasoning that runs beneath your services, making contextual choices that used to be static configuration.

What Breaks at Scale

Latency contracts shatter. A typical API call returns in 200-500 milliseconds, and that's the contract your entire system assumes, with load balancers, timeouts, and circuit breakers all depending on responses arriving within that window. Route that same call through a reasoning system and you're looking at 8-45 seconds of processing time, which means your circuit breakers will trip before the decision even completes.
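The mismatch is easy to see in miniature. This sketch uses hypothetical numbers: a 500 ms timeout standing in for a conventional service budget, and a reasoning call scaled down to 2 seconds (rather than the full 8-45) so the demo runs quickly.

```python
import concurrent.futures
import time

API_TIMEOUT_S = 0.5  # hypothetical service-to-service budget (~500 ms)

def reasoning_call() -> str:
    """Stands in for a decision routed through a reasoning system."""
    time.sleep(2)  # multi-second reasoning, scaled down for the demo
    return "approved"

with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(reasoning_call)
    try:
        result = future.result(timeout=API_TIMEOUT_S)
    except concurrent.futures.TimeoutError:
        result = "breaker tripped: timed out before the decision completed"

print(result)
```

The decision eventually completes, but the caller has already given up: the answer arrives into a connection nobody is listening to.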

Authentication assumptions fail. OAuth was built for human-speed interactions, where maybe ten authenticated requests per minute was a reasonable upper bound. Agents operate at machine speed, triggering thousands of requests per task as they search, compare, and verify across multiple systems. When you ask "who authorized this decision?" after the fact, the logs will show you activity, but activity is not the same as intention.

Monitoring answers the wrong question. Traditional observability tools are designed to answer "what happened?" but reasoning systems demand something more fundamental: why did it decide that? What's missing is decision provenance, the complete reasoning chain captured at execution time rather than pieced together afterward.

Cost economics become unpredictable. A $0.002 API call might authorize a $50,000 wire transfer, while a far more expensive call handles something trivial. The same workload with the same task definitions can produce a 50x cost swing depending entirely on which model handles each step.

The Infrastructure We Need

Decision Provenance addresses the observability gap. Tracing what happened has always been straightforward, but tracing why something was decided requires capturing reasoning at execution time to create records you can query, replay, and defend.
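What might such a record look like? A minimal sketch, with every field name and value hypothetical: the point is that the reasoning chain is first-class data, captured when the decision runs, not reconstructed from logs.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical provenance record captured at execution time."""
    decision_id: str
    inputs: dict
    reasoning_chain: list[str]  # each intermediate judgment, in order
    outcome: str
    policy_version: str
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

record = DecisionRecord(
    decision_id="loan-7741",
    inputs={"credit_score": 712, "amount": 5_000},
    reasoning_chain=[
        "credit score 712 >= 680 threshold",
        "amount within single-borrower limit",
        "sector exposure not evaluated",  # the gap provenance makes visible
    ],
    outcome="approved",
    policy_version="lending-policy-v3",
)

# Queryable after the fact: which decisions never weighed sector exposure?
flagged = "sector exposure not evaluated" in record.reasoning_chain
```

Because the chain is structured data rather than free-text logs, questions like "show every approval that skipped a sector check" become queries instead of forensics.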

Decision Authority solves the authentication problem. Every agent action needs a provable chain back to human intent, not just validated tokens but sanctioned judgment.
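One way to picture "a provable chain back to human intent": each grant in the chain is signed, and verification walks every link. This is a toy sketch using a single shared HMAC key; a real system would use per-principal keys and scoped, expiring grants.

```python
import hashlib
import hmac

SECRET = b"demo-key"  # stands in for real per-principal signing keys

def sign(message: str) -> str:
    return hmac.new(SECRET, message.encode(), hashlib.sha256).hexdigest()

# A chain of grants: human intent -> agent -> specific action.
chain = [
    {"principal": "cfo@example.com", "grants": "approve-loans<=10000"},
    {"principal": "agent:underwriter-7", "grants": "approve-loan:loan-7741"},
]
for link in chain:
    link["sig"] = sign(link["principal"] + link["grants"])

def authorized(chain: list[dict]) -> bool:
    """Every link must carry a valid signature; a break anywhere fails."""
    return all(
        hmac.compare_digest(link["sig"], sign(link["principal"] + link["grants"]))
        for link in chain
    )

print(authorized(chain))  # True
```

The contrast with a bearer token is the point: a token proves the request was allowed to happen, while the chain proves who sanctioned the judgment behind it.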

Portfolio Awareness prevents the failure mode that opened this essay. A thousand loans approved individually might need rejection as a batch when their combined risk exceeds tolerance, because individual soundness guarantees nothing about aggregate safety.
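A toy version of that batch check, with made-up loans and thresholds, makes the gap concrete: every loan clears its individual risk limit while the portfolio blows through a sector-concentration tolerance.

```python
# Hypothetical portfolio: 1,000 loans, each individually within limits.
loans = (
    [{"amount": 5_000, "sector": "retail", "risk_score": 0.03}] * 800
    + [{"amount": 5_000, "sector": "logistics", "risk_score": 0.04}] * 200
)

MAX_INDIVIDUAL_RISK = 0.05  # per-loan limit (hypothetical)
MAX_SECTOR_SHARE = 0.40     # concentration tolerance (hypothetical)

individually_sound = all(
    loan["risk_score"] <= MAX_INDIVIDUAL_RISK for loan in loans
)

total = sum(loan["amount"] for loan in loans)
by_sector: dict[str, int] = {}
for loan in loans:
    by_sector[loan["sector"]] = by_sector.get(loan["sector"], 0) + loan["amount"]

portfolio_sound = max(by_sector.values()) / total <= MAX_SECTOR_SHARE

print(individually_sound, portfolio_sound)  # True False
```

Per-decision monitoring sees only the first check, which passes a thousand times in a row. The second check exists only at the layer that sees the batch.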

Policy as Code provides the right abstraction for agent directives. Business rules are deterministic, following simple if-then logic, but agent directives work differently by expressing constraints plus objectives: "minimize delivery time while respecting these boundaries, given this context."
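The shape of such a directive can be sketched as data: one objective to optimize, plus hard constraints evaluated against context. All names and checks here are illustrative.

```python
# Hypothetical agent directive: an objective plus hard constraints,
# rather than an if-then rule.
policy = {
    "objective": "minimize delivery_time",
    "constraints": [
        lambda ctx: ctx["cost"] <= ctx["budget"],
        lambda ctx: ctx["carrier"] in ctx["approved_carriers"],
    ],
}

def satisfies(policy: dict, ctx: dict) -> bool:
    """The agent may pursue the objective however it likes,
    but every constraint is non-negotiable."""
    return all(check(ctx) for check in policy["constraints"])

candidate = {
    "cost": 40,
    "budget": 50,
    "carrier": "dhl",
    "approved_carriers": {"dhl", "ups"},
}
print(satisfies(policy, candidate))  # True
```

The division of labor is the design choice: humans own the boundaries, the agent owns the optimization inside them, and the constraints remain testable code either way.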

The Question You Should Be Asking

If you're building with AI agents, two questions matter more than any others: Can you reconstruct why a decision was made, with the complete reasoning chain rather than just inputs and outputs? And would your monitoring catch systemic risk across a thousand decisions, or would it only show you individual approvals that each look fine in isolation?

If the answer to either question is no, you're building on infrastructure that doesn't exist yet.
