Silicon Valley Insiders Admit AI Agents Are Still “Rickety” — And Cost More Than Expected

California Gazette Staff

The pitch for AI agents has been consistent across boardrooms and investor decks throughout 2026: autonomous digital workers that never sleep, plow through office tasks at scale, and transform enterprise productivity. What actually happened this week at two Silicon Valley events told a more complicated story.

At the Generative AI and Agentic AI Summit in San Jose on Wednesday — and at a separate industry session the same week — engineers and executives from companies including Google, Amazon, Microsoft, and Meta acknowledged publicly that building and deploying AI agents at scale remains technically difficult, operationally complex, and potentially expensive in ways many companies are not prepared for.

The Token Problem Nobody Warned You About

Despite C-suite enthusiasm for artificial intelligence agents that can plow through office tasks like never-sleeping interns, the underlying technology is still rickety and a potential drain on budgets.

Kevin McGrath, the CEO of the AI startup Meibel, said during a session that “the biggest problem that we’re working with in AI right now” involves the misguided idea that everything needs to be processed by a large language model, or LLM.

McGrath summed up the approach he was warning against: “Just give all of your tokens and all of your money to an AI Claw bot that will just waste millions and millions of tokens.” Companies, he said, need to be more deliberate when deciding which tasks are actually suited for AI agents.

The point is not that AI agents are ineffective. It is that companies rushing to plug them into every workflow are discovering that indiscriminate deployment burns compute resources without proportional return. Routing the wrong task through a large language model does not just slow things down — it adds direct operating costs that compound quickly at enterprise scale.

Google Engineers Lay Out the Deployment Realities

At the Generative AI and Agentic AI Summit in San Jose, technical staff from companies like Google and its DeepMind AI unit, Amazon, Microsoft, and Meta revealed that creating and operating AI agents is not an easy task. One session led by Google software engineer Deep Shah focused on new techniques intended to help manage the operational costs of running fleets of AI agents.

Shah framed the challenge directly: “If you think of a machine learning system or any multi-agent system, there are multiple challenges you will find when you try to deploy that system at scale. The first one is the inference cost.”

Ravi Bulusu, CEO of the startup Synchtron, pointed to complexity as a compounding factor. Companies organize data, choose tech platforms, and build their software and workforces in many different ways, he noted, which makes multi-agent deployments far harder to manage in practice than in a demo environment.

The OpenClaw Frenzy and Its Enterprise Limits

The backdrop to this week’s candor is the rapid, somewhat chaotic spread of OpenClaw — the open-source agentic framework whose GitHub repository has accumulated hundreds of thousands of stars since it emerged in late 2025. Nvidia CEO Jensen Huang called it “the largest, most popular, most successful open-source project in the history of humanity” at GTC 2026 in San Jose last month.

But popularity and enterprise readiness are not the same thing. Despite OpenClaw’s growing adoption, ThinkingAI CEO Han said it is “too complicated and too prone to security flaws for businesses.” He was direct: “OpenClaw is a good tool for personal things, but definitely cannot reach the enterprise level. In terms of the enterprise level, you have to figure out a lot of things — your memory, how to manage your agents, teams, communications; there are a lot of things you have to figure out.”

Security researchers have flagged structural exposure issues in OpenClaw deployments, and the framework’s community-driven skills marketplace — while expanding rapidly — introduces untrusted code risks that most enterprise compliance frameworks are not yet equipped to address.

Why This Matters for California

California sits at the center of this tension in ways no other state does. The San Francisco Bay Area is home to the companies building the most advanced AI agent systems — Anthropic, OpenAI, Google DeepMind, Microsoft Azure’s AI operations — and simultaneously the companies deploying them at scale across finance, healthcare, logistics, and legal services.

The state also carries the regulatory weight. Multiple California AI laws took effect in 2026, including the Transparency in Frontier AI Act and automated decision-making disclosure requirements under the California Consumer Privacy Act. If AI agents deployed in California enterprise settings are malfunctioning silently, burning budgets, or operating beyond what companies can actually monitor, that is a compliance exposure as much as a technical one.

Governor Newsom’s executive order earlier this month — directing California to independently assess federal designations of California AI companies as supply-chain risks — signals that the state views itself as the primary authority on how AI infrastructure gets evaluated here. That positioning also means California policymakers are watching the gap between what the industry is selling and what it is actually delivering.

Industry analysis projects that inference, the work of running AI models continuously in production, will represent 70 to 80 percent of total AI compute costs by 2026, and cloud providers are already redesigning pricing models to account for AI workloads. For California enterprises that have committed to agentic AI without fully modeling those costs, the bills are coming.

The Value of Saying It Out Loud

What stood out this week was not the technical problems themselves, which have been visible to engineers for months. It was that senior figures from Google, Amazon, Microsoft, and Meta acknowledged them openly, on the record, at an industry event in San Jose.

The C-suite pitch has run well ahead of engineering reality for over a year. This week’s sessions suggest the engineering community is pulling that gap back into view. For California investors, buyers, and policymakers navigating an AI landscape built largely on projections rather than proven operating results, that correction is worth paying attention to.
