AI agents are entering their rebuild era as enterprises confront the reliability problem
Enterprise organizations deploying artificial intelligence agents into production environments are confronting a critical inflection point: rapid initial deployments have exposed fundamental architectural weaknesses that demand comprehensive redesigns. This realization emerged sharply during discussions at the AI Impact Series event in New York, where Preeti Somal, Senior Vice President of Engineering at Temporal Technologies, outlined how companies are now entering what amounts to a wholesale rebuild phase for their agent implementations. The pattern is recognizable across industries and organizational sizes: teams that moved aggressively to deploy first-generation agents are now discovering that language model performance alone is an insufficient foundation for production reliability. These organizations face the uncomfortable reality that their hastily constructed agent systems lack the underlying infrastructure necessary to handle crashes, preserve execution state, recover from failures efficiently, manage inference costs transparently, and coordinate seamlessly across distributed APIs and enterprise systems. The recognition marks a turning point in enterprise AI maturity, one where technical sophistication must extend well beyond model capabilities into the unglamorous but essential domain of systems engineering.
The current predicament reflects a familiar pattern in enterprise technology adoption cycles, though one typically obscured by the intensity surrounding generative AI's emergence. When organizations accelerated cloud migration in previous decades, many pursued straightforward lift-and-shift strategies that simply moved existing workloads into cloud infrastructure without reconsidering underlying architectural assumptions. That approach ultimately proved economically inefficient and operationally problematic, requiring subsequent waves of redesign and modernization to achieve actual value. The parallel to today's AI agent deployments is striking: enterprises rushed to implement agent systems without first establishing the foundational operational patterns necessary to sustain them at scale. The consequence is that organizations now operate production AI systems lacking basic reliability mechanisms—visibility into system behavior, durability guarantees for long-running processes, state management capabilities, and structured recovery protocols. This timing matters considerably within the broader AI landscape because enterprise deployments now outnumber experimental implementations, shifting the competitive advantage toward organizations that can establish production-grade reliability rather than those simply pursuing the latest architectural novelties. The shift also illuminates why vendors and platforms focused on operational reliability, rather than raw model innovation, are experiencing renewed strategic importance as enterprises mature beyond the initial deployment enthusiasm.
The specific technical challenges underlying this transition are increasingly well understood, though many organizations still underestimate their scope. Long-running agent workflows frequently persist over hours or even days while invoking multiple language models, accessing retrieval systems, triggering external applications, and managing complex state information, a structural reality that transforms familiar engineering problems into urgent practical concerns. When workflow crashes occur, as they inevitably do, organizations face a consequential question: must the entire process restart from the beginning, incurring duplicate inference costs and extended latency, or does the system possess mechanisms to resume execution from the exact point of failure? Similarly, cost visibility across multi-step agent processes is a challenge that many enterprises have not yet confronted in concrete operational terms. Temporal's implementation philosophy emphasizes observability as a core capability, allowing teams to track where tokens are being consumed across an agent's entire execution pipeline in what Somal described as a "single pane of glass" view. This visibility directly addresses the economic concern that drives much enterprise decision-making: understanding and controlling the token expenditure associated with complex agent workflows, particularly when failures force re-execution from early stages. The distinction between state (where an agent stands within a process and which steps have completed) and memory (the context and information the agent carries forward) proves critical when designing recovery mechanisms for processes that extend beyond simple conversational interactions into substantive business workflows.
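To make the resume-versus-restart question and the state/memory distinction concrete, here is a minimal sketch in Python. It is not Temporal's API or implementation: the `WorkflowState` class, the `run_workflow` function, the step names, and the token counts are all hypothetical, and the state lives in process memory rather than in a durable store as a real orchestrator would require.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowState:
    """State: which steps have completed and what each one cost (hypothetical)."""
    completed: dict = field(default_factory=dict)       # step name -> output
    tokens_by_step: dict = field(default_factory=dict)  # step name -> tokens spent


def run_workflow(steps, state, memory):
    """Run steps in order, skipping any step the state marks as completed."""
    for name, fn in steps:
        if name in state.completed:
            continue  # finished before the crash; do not pay for it again
        output, tokens = fn(memory)
        # Record progress only after the step has succeeded.
        state.completed[name] = output
        state.tokens_by_step[name] = tokens
        memory.append(output)  # memory: context carried forward to later steps
    return state


# Illustrative steps; the token counts are invented for the example.
calls = {"fetch": 0, "summarize": 0, "file_report": 0}


def fetch(memory):
    calls["fetch"] += 1
    return "docs", 100


def summarize(memory):
    calls["summarize"] += 1
    if calls["summarize"] == 1:
        raise TimeoutError("simulated model timeout")
    return "summary", 250


def file_report(memory):
    calls["file_report"] += 1
    return "filed", 50


steps = [("fetch", fetch), ("summarize", summarize), ("file_report", file_report)]
state = WorkflowState()

try:
    run_workflow(steps, state, [])
except TimeoutError:
    pass  # the crash: "fetch" is recorded in state, "summarize" is not

# Resume: rebuild memory from recorded outputs, then continue from the failed step.
memory = list(state.completed.values())
run_workflow(steps, state, memory)

total_tokens = sum(state.tokens_by_step.values())
```

In this sketch the per-step token ledger doubles as a crude cost view: because `fetch` is skipped on resume, it runs and is billed only once. A production system would persist the state after every step so that it survives a process crash, which this in-memory version does not attempt.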
For organizations currently operating AI agents in production environments, these architectural insights translate directly into operational and financial impact. A procurement workflow, healthcare documentation process, customer support escalation, or compliance verification cannot afford to fail silently when a model call times out or an external dependency crashes, yet many current implementations lack explicit handling for such scenarios. The "deterministic spine" framework that Temporal advocates treats orchestration software as the reliability backbone while accepting that language models remain fundamentally probabilistic and non-deterministic in their outputs. This separation of concerns enables enterprises to maintain execution consistency and recovery capabilities independent of variability in model behavior. The immediate financial implications are substantial: when a workflow without recovery mechanisms fails at step eight of a ten-step process, it must re-execute from step one, repeating the token spend of the seven steps that had already succeeded, a cost that orchestration with step-level recovery could avoid. For enterprises operating under tightening AI budgets and facing pressure to demonstrate concrete ROI from AI investments, this represents a tangible and measurable source of waste. Governance presents an equally concrete concern: the need for standardized internal frameworks that establish guardrails, enforce model selection policies, implement cost management controls, and maintain comprehensive observability cannot be met by off-the-shelf managed agent systems alone. Organizations increasingly recognize that they must construct internal infrastructure layers that preserve flexibility while establishing necessary operational controls and visibility.
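The step-eight-of-ten scenario can be worked through with a few lines of arithmetic. The sketch below is illustrative only: the `tokens_spent` function is hypothetical, the figure of 1,000 tokens per step is invented, and it assumes for simplicity that the failed call itself is not billed.

```python
def tokens_spent(step_costs, fail_at, resume):
    """Total tokens to finish a workflow whose step `fail_at` (1-indexed) fails once.

    Assumption: the failed call itself bills no tokens (e.g. it timed out).
    """
    full_run = sum(step_costs)
    if resume:
        # Completed steps are never repeated, so every step is paid for once.
        return full_run
    # Restart from step one: steps before the failure are paid for twice.
    return sum(step_costs[: fail_at - 1]) + full_run


costs = [1000] * 10  # hypothetical: ten steps at 1,000 tokens each

restart_total = tokens_spent(costs, fail_at=8, resume=False)
resume_total = tokens_spent(costs, fail_at=8, resume=True)
wasted = restart_total - resume_total
```

Under these assumptions a restart spends 17,000 tokens against 10,000 with resumption, so a single failure at step eight adds 70 percent to the cost of the run. Real workflows have uneven step costs and may fail more than once, which changes the numbers but not the shape of the comparison.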
The broader significance of this enterprise shift extends beyond individual organizational challenges to reveal a fundamental pattern in how AI technology is maturing within business contexts. The initial wave of AI adoption emphasized model innovation and rapid experimentation, but production deployment requires systematization and architectural discipline, which shifts the advantage toward infrastructure and platforms focused on reliability rather than model advancement alone. This pattern parallels how cloud computing matured beyond initial deployment enthusiasm toward platforms and services that enabled organizations to operate at scale cost-effectively. The rise of orchestration and workflow management in the AI context suggests that infrastructure layers addressing execution reliability, state management, observability, and cost control will become increasingly central to enterprise AI strategy. Notably, platforms like Temporal possess a significant strategic advantage because they often already exist within enterprise modernization initiatives launched before AI became a dominant priority, positioning them naturally for expansion into agentic AI domains. This reality contradicts the assumption that specialized AI-native platforms will necessarily dominate enterprise adoption: existing infrastructure that can be extended and adapted often proves more effective than purpose-built replacements, particularly when organizational inertia and existing integration patterns favor incremental enhancement. The pattern suggests that enterprise AI success increasingly depends on systematic integration with existing operational infrastructure rather than deployment of isolated AI systems, a perspective that reframes how organizations should approach their AI strategy.
Organizations evaluating their AI agent implementations should monitor several specific developments over the coming months that will signal how this rebuild transition unfolds across the enterprise technology landscape. Growth in the number of Temporal Technologies' enterprise customers rebuilding their initial agents as "version 2.0" deployments will serve as a concrete leading indicator of how far this architectural rethinking spreads through Fortune 500 and mid-market organizations. Particular attention should go to whether these rebuilds demonstrate measurable improvements in reliability metrics, cost efficiency, and time-to-recovery. Additionally, the emergence of enterprise governance frameworks for agentic AI (including standards for model selection, cost allocation, identity integration, and observability requirements) will reveal whether organizations are developing industry-wide patterns or remaining siloed in proprietary approaches. Looking forward into 2025, the market will increasingly differentiate between vendors offering narrow AI capabilities and those providing comprehensive operational infrastructure capable of supporting production AI systems reliably. Industry analysts should particularly track how established enterprise software providers integrate agentic AI into existing application platforms versus how pure-play AI vendors attempt to expand beyond model and inference layers into operational reliability domains where they lack existing infrastructure. The ultimate competitive outcome will likely depend on whether organizations prioritize rapid model advancement or steady, reliable operational deployment, a tension that will reshape enterprise AI investment patterns and determine which platform providers achieve lasting strategic significance in the AI era.