Berkins Consulting
Insight Brief

AI technology stack for business growth: driving speed, scale and ROI

From episodic agents to long-running agents.

Author
Eric Sheng

Partner

AI transformation strategist focused on enterprise intelligence and long-term digital capability building.

At a Glance

  • Most AI agents operate episodically.
  • Long-running agents maintain memory.
  • Economic value shifts to persistence.
  • Governance becomes critical.

From Pilot Projects to Production Power: The AI Technology Stack That Changes Everything

 

There is a particular inflection point that every serious enterprise AI programme eventually reaches, and most are not prepared for it. The pilots have run. The board has approved further investment. The strategy document is beautifully designed. And now the organisation faces the question that separates the 34% of companies actually generating value from AI from the 66% still waiting to: how do we build the infrastructure that makes AI work at scale, reliably, securely, and at a return that justifies the investment?

That infrastructure has a name: the AI technology stack. And in 2026, getting it right has become the defining competitive differentiator for enterprises across every sector. According to Deloitte's 2026 State of AI in the Enterprise report, one of the most comprehensive global surveys of its kind, covering 3,235 senior leaders, workers' access to AI tools rose by 50% in 2025 alone. Twice as many business leaders as last year are reporting transformative impact, not just productivity gains. The enterprise AI market is valued at $114.87 billion in 2026. Global AI investment reached $434 billion. And yet only 34% of organisations have moved AI into production at scale.

The gap between that ambition and that reality is almost entirely a technology stack problem. Not an ideas problem. Not a budget problem. A design, architecture, and execution problem. This article examines what a modern, production-grade AI technology stack actually consists of, what separates the stacks that deliver ROI from those that produce expensive experiments, and how Berkins Consulting helps enterprise clients move from architectural ambiguity to measurable business performance.

 

 

 

 

 

The Architecture Gap: Why Investment Alone Does Not Produce Returns

 

For three consecutive years, enterprise AI investment has grown at a pace that outstrips almost every other technology category. In 2025 alone, AI investments accounted for 48% of global venture capital, at $225.8 billion. EY research found that 88% of mid-to-large organisations now spend more than 5% of their total IT budget on AI. More than half of senior leaders plan to double their AI budgets within the next twelve months.

And yet IBM's CEO research found that only 25% of AI initiatives have delivered expected ROI over the past few years. Only 16% have scaled enterprise-wide. Deloitte's 2026 survey found that while 74% of organisations hope to grow revenue through AI, only 20% are already doing so. The Wharton / GBK Collective's three-year enterprise study covering leaders from across industries confirms the pattern: the organisations generating measurable, compounding AI returns are those that have invested in the systems around AI, not just the models themselves.

What separates them is stack architecture. The organisations that are winning have moved beyond treating AI as a collection of point solutions and have built integrated, layered, governed technology stacks in which data flows cleanly, models are deployed reliably, outputs are auditable, and performance is continuously monitored. The organisations that are struggling have licensed impressive AI products and discovered that impressive products produce no value when the data feeding them is poor, the integration to business systems is missing, and there is no governance framework managing outputs in production.

 

 

 

 

Where Enterprise AI Investment Currently Delivers ROI — 2026

Percentage of organisations reporting measurable returns by deployment category

  • Productivity & efficiency: 66%
  • Decision-making quality: 53%
  • Cost reduction: 40%
  • Customer experience: 32%
  • Revenue growth (direct): 20%
  • New product/service lines: 14%

 

 

 

 

The Shift from Model-Centric to Architecture-Centric Thinking

 

One of the most important conceptual shifts underway in enterprise AI is the move from model-centric to architecture-centric thinking. For the first two or three years of the generative AI era, the dominant question in most organisations was 'which model should we use?' That question still matters — but it has become far less important than the questions that surround it: How will this model access accurate, current, governed data? How will its outputs be integrated into existing workflows? How will performance be monitored over time? What happens when the model drifts or the data quality degrades?

Research from Tismo.ai's 2026 enterprise AI architecture study captures this succinctly: 'Competitive advantage now depends on stack design rather than model access.' The foundation models themselves — whether from OpenAI, Anthropic, Google, Meta, or Mistral — are increasingly commoditised. The organisations generating disproportionate returns are those whose stack design allows them to deploy faster, integrate more deeply, monitor more rigorously, and iterate more rapidly than their competitors.

PwC's 2026 AI business predictions formalise this as the 'AI studio' model: a centralised hub that brings together reusable technology components, frameworks for assessing use cases, sandboxes for testing, deployment protocols, and skilled interdisciplinary teams. Critically, this structure links business goals to AI capabilities so high-ROI opportunities can be systematically identified — not discovered by accident.

 

 

The Six Layers That Determine Whether AI Scales or Stalls

 

A production-grade AI technology stack is not a single product or platform — it is a layered architecture in which each component performs a specific function and passes reliable outputs to the layer above it. Understanding those layers, and the design decisions within each one, is essential for any executive making AI infrastructure investments in 2026.

 

 

 


 

 

Layer 1: The Data Foundation — Where Most Stacks Actually Break

 

Every AI system is, at its core, a sophisticated data transformation system. The quality of the outputs is bounded by the quality of the inputs — and in most enterprises, the data inputs are neither clean, nor governed, nor accessible in the form that AI workloads require. The data layer of the AI technology stack must handle collection, storage, versioning, pipeline orchestration, quality validation, and lineage documentation — at batch and real-time processing speeds, with strict governance controls embedded throughout.

Apache Kafka is the standard for streaming data ingestion at scale. Data lakes — whether on AWS S3, Azure Data Lake, or Google Cloud Storage — provide the storage architecture. Tools like dbt handle transformation and quality validation. Data versioning systems such as DVC (Data Version Control) ensure model reproducibility. Governance platforms like Alation manage the metadata, catalogue access controls, and lineage documentation that compliance functions require.
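To make the data layer's job concrete, the sketch below shows the shape of a quality-validation and lineage step in plain Python. It is an illustration rather than a reference implementation: the tools named above each have their own interfaces, and the field names, rules, and lineage format here are hypothetical stand-ins.

```python
# Illustrative only: a minimal quality-validation and lineage step of the kind
# the data layer performs before records reach any model. Field names,
# thresholds, and the lineage format are hypothetical stand-ins.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class ValidationResult:
    passed: bool
    issues: list = field(default_factory=list)
    lineage: dict = field(default_factory=dict)

def validate_record(record: dict, source_system: str) -> ValidationResult:
    issues = []
    # Required-field check: reject records that cannot feed the model reliably.
    for required in ("document_id", "client_id", "ingested_at", "body_text"):
        if not record.get(required):
            issues.append(f"missing field: {required}")
    # Basic quality rule: flag suspiciously short documents rather than
    # silently passing them downstream.
    if len(record.get("body_text", "")) < 50:
        issues.append("body_text below minimum length")
    # Lineage: record where the data came from and when it was validated,
    # so downstream outputs remain auditable.
    lineage = {
        "source_system": source_system,
        "validated_at": datetime.now(timezone.utc).isoformat(),
        "rule_set_version": "v1",  # hypothetical version tag
    }
    return ValidationResult(passed=not issues, issues=issues, lineage=lineage)

if __name__ == "__main__":
    result = validate_record(
        {"document_id": "D-001", "client_id": "C-42",
         "ingested_at": "2026-01-15", "body_text": "Sample contract text... " * 5},
        source_system="dms_primary",
    )
    print(result.passed, result.issues)
```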

The organisations that build this layer rigorously achieve faster model iteration cycles, better production performance, and significantly lower incident rates when models go live. Those that skip it — typically in the rush to get models trained and deployed — discover the cost of that shortcut in the first production quarter, when data quality issues cascade into model failures that are expensive and slow to diagnose.

 

Layer 3: Orchestration and Agentic AI — The Layer Transforming Enterprise AI in 2026

 

The orchestration layer is where the most significant architectural evolution is occurring in 2026. Enterprise AI is no longer just about models producing outputs in response to inputs — it is increasingly about AI agents that coordinate reasoning, tool execution, memory retrieval, and task delegation across complex, multi-step workflows. NVIDIA's 2026 State of AI survey found that 44% of companies were deploying or assessing agentic AI by late 2025, with full deployments accelerating rapidly into 2026.

The practical implication is substantial. An agent-orchestrated AI system can autonomously execute a multi-step business process — researching a customer issue, drafting a response, checking compliance rules, escalating where necessary, and logging the outcome — without human intervention at each step. Gartner predicts that agentic AI will resolve 80% of common customer service issues without human intervention by 2029, cutting operational costs by 30%. AI agents are projected to intermediate more than $15 trillion in B2B spending by 2028.

Tools like LangChain, LlamaIndex, and Temporal are becoming standard components of the orchestration layer in mature AI stacks. Vector databases — Pinecone, Chroma, and Weaviate among them — form the backbone of context management systems that allow agents to maintain relevant, accurate information retrieval across complex workflows.
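The sketch below illustrates the core retrieve-then-act loop that such an orchestration layer runs. It is deliberately dependency-free: a toy lexical similarity function stands in for a real vector database, a stub stands in for the foundation-model call, and every identifier is hypothetical.

```python
# Illustrative only: the retrieve-then-act loop at the heart of an agentic
# orchestration layer. A real stack would delegate retrieval to a vector
# database and reasoning to a foundation model; both are stubbed here.
from collections import Counter

KNOWLEDGE_BASE = {
    "refund-policy": "Refunds are honoured within 30 days with proof of purchase.",
    "escalation-rules": "Complaints mentioning regulators are escalated to compliance.",
    "tone-guide": "Responses should be concise, factual, and courteous.",
}

def similarity(query: str, passage: str) -> float:
    """Toy lexical overlap score standing in for embedding similarity."""
    q, p = Counter(query.lower().split()), Counter(passage.lower().split())
    return sum((q & p).values()) / max(len(query.split()), 1)

def retrieve(query: str, k: int = 2) -> list[str]:
    ranked = sorted(KNOWLEDGE_BASE.values(), key=lambda p: similarity(query, p), reverse=True)
    return ranked[:k]

def call_model(prompt: str) -> str:
    """Stub for a foundation-model call."""
    return f"[draft response grounded in retrieved context] {prompt[:120]}..."

def handle_ticket(ticket: str) -> dict:
    context = retrieve(ticket)                        # step 1: ground the agent in governed data
    draft = call_model("\n".join(context) + "\nCustomer: " + ticket)  # step 2: draft a response
    needs_escalation = "regulator" in ticket.lower()  # step 3: apply a compliance rule
    return {"draft": draft, "escalated": needs_escalation, "context_used": context}  # step 4: log

if __name__ == "__main__":
    print(handle_ticket("A customer says they will contact the regulator about a refused refund."))
```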

 

Layer 6: Monitoring and Governance — The Layer Most Organisations Forget Until Something Goes Wrong

 

The monitoring and governance layer is the component most consistently underfunded in enterprise AI programmes — and the one that produces the most expensive failures when it is absent. AI models degrade. Data drifts. Business conditions change. Regulatory requirements evolve. A model that delivers strong results at deployment will, without active monitoring, become a source of operational and compliance risk within months.
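One common way to quantify the data-drift problem described above is the population stability index (PSI), which compares the distribution of an input feature in production against its distribution at deployment. A minimal sketch follows; the feature values and the alert threshold are hypothetical.

```python
# Illustrative only: population stability index (PSI) as a simple drift signal.
# Production monitoring would track many features and model outputs on a
# scheduled cadence; thresholds and data here are hypothetical.
import math

def psi(reference: list[float], live: list[float], bins: int = 10) -> float:
    """Population stability index between a reference sample and a live sample."""
    lo, hi = min(reference), max(reference)

    def proportions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / (hi - lo) * bins) if hi > lo else 0
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log-of-zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_p, live_p = proportions(reference), proportions(live)
    return sum((r - l) * math.log(r / l) for r, l in zip(ref_p, live_p))

if __name__ == "__main__":
    reference = [0.1 * i for i in range(100)]   # feature distribution at deployment
    live = [0.1 * i + 2.0 for i in range(100)]  # shifted production distribution
    score = psi(reference, live)
    # Hypothetical rule of thumb: a PSI above 0.2 warrants investigation.
    print(f"PSI = {score:.3f}:", "drift suspected" if score > 0.2 else "stable")
```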

McKinsey's 2026 AI governance research found that fewer than 25% of organisations have board-approved AI governance policies. PwC's 2026 AI predictions emphasise that responsible AI governance — which 60% of executives say boosts ROI and efficiency — remains more aspiration than practice for most organisations. In regulated industries, the absence of a functioning governance and monitoring layer is not just an operational risk. It is a regulatory and reputational one.

 

 

 

 

What the High-Performing 5% Are Doing Differently

 

Early AI adopters report $3.70 in value for every dollar invested, with top performers achieving returns as high as $10.30 per dollar. Financial services leads all industries at 4.2x ROI. Media and telecommunications follow at 3.9x. Sales teams using AI-powered automation across the revenue cycle report 2.3x higher revenue growth compared to teams using disconnected point solutions. Enterprise companies that consolidate their technology stack around an AI-first architecture reduce tool spend by 23% while improving quota attainment by 18%.

These are not marginal differences. They are compounding advantages that widen with time. And the organisations achieving them share a set of architectural characteristics that distinguish their AI technology stacks from those of their peers.

 

How AI Investment Is Allocated in High-Performing vs. Average Enterprises

 

  • Technology tools and model licensing (average enterprise spends 70%+ here): 30%
  • Data infrastructure and governance (high performers prioritise this): 35%
  • People, training, and change management: 20%
  • MLOps, monitoring, and continuous improvement: 15%

Note: High-performing organisations invert the typical investment ratio — prioritising data, governance, and people over model licensing. Source: McKinsey 2025.

 

 

The Consolidation Dividend

 

One of the most reliable ROI drivers in mature AI technology stacks is architectural consolidation. Many enterprises that moved quickly into AI adoption during 2023 and 2024 did so through a proliferation of point solutions — individual AI tools licensed for specific functions, operating without integration, sharing no data, and managed by different teams with different governance approaches. The result is what researchers describe as a 'tool sprawl' problem: high licensing costs, low utilisation, no compounding improvement, and an operational complexity that slows rather than accelerates AI delivery.

Forrester's Revenue Technology Consolidation Imperative (2025) quantifies the dividend available from fixing this: enterprise companies that consolidate their revenue technology stack around an AI-first architecture reduce tool spend by 23% while improving quota attainment by 18%. The improvement comes not from spending less on AI, but from spending more intentionally — on integrated architecture rather than fragmented point solutions.

The practical implication is that organisations with 30, 40, or 50 AI tools deployed across their enterprise are not more sophisticated than those with 10 or 15 deeply integrated ones. They are more expensive and less effective. Stack rationalisation — painful in the short term because it requires decommissioning tools that individual teams have become attached to — is one of the highest-return investments available to AI programme leaders.

 

 

 

The following case study is drawn from Berkins Consulting's AI transformation engagement portfolio. Identifying details have been composited to protect client confidentiality. The architecture decisions, failure modes, and outcomes are real and representative of patterns we observe consistently across mid-to-large enterprise clients.

 

ENGAGEMENT PROFILE

 

Sector: Professional Services (Legal & Compliance Technology)

Organisation: 2,100-person firm, operating across 6 jurisdictions, processing 40,000+ documents monthly

AI Initiative: Contract analysis, compliance flagging, and client reporting automation

Status at Engagement: 14 months post-launch. Three separate AI tools were deployed. Zero production integration. Annual spend: £1.7M. Measurable ROI: none.

 

 

The Situation: Three AI Tools, No AI System

 

The firm had approached AI adoption in the way many professional services organisations do: pragmatically, department by department, driven by individual leaders who identified specific use cases and licensed tools to address them. The legal review team had adopted an AI contract analysis platform. The compliance department had deployed a separate AI-powered monitoring tool. The client reporting function had implemented a generative AI writing assistant.

Fourteen months later, all three tools were in active use within their respective departments. And none of them was delivering measurable business impact at the firm level. The contract analysis platform was producing outputs that fee earners did not trust sufficiently to use without full manual verification — defeating its purpose as an efficiency tool. The compliance monitoring system was generating alert volumes so high that the team had begun routing them directly to a folder that was reviewed quarterly. The writing assistant was being used by approximately 20% of the team it had been licensed for, with the remainder citing unreliable outputs and unclear governance.

The total annual spend across the three tools was £1.7 million. The measurable business impact — in saved fee-earner hours, reduced compliance incidents, or improved client reporting turnaround — was, by the firm's own assessment, negligible. When Berkins was engaged, the board had two questions: why was this happening, and could it be fixed?

 

 

What Berkins Found: A Stack Architecture Problem in Three Dimensions

 

 

No Shared Data Foundation

 

Each of the three AI tools was operating on its own data source, in its own format, governed by its own (or no) data quality protocol. The contract analysis tool was being fed documents in seventeen different formats from four different document management systems, with no standardisation. The compliance monitoring tool was accessing transaction data from two legacy systems that had never been reconciled — meaning alerts were being generated for discrepancies that were artefacts of system inconsistency, not actual compliance issues. There was no shared data governance, no lineage documentation, and no quality validation layer.

 

No Integration with Downstream Workflows

 

The outputs from all three AI tools were being delivered as standalone reports or alerts — not integrated into the workflows that fee earners, compliance officers, and client managers actually used daily. Accessing an AI output required navigating to a separate interface, interpreting the output without context, and then manually acting on it in the primary workflow system. The friction was high enough that most users simply reverted to their established manual processes. The tools were producing outputs that nobody was acting on.

 

No Monitoring, Governance, or Feedback Loop

 

None of the three tools had monitoring in place. There was no systematic tracking of output accuracy, model drift, user adoption, or business impact. The contract analysis tool's accuracy had degraded significantly since its initial deployment as the firm's contract formats evolved — but because no one was tracking output quality, this went undetected for eleven months. The compliance tool's false positive rate had been high from day one but had never been measured, leading to the alert-suppression behaviour that had turned it into a quarterly rather than a real-time compliance instrument.

 

The Berkins Approach: Architecture Before More Tools

 

The Berkins recommendation was counterintuitive but evidence-based: before deploying any additional AI capabilities, the firm needed to build the architectural foundation on which those capabilities could actually function. That meant tackling three things in sequence, not in parallel.

 

Data Foundation (Layer 1)

 

Designed and implemented a unified document ingestion pipeline with format normalisation, quality validation, and lineage tracking across all source systems. Established a single, governed data store as the input source for all three AI tools. Reconciled the two legacy compliance data systems into a single source of truth with real-time synchronisation.
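As an illustration of the normalisation step only, the sketch below maps documents arriving in different shapes from different source systems into one canonical record before they enter the governed store. The source names and the canonical schema are hypothetical.

```python
# Illustrative only: normalising documents that arrive in different shapes from
# different source systems into one canonical record before they enter the
# single governed store. Source names and the canonical schema are hypothetical.

CANONICAL_FIELDS = ("document_id", "matter_id", "source_system", "body_text")

def from_dms_primary(raw: dict) -> dict:
    return {
        "document_id": raw["DocRef"],
        "matter_id": raw["MatterNo"],
        "source_system": "dms_primary",
        "body_text": raw["Content"],
    }

def from_email_archive(raw: dict) -> dict:
    return {
        "document_id": raw["message_id"],
        "matter_id": raw.get("matter", "UNASSIGNED"),
        "source_system": "email_archive",
        "body_text": raw["plain_text"],
    }

NORMALISERS = {"dms_primary": from_dms_primary, "email_archive": from_email_archive}

def normalise(raw: dict, source: str) -> dict:
    """Convert a raw document into the canonical schema, rejecting incomplete records."""
    record = NORMALISERS[source](raw)
    missing = [f for f in CANONICAL_FIELDS if not record.get(f)]
    if missing:
        raise ValueError(f"record from {source} missing: {missing}")
    return record

if __name__ == "__main__":
    print(normalise({"DocRef": "D-77", "MatterNo": "M-9", "Content": "Engagement letter..."}, "dms_primary"))
```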

 

Workflow Integration (Layer 5)

 

Redesigned the output delivery architecture for all three tools — embedding AI outputs directly into the fee earner, compliance, and client management workflows rather than requiring separate platform access. Built lightweight integration APIs between each AI tool and the firm's primary matter management system, compliance platform, and client reporting portal.
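The sketch below suggests what such a lightweight integration endpoint might look like, using FastAPI purely as an example framework. The routes, field names, and matter identifiers are hypothetical, and the AI tool's output store is stubbed.

```python
# Illustrative only: a lightweight integration endpoint that surfaces an AI
# tool's output inside the workflow system users already work in, instead of
# requiring a separate interface. All names and routes are hypothetical.
from fastapi import FastAPI

app = FastAPI()

# Stand-in for the AI tool's output store; in practice this would be a call
# to the contract-analysis service or its database.
ANALYSIS_CACHE = {
    "MATTER-1042": {
        "risk_flags": ["unlimited liability clause", "missing governing-law clause"],
        "confidence": 0.87,
        "model_version": "contract-analysis-v3",  # recorded for auditability
    }
}

@app.get("/matters/{matter_id}/contract-analysis")
def contract_analysis_for_matter(matter_id: str) -> dict:
    """Return the latest AI analysis in the shape the matter-management UI expects."""
    analysis = ANALYSIS_CACHE.get(matter_id)
    if analysis is None:
        return {"matter_id": matter_id, "status": "no analysis available"}
    return {"matter_id": matter_id, "status": "ready", **analysis}

# Run locally (assumes uvicorn is installed): uvicorn integration_api:app --reload
```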

 

Monitoring & Governance (Layer 6)

 

Established output accuracy tracking, drift monitoring, and user adoption dashboards for all three tools. Implemented a feedback loop allowing fee earners to flag incorrect contract analysis outputs, creating a retraining signal that improved model accuracy by 34% over the first six months. Set monthly review cadences for each tool's performance metrics with accountability at the partner level.
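A feedback loop of this kind can be very simple at its core: capture each reviewer verdict as a labelled example, report the observed accuracy, and reuse the corrections as retraining data. The sketch below is a minimal, hypothetical version of that capture step; the schema and storage are stand-ins.

```python
# Illustrative only: a minimal feedback loop that captures user corrections as
# labelled examples, which double as an accuracy metric and a retraining
# signal. Schema and storage are hypothetical.
import json
from datetime import datetime, timezone
from pathlib import Path

FEEDBACK_LOG = Path("contract_analysis_feedback.jsonl")

def record_feedback(output_id: str, correct: bool, corrected_value: str | None = None) -> None:
    """Append a fee earner's verdict on one AI output to the feedback log."""
    entry = {
        "output_id": output_id,
        "correct": correct,
        "corrected_value": corrected_value,
        "flagged_at": datetime.now(timezone.utc).isoformat(),
    }
    with FEEDBACK_LOG.open("a") as fh:
        fh.write(json.dumps(entry) + "\n")

def observed_accuracy() -> float:
    """Share of reviewed outputs marked correct, reported on the monthly dashboard."""
    entries = [json.loads(line) for line in FEEDBACK_LOG.read_text().splitlines() if line]
    return sum(e["correct"] for e in entries) / len(entries) if entries else float("nan")

if __name__ == "__main__":
    record_feedback("OUT-3391", correct=False, corrected_value="termination notice is 60 days")
    record_feedback("OUT-3392", correct=True)
    print(f"observed accuracy: {observed_accuracy():.0%}")
```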

 

Agentic Orchestration (Layer 3)

As a Phase 2 initiative, Berkins designed and deployed an orchestration layer connecting all three AI tools into a unified workflow, allowing a single matter to flow through contract analysis, compliance screening, and client report generation without manual handoffs between systems. This created the integrated AI system that the original three point solutions had never been.
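As a sketch of that handoff pattern rather than of the deployed system, the example below chains three stand-in stages for the three tools into one sequential workflow, with an early escalation path where compliance review is required. All names, data shapes, and rules are hypothetical.

```python
# Illustrative only: the unified matter workflow as a simple sequential
# orchestration, with each stage standing in for one of the three existing AI
# tools. Stage names, data shapes, and escalation rules are hypothetical.
from typing import Callable

def contract_analysis(matter: dict) -> dict:
    matter["risk_flags"] = ["indemnity cap missing"]  # stand-in for tool 1
    return matter

def compliance_screening(matter: dict) -> dict:
    matter["compliance_hold"] = "sanctions" in matter.get("jurisdiction_notes", "")  # tool 2
    return matter

def client_report(matter: dict) -> dict:
    matter["report"] = (
        f"Matter {matter['matter_id']}: {len(matter['risk_flags'])} risk flag(s) identified."
    )  # tool 3
    return matter

PIPELINE: list[Callable[[dict], dict]] = [contract_analysis, compliance_screening, client_report]

def run_matter(matter: dict) -> dict:
    """Pass one matter through all three stages without manual handoffs,
    stopping early if a compliance hold requires human review."""
    for stage in PIPELINE:
        matter = stage(matter)
        if matter.get("compliance_hold"):
            matter["status"] = "escalated to compliance officer"
            return matter
    matter["status"] = "report ready"
    return matter

if __name__ == "__main__":
    print(run_matter({"matter_id": "MATTER-1042", "jurisdiction_notes": "standard UK engagement"}))
```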

 

 

 

BERKINS REFLECTION

This firm did not need more AI. It needed its existing AI to work, which meant building the infrastructure around it that had never been built. The tools were sound. The data foundation, integration architecture, and governance layer were absent. In our experience, this is the situation at the majority of enterprises that believe their AI programmes are underperforming: the problem is not the model. It is the stack.

 

 

 

 

 

Six Design Principles for AI Technology Stacks That Compound Over Time

 

The research from Deloitte, McKinsey, PwC, Wharton, and NVIDIA is consistent: AI technology stacks that compound in value over time share a set of design principles that distinguish them from stacks optimised for speed of initial deployment. These principles are not abstract — they are architectural decisions made at each layer of the stack that either enable or constrain future performance.

 

 

 

 

The Agentic AI Horizon: What Forward-Looking Stacks Are Building Now

 

The most forward-looking architectural investments in 2026 are in agentic AI infrastructure — the orchestration layer that transforms AI from a question-and-answer system into an autonomous executor of complex, multi-step business processes. NVIDIA's 2026 survey found that enterprise applications featuring task-specific AI agents will jump from less than 5% in 2025 to 40% by the end of 2026. Gartner predicts that 15% of day-to-day work decisions will be made autonomously by 2028, up from 0% in 2024.

The organisations building this infrastructure now — investing in agent orchestration frameworks, persistent memory layers, multi-agent coordination systems, and the governance frameworks specific to autonomous AI — are positioning themselves to capture that capability as it matures, rather than scrambling to retrofit their stacks when agentic AI becomes a competitive necessity.

PwC's description of mature agentic deployment is instructive: agents are tested before deployment with working demos for future users, rolled out as part of redesigned workflows with clearly articulated human oversight steps, monitored by other agents checking their work, and governed through a centralised platform with a shared library of templates and tools. That is not a technology description. It is an architecture and governance description — and it is the blueprint for any enterprise serious about capturing agentic AI returns.

 

 

 

 

Diagnosing Stack Maturity Before Investing in More Capability

 

Berkins Consulting's AI Stack Readiness Framework is a diagnostic tool used at the start of every AI transformation engagement. It assesses the maturity of each stack layer against the production requirements of the organisation's specific use cases — identifying the gaps that are suppressing ROI before any new investment is made.

The framework evaluates six dimensions, corresponding to the six layers of the enterprise AI stack:

 

 

 

 

 


What Separates the 34% from the 66% Is Not Ambition — It Is Architecture

 

The gap between the 34% of organisations that have scaled AI into production and the 66% still waiting to is not a gap in ambition, investment, or technical sophistication. The organisations in the second group are spending more on AI, not less. They have licensed more tools, run more pilots, and generated more strategy documents. What they have not done is build the infrastructure — the layered, integrated, governed AI technology stack — that turns those investments into durable business returns.

The global AI market stands at $434 billion in 2026. Enterprise AI investment is growing at nearly 19% annually. GenAI adoption has gone from 6% of enterprises in 2023 to 30% in 2025 — a fivefold increase in two years. The acceleration is real, and it is not slowing. The organisations that are capturing value from it are those that treated architecture as a strategic asset and built stacks designed for production from the outset — not as a retrofit after the pilots were done.

Building a production-grade AI technology stack is not glamorous work. It does not generate the kind of press releases that a new model deployment does. It requires patient investment in data foundations, governance frameworks, integration architecture, and monitoring infrastructure that are rarely visible to anyone outside the engineering and data teams building them. But it is this work — more than any model, any tool, or any pilot — that determines whether AI delivers speed, scale, and ROI at the enterprise level.

That is the work Berkins Consulting does. And it is the work that makes everything else possible.

 

 

ABOUT BERKINS CONSULTING

Berkins Consulting partners with enterprise leaders across financial services, professional services, healthcare, and manufacturing to design and build AI technology stacks that move organisations from pilot-stage investment to production-scale returns. Our Stack Readiness Assessment identifies the architectural gaps suppressing ROI before additional investment is made. If your organisation is ready to close the gap between AI ambition and AI results, we are ready to start.

 
