In May 2026, McKinsey published "The end of ERP as we know it? Five ways AI is disrupting ERP."1 Its thesis is that AI is restructuring the systems that run finance, supply chain and operations: instead of people navigating screens, networks of autonomous agents act on top of the system — what McKinsey calls a "headless, agentic" architecture. The upside it cites is real and large: AI agents "have the potential to reduce the effort needed to implement ERP systems by at least 50 percent and cut down program duration by half," and early adopters report "EBIT improvements of 5 percent or more."1
That is the headline. But the article's first and most important shift is not about speed at all. McKinsey names it "value mission control," and states the principle plainly:
Read that twice, because it inverts how most organizations think about automation. With a deterministic process you can assume the value once and forget it. With an autonomous agent you cannot — you have to measure the value continuously, per agent, against a baseline, or you genuinely do not know whether the thing is helping or quietly costing you money. The rest of this piece is about why that is hard, what the evidence says happens when you skip it, and what it takes to do it right.
The evidenceWhy "it works" is not the same as "it creates value"
The uncomfortable part of McKinsey's own analysis is the base rate. In the same article it reports that "just 25 to 35 percent of large tech programs achieve their targeted EBITDA and cash-flow impact, while 65 to 80 percent exceed their planned budget or timeline."1 Adding autonomy to a system does not automatically move those numbers — and the independent evidence on AI specifically is sobering:
The pattern is consistent across McKinsey, Gartner, MIT and Deloitte: the constraint is not whether the model can do the task. It is whether the organization can prove the value and control the risk. Those are properties of the system around the model — not the model itself.
The riskAn ERP agent acts on money and identity
This is where ERP raises the stakes above a chatbot. An ERP transaction is a payment released, a purchase order raised, a vendor master record changed, a credit limit adjusted. When an agent acts here, it acts on money and identity — and the security field has been explicit about what that requires.
OWASP's Top 10 for LLM Applications (2025) ranks prompt injection as LLM01 — the number-one risk, defined as user-supplied input that "alter[s] the LLM's behavior or output in unintended ways," including content that "need not be human-visible/readable, as long as the content is parsed by the model."3 In an ERP context, that "input" is a vendor note, a free-text memo field, a comment in custom code — any of which a malicious actor could craft to steer an agent. The lesson is blunt: untrusted input must be treated as data, never as instructions.
OWASP's companion risk, Excessive Agency (LLM06), addresses the other half directly. Its recommended mitigation is unambiguous:
This is not a flow8 opinion; it is the consensus of the application-security community, since formalized further in OWASP's dedicated Top 10 for Agentic Applications (December 2025), which introduces the principle of "least agency" — the agentic extension of least privilege.6 And for regulated operations it is no longer merely best practice. It is law.
The lawGovernance and audit are now obligations, not options
For high-risk uses — and the EU AI Act explicitly lists AI that evaluates "the creditworthiness of natural persons or establish[es] their credit score" as high-risk7 — two requirements land squarely on any agentic ERP deployment:
- Human oversight (Article 14). High-risk systems must be designed so they "can be effectively overseen by natural persons," who can "disregard, override or reverse the output" and "interrupt the system through a 'stop' button" — while the design actively guards against automation bias.7
- Record-keeping (Article 12). High-risk systems "shall technically allow for the automatic recording of events (logs) over the lifetime of the system" to ensure traceability.7 High-risk obligations apply from August 2026.
The U.S. NIST AI Risk Management Framework reaches the same destination from a different direction: its four core functions are Govern, Map, Measure, Manage, with Govern described as "a cross-cutting function that is infused throughout AI risk management."8 Govern first, then build. An audit trail and a human-on-every-high-consequence-action are not features you bolt on after a successful pilot — they are the conditions under which the pilot is allowed to exist.
There is a sovereignty dimension too. Cisco's 2025 Data Privacy Benchmark Study found 90% of organizations believe local storage of data is inherently safer, and 64% worry about inadvertently sharing sensitive information with AI systems.9 For the data inside an ERP — financials, customer master data, payroll — "where does it run" is not a preference. It decides whether you are permitted to use AI on the core at all.
The synthesisThis is a platform problem, not a model problem
Put the evidence together and a clear conclusion falls out. The model is not the hard part. The hard part is the system around it: continuous value measurement, treating every input as untrusted, keeping a human on every money-and-identity action, logging all of it immutably, and running it on infrastructure you control. Solve those once, as platform capabilities, and every new use case inherits them. Solve them per-pilot, and you get exactly the graveyard Gartner and MIT describe — impressive demos that never reach production because each one re-litigates security, audit and value from scratch.
That platform stance is the design principle behind flow8. A handful of non-negotiables apply to every automated process it runs, whether reconciling an invoice or triaging an AI use case:
And it runs self-hosted — on-premise, private cloud or air-gapped — so the data inside your ERP never crosses a boundary you don't own, answering the sovereignty concern Cisco quantifies.
flow8 in practiceThe five ERP theses, running as governed flows
We built each of McKinsey's five theses as a concrete flow8 flow. All five are producers writing into one value bus — an agent_actions ledger — so the whole program rolls up to a single, human-reviewed P&L. The architecture, not the prose:
agent_actions ledger. Each prepares and recommends; nothing touches money or identity without a human.agent_actions
The takeawayBuild the governed core once
McKinsey is right that AI changes ERP, and the speed is genuinely transformative. But every serious source — the analysts on value, the security community on injection and agency, the regulators on oversight and logging — converges on the same unglamorous half of the story. The organizations that capture the value will be the ones that measured impact continuously, treated every input as hostile until proven otherwise, kept a human on every money-and-identity decision, and logged all of it on infrastructure they own. Get that governed core right, and each new use case is a fast, safe addition. Get it wrong, and you have simply built a faster way to lose control of the systems your business runs on.
Bring a trending use case in safely.
flow8 is the platform for running AI use cases in a standardized, secure, governed way — measured against a baseline, with a human on every high-consequence decision, on infrastructure you own.
Talk to our team →Sources
- McKinsey & Company, "The end of ERP as we know it? Five ways AI is disrupting ERP," McKinsey Technology, May 2026. mckinsey.com
- Gartner, "Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027," press release, June 25, 2025. gartner.com
- OWASP, "LLM01:2025 Prompt Injection," OWASP Top 10 for LLM Applications 2025. genai.owasp.org
- MIT NANDA, "The GenAI Divide: State of AI in Business 2025," July 2025; coverage via Fortune, Aug 18, 2025. fortune.com
- Deloitte, "State of Generative AI in the Enterprise," Wave 4, Jan 21, 2025. deloitte.com
- OWASP GenAI Security Project, "Top 10 Risks & Mitigations for Agentic AI," Dec 9, 2025. genai.owasp.org
- EU AI Act — Article 14 (Human Oversight), Article 12 (Record-keeping), Annex III §5(b). artificialintelligenceact.eu/article/14 · article/12
- NIST, "Artificial Intelligence Risk Management Framework (AI RMF 1.0)," NIST AI 100-1, Jan 2023. nist.gov
- Cisco, "2025 Data Privacy Benchmark Study," Apr 2, 2025. newsroom.cisco.com
- Informatica, "CDO Insights 2024" (42% of data leaders cite data quality as the main obstacle to GenAI adoption), Jan 31, 2024. informatica.com