The ERP that acts: putting a control tower between the agent and the ledger

2027

Gartner: 40%+ of agentic-AI projects cancelled by year-end — costs, unclear value, weak controls²

LLM01

prompt injection — OWASP's #1 risk for LLM applications, and an ERP is full of untrusted text³

Art. 14

of the EU AI Act: high-risk systems must be overseeable — override, reverse, stop⁷

In April 2026, Roland Berger published "AI and ERP: A roadmap to harness AI's capabilities and turbocharge your ERP" — circulated under the sharper title "Rivals or best friends?"¹ Its argument is a clean reframe of a debate usually staged as a fight. The ERP does not get replaced by AI; it stays the certified system of record — the auditable, controlled place where the numbers legally live. What changes is the layer on top: the interface becomes a set of intelligent agents, and the ERP shifts from a passive ledger you navigate into, in the authors' words, a "system that acts."

"By layering AI capabilities and advanced reasoning on top of the ERP foundation, the system evolves from a passive ledger into an active business partner."— Christina Gröger, Roland Berger, "AI and ERP", April 2026

That shift rides on three drivers the article lays out: processes move from manual to autonomous; AI becomes the primary interface to the system, conversational and increasingly headless; and — the one most coverage treats as a footnote — the AI has to be trusted and auditable. Roland Berger is blunt about that third driver: "Automation without accountability is a liability."¹ Every AI action, it insists, must stay transparent, compliant and auditable.

That principle is where this piece picks up. A "system that acts" is exciting right up until you ask the obvious question: acts on what, and who checks it? In an ERP the answer is payments, orders, journal postings and vendor records — and the moment an agent can touch those, the interesting engineering is no longer the agent. It is the gate the agent has to pass through before anything executes. Call it a control tower: AI proposes, a deterministic gate validates, a human approves. That pattern is not Roland Berger's coinage — but as the rest of this piece shows, every serious authority on agentic AI, from Gartner to OWASP to the EU, converges on it.

The evidenceWhy "it can act" is not "it is safe to act"

The uncomfortable base rate is that giving software more autonomy does not, by itself, make it work out. The most-cited number in the agentic-AI debate is Gartner's: it predicts over 40% of agentic-AI projects will be cancelled by the end of 2027, citing "escalating costs, unclear business value, or inadequate risk controls."² Read the three causes again — none of them is "the model wasn't smart enough." All three are properties of the system around the model: cost discipline, value measurement, and risk control. That is exactly the control tower.

40%+Gartner, 2025

Over 40% of agentic-AI projects cancelled by end of 2027 — for "escalating costs, unclear business value, or inadequate risk controls."² Governance and value failures, not capability failures.

~5%MIT NANDA, 2025

An MIT NANDA study found roughly 95% of organizations getting "zero return" on enterprise GenAI, with only ~5% of integrated pilots extracting real value.⁴ (A small, self-selected, non-peer-reviewed sample — and publicly contested — so we cite the precise framing, not the folk "95% of pilots fail.")

42%Informatica, 2024

Among data leaders, 42% cite data quality as the single biggest obstacle to adopting GenAI.⁶ An ERP "system that acts" is only as trustworthy as the records it acts on — and most cores are not clean.

The pattern is consistent: the constraint is rarely whether the model can read the purchase order or spot the variance. It is whether the organization can prove the value and control the risk of letting it act. Roland Berger's own framing agrees — which is why the article's load-bearing idea is not the agent, but the gate the agent has to pass through.

The riskAn ERP agent acts on money and identity

This is where ERP raises the stakes far above a chatbot. An "action" in this world is a payment released, a purchase order raised, a vendor master record changed, a journal posted, a credit limit adjusted. When an agent acts here, it acts on money and identity — and the security field has been explicit about what that requires.

OWASP's Top 10 for LLM Applications (2025) ranks prompt injection as LLM01 — the number-one risk, defined as user-supplied input that "alter[s] the LLM's behavior or output in unintended ways," including content that "need not be human-visible/readable, as long as the content is parsed by the model."³ Roland Berger's own headline example is order-to-cash: an agent reads an inbound customer purchase order and drafts the confirmation. But that purchase order is untrusted text from outside your walls — and so is a vendor's contract, a ledger memo, a logistics note. Any of them can be crafted to steer the agent. The lesson is blunt: untrusted input must be treated as data, never as instructions.

OWASP's companion risk, Excessive Agency (LLM06), addresses the other half directly. Its recommended mitigation is unambiguous:

"Utilise human-in-the-loop control to require a human to approve high-impact actions before they are taken."— OWASP Top 10 for LLM Applications, LLM06: Excessive Agency

That is the control tower, in the words of the security community — not a flow8 opinion. OWASP has since formalized the agentic case in a dedicated Top 10 for Agentic Applications (December 2025), which carries the principle of least privilege straight into agent design: an agent should hold only the goals, tools and data it needs.⁵ For an ERP, least privilege means the agent gets to propose — and only a human gets to execute. And for regulated operations, that is no longer just good practice. It is law.

The lawThe gate is a legal obligation, not a nicety

An agentic ERP routinely touches high-risk territory — the EU AI Act explicitly lists AI used to evaluate "the creditworthiness of natural persons or establish their credit score" as high-risk⁷ — and two of its requirements land squarely on the control tower:

Human oversight (Article 14). High-risk systems must be designed so they "can be effectively overseen by natural persons," who can "disregard, override or reverse the output" and "interrupt the system through a 'stop' button" — while the design actively guards against automation bias.⁷ That is the approval gate, written into statute.
Record-keeping (Article 12). High-risk systems "shall technically allow for the automatic recording of events (logs) over the lifetime of the system" to ensure traceability.⁷ That is the immutable ledger of what was proposed, gated and decided. High-risk obligations apply from August 2026.

The U.S. NIST AI Risk Management Framework reaches the same destination from a different direction: its four core functions are Govern, Map, Measure, Manage, with Govern described as "a cross-cutting function that is infused throughout AI risk management."⁸ Govern first, then build. A human on every high-consequence action and a full audit trail are not features you bolt on after a successful pilot — they are the conditions under which the pilot is allowed to exist at all.

There is a sovereignty dimension too. Cisco's 2025 Data Privacy Benchmark Study found 90% of organizations believe local storage of data is inherently safer, and 64% worry about inadvertently sharing sensitive information with AI systems.⁹ For the data inside an ERP — financials, customer master data, payroll — "where does it run" is not a preference. It decides whether you are permitted to point AI at the core at all.

The synthesisThe control tower is a platform capability

Put the evidence together and a clear conclusion falls out. The exciting part — the agent that reads the PO or reconciles the close — is the easy half. The hard half is the control tower every agent has to pass through: treat every input as untrusted, keep a human on every money-and-identity action, validate against deterministic rules before execution, and log all of it immutably on infrastructure you control. Build that once, as a platform capability, and every new agent inherits it. Rebuild it per-pilot, and you get exactly the graveyard Gartner describes — impressive demos that never reach production because each one re-litigates security, approval and audit from scratch.

That platform stance is the design principle behind flow8. A handful of non-negotiables apply to every automated process it runs — whether drafting an order confirmation or proposing a journal posting:

🛑 Never auto-act on money or identity Every producer flow prepares and recommends — it never executes. A high-consequence action fails safe to a drafted proposal plus a flag; a named human approves it at the gate. OWASP LLM06 and EU AI Act Article 14, operationalized.

🧪 Untrusted input is data, not instructions Every ingested text — a customer PO, a vendor contract, a ledger memo, a logistics note — is injection-scanned before any model acts on it. A poisoned input is blocked at the gate, not shown to a human as actionable. A direct answer to OWASP LLM01.

📋 One ledger, one immutable record Every proposed action has a stable key, written before the side-effect and confirmed after, so a re-run never double-acts. Each one is logged with its rule verdicts and decision — attributable and replayable. EU AI Act Article 12 by construction.

⚖️ A deterministic gate, not an LLM, decides The control tower is pure rule code: amount-over-threshold, hold-list, compliance flag, low-confidence, injection. No model decides whether to act — it only ever proposes. Least privilege, applied to agents.

And it runs self-hosted — on-premise, private cloud or air-gapped — so the data inside your ERP never crosses a boundary you don't own, answering the sovereignty concern Cisco quantifies.

flow8 in practiceOne control tower, four agents that propose

We built Roland Berger's "system that acts" as five concrete flow8 flows. Four are producers — order-to-cash, finance close, procurement risk, supply-chain watch — and each one only ever drafts a recommendation into one shared action ledger (a proposed_actions table). The fifth flow is the control tower: it reads every proposal, runs the deterministic policy gate, and opens exactly one human approval task per action. The architecture, not the prose:

Four self-hosted producer flows, one shared proposed_actions ledger. Each drafts and recommends; the control-tower gate decides; nothing touches money or identity without a human.

📨 Order-to-cash PO email → drafted confirmation OCR · extract

📒 Finance close match → journal proposal IFRS/GAAP rule

🏷️ Procurement risk supplier score · maverick spend BM25 + AI

🚚 Supply-chain watch anomaly → corrective plan graph BFS

🧱 System of record the ERP stays the ledger read · never auto-write

Control-tower gate · proposed_actions deterministic rules · injection block · idempotent · audit-logged

👤 Human-gated One approval task per action propose → a person decides → execute

Self-hosted · no data egress 185+ audited modules Never auto-acts on money/identity Add a 5th agent — same gate, no rework

The question is no longer "can AI act on the ERP?" It is "can we stop it from acting when it shouldn't, and prove what it did?" Every credible authority — Gartner, OWASP, NIST, the EU — agrees that is the real test. It is a platform answer, not a model answer.

The takeawayBuild the control tower once

Roland Berger is right that AI turns the ERP from a passive ledger into a system that acts, and the shift is genuinely transformative. But the same article is just as clear that automation without accountability is a liability — and every serious authority agrees on the unglamorous half of the story. The organizations that capture the upside will be the ones that treated every input as hostile until proven otherwise, put a deterministic gate in front of every action, kept a human on every money-and-identity decision, and logged all of it on infrastructure they own. Build that control tower once, as a platform capability, and each new agent is a fast, safe addition. Skip it, and you have simply built a faster way to lose control of the systems your business runs on.

On the framing: the "system of record → system that acts" thesis, the three drivers (manual→autonomous, AI-as-interface, trusted/auditable AI), and the quotes "an active business partner" and "automation without accountability is a liability" are drawn from Roland Berger's April 2026 article.¹ The "control tower" name and the "AI proposes, a human approves" pattern are flow8's framing — independently grounded in OWASP LLM06 and EU AI Act Article 14, cited below — not direct quotes from that piece. The five-flow implementation is our own.

Let your ERP act — without losing the controls.

flow8 is the platform for running agentic ERP use cases in a standardized, secure, governed way — every action through one control tower, a human on every high-consequence decision, on infrastructure you own.

Talk to our team →

Sources

Roland Berger (E. Goos, C. Gröger, R. Seidel), "AI and ERP: A roadmap to harness AI's capabilities and turbocharge your ERP" (a.k.a. "Rivals or best friends?"), April 9, 2026. rolandberger.com
Gartner, "Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027," press release, June 25, 2025. gartner.com
OWASP, "LLM01:2025 Prompt Injection," OWASP Top 10 for LLM Applications 2025. genai.owasp.org
MIT NANDA, "The GenAI Divide: State of AI in Business 2025," 2025; coverage via Fortune, Aug 18, 2025. fortune.com
OWASP GenAI Security Project, "OWASP Top 10 for Agentic Applications," Dec 9, 2025 (LLM06:2025 Excessive Agency, and least-privilege for agents). genai.owasp.org · LLM06
Informatica, "CDO Insights 2024" — 42% of data leaders cite data quality as the main obstacle to GenAI adoption, Jan 31, 2024. informatica.com
EU AI Act — Article 14 (Human Oversight), Article 12 (Record-keeping), Annex III (incl. creditworthiness/credit scoring); high-risk obligations apply from Aug 2, 2026. artificialintelligenceact.eu/article/14 · article/12
NIST, "Artificial Intelligence Risk Management Framework (AI RMF 1.0)," NIST AI 100-1, Jan 2023. nist.gov
Cisco, "2025 Data Privacy Benchmark Study," Apr 2, 2025 (90% see local storage as inherently safer; 64% worry about sharing sensitive data). newsroom.cisco.com

← All insights