Disagreements in Recent Research on Runtime Governance of AI Applications
Shared conversation from DR-Tulu Research Assistant (DR Tulu 8B, 1.9k step)
What are the main disagreements in recent papers on runtime governance of AI applications?
What "runtime governance" means and where enforcement happens
Recent work broadly defines runtime governance as enforcing policies at the moment an agent or service takes an action, rather than only pre-deployment, "so governance operates where the system meets users," with controls over identity/entitlements, execution, egress, and audit evidence of what was allowed or blocked.[1, 4, 5, 8] A key scholarly line formalizes the "execution path" as the central object for governance and expresses compliance policies deterministically over these paths, situating governance in the concrete, non-stochastic sequence of operations an agent executes.[11] This path-centric view contrasts with pre-deployment "model safety" approaches and positions runtime control as essential for agentic systems, where goals, tools, and context evolve dynamically.[1, 15]
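The path-centric view can be made concrete with a minimal sketch (the policy, tool names, and resource names below are invented for illustration, not drawn from any cited framework): a compliance policy is a deterministic predicate over the ordered sequence of operations an agent executes, so the same path always yields the same verdict.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    tool: str    # e.g. "read_db", "send_email"
    target: str  # resource or destination acted on

def policy_no_egress_after_pii(path: list[Step]) -> bool:
    """Deterministic path policy: once any step touches the PII
    store, no later step may send data outside the boundary."""
    touched_pii = False
    for step in path:
        if step.target == "pii_store":
            touched_pii = True
        if touched_pii and step.tool == "send_email":
            return False  # violation: egress after PII access
    return True  # the whole path complies

ok_path = [Step("send_email", "user"), Step("read_db", "pii_store")]
bad_path = [Step("read_db", "pii_store"), Step("send_email", "user")]
```

Because the verdict depends only on the concrete path, the same check can run as a gate during execution or replay over audit logs afterward with identical results.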
Disagreements arise over where exactly to place enforcement gates (per user/identity, per goal/permission, per tool call, per output, or at egress) and whether to intercept and block (hard enforcement) or monitor and score (soft signals). Industry frameworks and toolkits advocate embedding policy engines, identity, and sandboxing at runtime, with emphasis on controlling state transitions along execution paths, while some posts warn that governance must reach the actual execution path to be effective.[2, 22, 9]
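The hard-versus-soft distinction can be sketched as a single gate at the tool-call boundary (the scorer, tool names, and threshold are hypothetical): the same risk check either blocks the call outright or allows it while emitting a signal for later review.

```python
def risk_score(tool: str, args: dict) -> float:
    """Toy scorer: treat any outbound network call as risky."""
    return 0.9 if tool == "http_post" else 0.1

def gate(tool: str, args: dict, mode: str, threshold: float = 0.5) -> dict:
    """mode='enforce' blocks calls above the threshold (hard
    enforcement); mode='monitor' lets them through but still
    reports the score (soft signal)."""
    score = risk_score(tool, args)
    if score > threshold and mode == "enforce":
        return {"allowed": False, "score": score}
    return {"allowed": True, "score": score}
```

The design choice the debate turns on is visible in the one `mode` branch: everything else (scoring, telemetry) is shared between the two postures.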
Scope of controls: identity/authorization vs content/harm moderation vs execution/egress
A major dispute centers on the relative priority and integration of different control classes. One camp emphasizes identity and zero-trust authorization as the foundation, using goal-aware permissions and delegation safety to adapt rights to changing objectives during runtime, alongside continuous authorization, agent-semantic telemetry, and conformance checks.[39, 40, 4] Another strand elevates content and behavior moderation (blocking outputs, tool invocations, or external actions) to reduce harm, often combining built-in provider filters, output classifiers, and guardrails that sit between the model and the system to validate inputs/outputs and enforce formats.[26, 33, 34]
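Goal-aware authorization can be illustrated with a minimal deny-by-default sketch (the goal-to-permission mapping and tool names are invented for illustration): the set of permitted tools is derived from the agent's currently declared objective, so rights shrink or grow as the goal changes at runtime.

```python
# Hypothetical mapping from declared goal to permitted tools.
GOAL_PERMISSIONS: dict[str, set[str]] = {
    "summarize_report": {"read_docs"},
    "file_expense":     {"read_docs", "write_ledger"},
}

def authorize(goal: str, tool: str) -> bool:
    """Deny by default: a tool call is allowed only if the
    agent's current goal explicitly grants that tool."""
    return tool in GOAL_PERMISSIONS.get(goal, set())
```

An agent that switches from `file_expense` to `summarize_report` mid-session silently loses `write_ledger`, which is the continuous-authorization behavior the identity-first camp argues for.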
A related disagreement concerns egress and execution isolation: should runtime governance restrict tool calls and external actions (network/OS), or rely on monitoring and retrospective audit? Security-oriented posts argue for enforceable controls at egress and sandboxed execution with clear audit trails, while others treat audit as secondary to preventive enforcement.[4] This reflects a tension between secure-design approaches (deny by default, least privilege) and observability-first approaches (allow and audit, monitor then act).
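The deny-by-default side of this tension can be sketched as an egress allowlist (the hostnames are placeholders): any outbound destination not explicitly approved is blocked, and every decision, allowed or not, is appended to an audit trail.

```python
ALLOWED_HOSTS = {"api.internal.example"}  # placeholder allowlist
audit_log: list[dict] = []

def egress_check(host: str) -> bool:
    """Deny-by-default egress control: block hosts not on the
    allowlist and record every decision as audit evidence."""
    allowed = host in ALLOWED_HOSTS
    audit_log.append({"host": host, "allowed": allowed})
    return allowed
```

The observability-first alternative would flip the default to allow and rely on the same `audit_log` for retrospective review; the log structure is common ground between the two camps.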
Decision logic: deterministic policy vs adaptive learning-based control vs human-in-the-loop
Another core disagreement is how governance decisions are made. "Governance-in-the-Loop" and path-policy work propose non-bypassable, deterministic enforcement embedded into execution, formalized along agent paths and aligned to explicit policy semantics.[20, 11] By contrast, "Adaptive Runtime Governance" argues that authorization and policies must adapt to behavior drift, adversarial adaptation, and shifting decision patterns: authorized agents can still become unsafe, so it promotes continuous monitoring and dynamic interventions to bound unobserved risk, articulating an "Informational Viability Principle."[35, 19, 37]
This tension manifests in practical guardrail choices: conservative, high-sensitivity filters reduce harms but increase false positives and latency, while more adaptive, learning-based moderators might improve coverage but require drift management and careful calibration; empirical reports note roughly 12% false positive rates on runtime filters, sensitivity to threshold choices, and guardrail drift.[25, 27, 29, 31] Some platforms centralize filtering inside providers (e.g., built-in content moderation), while others layer application-level guardrails that can override or complement provider policies, creating friction over ownership and consistency of enforcement across layers.[33, 34, 31]
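The threshold sensitivity noted above can be made concrete with a toy calculation (the risk scores are fabricated): lowering a guardrail's blocking threshold widens coverage of harmful content but directly inflates the false positive rate measured on benign traffic.

```python
def false_positive_rate(benign_scores: list[float], threshold: float) -> float:
    """Fraction of benign items a filter would block at a
    given risk-score threshold."""
    blocked = sum(1 for s in benign_scores if s >= threshold)
    return blocked / len(benign_scores)

# Fabricated risk scores assigned to known-benign responses.
benign = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.15, 0.25, 0.35, 0.05]
```

At a threshold of 0.5 this toy filter blocks 2 of 10 benign items; dropping the threshold to 0.3 blocks 5 of 10, which is the kind of threshold-driven swing the cited empirical reports describe.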
Operational deployment: monitoring vs preventive enforcement; telemetry/audit standards; canary/rollback
There is also disagreement over operational practices for rolling out and rolling back governance. Integrated frameworks emphasize continuous authorization, telemetry, and conformance checking over time and path, suggesting coordinated mechanisms (risk index, goal-aware permissions, state conformance, and monitoring) to manage agentic workflows in production.[39, 40, 41] Industry posts advocate "shift-left governance," embedding policy engines, zero-trust identity, and sandboxing early in the runtime stack, which implies testing and validating governance policies during deployment rather than only at model review time.[2, 9]
A recurring tension concerns the balance between proactive prevention and reactive monitoring. Path-policy and GiL work emphasizes deterministic enforcement as the primary control, with monitoring providing evidence and conformance, whereas adaptive governance places monitoring at the center for continuous estimation of unobserved risk, allowing operations to dynamically alter permissions or block trajectories absent operator action.[11, 20, 36, 37] On telemetry and audit, some sources push for standardized, governance-semantic telemetry and continuous recording to enable audit and conformance, while others treat telemetry as ancillary to runtime enforcement engines or present open frameworks that unify detection and guardrails without prescriptive standards.[40, 32]
Canary/rollback strategies reflect a deeper disagreement about how much autonomy to grant policies and controls in production. Integrated frameworks call for temporal and path conformance checking to detect deviations and unsafe patterns during rollout, enabling immediate runtime adjustments or rollbacks; practical reports stress the need to compare expected versus intervened trajectories to make rollback decisions explicit.[41, 36]
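A conformance-driven rollback decision can be sketched as follows (the state names, deviation metric, and tolerance are illustrative): during a canary rollout, observed state transitions are compared position by position against an expected path, and exceeding a deviation tolerance triggers rollback.

```python
def should_rollback(expected: list[str], observed: list[str],
                    max_deviations: int = 1) -> bool:
    """Path conformance check: count positions where the observed
    state differs from the expected one; roll back once the count
    exceeds the configured tolerance."""
    deviations = sum(1 for e, o in zip(expected, observed) if e != o)
    return deviations > max_deviations

expected = ["plan", "fetch", "validate", "write"]
ok       = ["plan", "fetch", "validate", "write"]
drifted  = ["plan", "fetch", "exfiltrate", "delete"]
```

Whether the tolerance is fixed (deterministic posture) or continuously re-estimated from telemetry (adaptive posture) is exactly the disagreement the surrounding sources describe.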
Measured trade-offs: latency, false positives/negatives, guardrail drift, and coverage
There is converging but contested evidence on the operational cost of runtime governance. Empirical reports document nontrivial false positive rates (e.g., ~12%) for output/route filters, which can degrade user experience and throughput, and warn that overly aggressive guardrails are common across systems; at the same time, measured latency and accuracy impacts vary across providers and configurations, motivating unified guardrail platforms and configurable policies.[25, 29, 27] Guardrail drift is another point of practical disagreement: some argue that guardrails must be continuously recalibrated to maintain safe coverage, while others embed them as deterministic, path-aligned policies that reduce drift by design through explicit path semantics.[31, 11]
Summary of main disagreements
- Where to enforce: Identity/authorization and goal-aware permissions versus content/output and tool/egress controls, with path-centric enforcement emerging as a scholarly focal point.[11, 39]
- How to decide: Deterministic, non-bypassable enforcement versus adaptive, monitoring-driven control that adjusts permissions and allows interventions based on bounded unobserved risk.[20, 37]
- Scope priorities: Security/zero-trust, authorization-first models versus harm-avoidance guardrails and moderation, with ongoing debate about the appropriate balance and integration.[4, 33]
- Operational practices: Continuous monitoring and telemetry/audit for conformance versus preventive, policy-first architectures, and rollout/rollback strategies shaped by whether governance is adaptive or deterministic.[40, 41]
- Trade-offs: Latency, false positives/negatives, and guardrail drift versus coverage and safety; empirical studies show sensitivity to thresholds and configurations, motivating open, configurable guardrail systems.[25, 27, 29, 32]
Where consensus is emerging
There is growing agreement that governance must live at runtime in agentic systems, with formal semantics for paths and states, combined telemetry and enforcement, and mechanisms to handle drift and escalation. Industry frameworks and scholarly proposals converge on embedding identity, authorization, monitoring, and audit into the runtime stack, even if they diverge on the degree of determinism versus adaptivity and on which control class to prioritize first.[1, 11, 4, 39, 40]
Sources (42)
As AI systems evolve into enterprise agents, governance must shift from model safety to runtime control. Learn how policy enforcement, ...
blogs.oracle.com
Discover how the Microsoft Agent Governance Toolkit brings policy, identity, and reliability to autonomous AI agent systems.
opensource.microsoft.com
We develop the formal framework for analyzing AI agent governance, present concrete policy examples (inspired by the AI act), discuss a ...
arxiv.org
Runtime AI governance adds enforceable controls around identity/entitlements, execution, and egress, plus audit evidence for what was allowed or ...
accuknox.com
Runtime governance is the enforcement of policies at the moment an agent acts, not before or after. It gives enterprises the ability to block, ...
prefactor.tech
Built on OpenTelemetry, it's an open-source runtime governance framework for AI that standardizes cost, policy, and compliance telemetry across ...
reddit.com
This three-layer governance framework enables autonomous AI agents to optimize continuously while operating within your brand, compliance, ...
attentive.com
We'd add one thing: the governance that matters most is the governance that operates at runtime, where your software actually meets your users.
getunleash.io
Agentic AI should operate autonomously only inside a governed decision environment where permissions, dependencies, evidence and policy ...
linkedin.com
AI governance platforms support a wide range of AI techniques across built, blended, embedded and bring-your-own-AI applications. Learn More About This Category.
gartner.com
We argue that the execution path is the central object for effective runtime governance and formalize compliance policies as deterministic ...
arxiv.org
We've been building runtime governance infrastructure for AI agents at Kyvvu for a while now, and the conversations we keep having — with ...
linkedin.com
We develop the formal framework for analyzing AI agent governance, present concrete policy examples (inspired by the AI act), discuss a ...
researchgate.net
This paper argues that embodied intelligence requires not only stronger agents, but also stronger runtime governance. We propose a runtime ...
arxiv.org
View recent discussion. Abstract: AI agents -- systems that plan, reason, and act using large language models -- produce non-deterministic, path-dependent ...
alphaxiv.org
We published the findings on arXiv. What Edictum does: Sits between the agent's decision to call a tool and the actual execution. YAML ...
reddit.com
MI9 introduces real-time controls through six integrated components: agency-risk index, agent-semantic telemetry capture, continuous ...
arxiv.org
We introduce Governance-in-the-Loop (GiL), a runtime architecture that embeds non-bypassable policy enforcement directly into AI execution ...
ijaibdcms.org
Autonomous AI agents can remain fully authorized and still become unsafe as behavior drifts, adversaries adapt, and decision patterns shift ...
arxiv.org
We introduce Governance-in-the-Loop (GiL), a runtime architecture that embeds non-bypassable policy enforcement directly into AI execution ...
researchgate.net
Return to Article Details Governance-in-the-Loop: Runtime Policy Enforcement for Autonomous and Distributed AI Systems Download Download PDF.
ijaibdcms.org
Trajectory-Level Controls Governance must monitor the actual execution path of the AI. By treating workflows as controlled state transitions, ...
linkedin.com
Today's report finds that deterministic governance builds on this by adding policy-driven enforcement, bringing structure and predictability to ...
softwareanalyst.substack.com
In part one of this series, we covered AGT's runtime governance: the policy engine, zero-trust identity, execution sandboxing, and the OWASP ...
techcommunity.microsoft.com
We had a 12% false positive rate on that filter. For every 100 legitimate responses, 12 got blocked or regenerated unnecessarily. Users saw ...
kalviumlabs.ai
An AI guardrail is any runtime mechanism that sits between an AI model and the rest of the system to constrain what the model can receive, produce, or cause.
generalanalysis.com
Benchmarks assess guardrails across dimensions such as detection accuracy, false positive rate, latency impact, and coverage of harm ...
truefoundry.com
However, guardrails themselves can fail in two critical ways: by overblocking innocuous content (false positives) or by letting harmful content ...
budecosystem.com
Overly aggressive filtering (false positives): Highly sensitive guardrails across different systems frequently misclassified harmless queries as threats ...
unit42.paloaltonetworks.com
Overly aggressive guardrails can hinder the LLM's usefulness, leading to false positives (blocking legitimate content). Find the right ...
medium.com
Azure OpenAI content filtering has triggered false positive blocks on safe prompts due to conservative thresholds. Guardrail drift happens ...
feeds.trussed.ai
We present OpenGuardrails, the first fully open-source platform that unifies large-model-based safety detection, manipulation defense, and deployable guardrail ...
arxiv.org
Technical guardrails operate at the code level — input validation filters, output classifiers, format enforcement rules, and automated blocking ...
agility-at-scale.com
How to implement LLM guardrails that actually work in production. Covers input validation, output filtering, PII detection, ...
coverge.ai
Abstract:Autonomous AI agents can remain fully authorized and still become unsafe as behavior drifts, adversaries adapt, and decision ...
arxiv.org
To make the value of continuous monitoring explicit, we project the trajectory absent the intervention and report the four events at ...
arxiv.org
This paper proposes The Informational Viability Principle: governing an agent reduces to continuously estimating a bound on unobserved risk and allowing an ...
themoonlight.io
Autonomous AI agents can remain fully authorized and still become unsafe as behavior drifts, adversaries adapt, and decision patterns shift ...
chatpaper.com
The framework enables goal-aware permission management where authorization adapts to changing agent objectives, delegation safety mechanisms that regulate per-.
openreview.net
MI9 provides six coordinated mechanisms: Agency-Risk Index, agent-semantic telemetry, goal-aware authorization monitoring, finite-state conformance engines, ...
arxiv.org
Runtime governance encompasses governance-semantic telemetry, continuous authorisation, temporal and path conformance checking, monitoring ...
papers.ssrn.com
Runtime governance solves the three hardest problems in agentic systems: Drift — agents reinterpret goals. Scale — thousands ...
rampart-ai.com