Proposed Methodology | AI Governance Blog

Once the literature review converges and the qualitative research begins to surface evidence from real deployments, the question will shift from understanding the problem to building something that helps solve it. That something is a methodology called ALiEn, short for Agency Licensing and Enforcement, and the way it will be built matters as much as what it contains.

Why a methodology, and not a framework

The enterprise world is not short on frameworks. There are governance frameworks, risk frameworks, ethics frameworks, and responsible AI frameworks, many of them thoughtful, most of them built to operate at a distance from the actual moment of decision. They tend to live in documents, in review boards, in pre-deployment checklists, and in post-incident audits. They bracket the work rather than inhabiting it.

What this research is trying to produce is something different. Not a set of principles to be consulted before things go live, but a set of practices that operate while things are running. A methodology is chosen deliberately here because the problem is not a lack of good intentions or even a lack of good policy. The problem is that governance loses its grip the moment an agent starts making decisions at speed, under ambiguity, across systems, and nobody in the room can quite explain whether what just happened was aligned with what the organization actually meant.

ALiEn is an attempt to close that gap, not by restricting what agents can do, but by making what they do governable in a continuous, operational, and human-legible way.

The early shape of ALiEn

The methodology is organized around a loop, and the loop is designed to feel familiar to anyone who has ever managed delegation in a serious organization. It mirrors the way enterprises already govern human discretion, but it translates that governance into artifacts, evidence, and correction levers that work for artificial actors.

The loop begins with mandate translation. Before an agent is granted authority, the organization's intent for that domain needs to be expressed in terms the delegation can carry. That means more than goals and KPIs. It means permissions, boundaries, escalation duties, tradeoff rules, red lines, and the interpretive guidance that defines what acting on behalf of the organization means in this particular context. The result is not a prompt or a policy document. It is a delegation charter, a living artifact that specifies what the agent may decide, what it may do, when it must defer, and who answers when something goes wrong.
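To make the idea of a delegation charter more concrete, here is a minimal sketch in Python of what such an artifact might look like as a data structure. ALiEn has not yet defined a charter format, so every field name, the three-way ruling, and the refund example are illustrative assumptions rather than part of the methodology.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Ruling(Enum):
    ALLOW = "allow"        # the agent may act on its own
    ESCALATE = "escalate"  # the agent must defer to a human
    DENY = "deny"          # a red line: never permitted


@dataclass(frozen=True)
class DelegationCharter:
    """Hypothetical charter shape; all field names are illustrative."""
    domain: str
    version: int
    accountable_owner: str             # who answers when something goes wrong
    permitted_actions: frozenset[str]  # what the agent may do
    red_lines: frozenset[str]          # never permitted, whatever the context
    # action -> amount above which the agent must defer
    escalation_limits: dict[str, float] = field(default_factory=dict)

    def rule_on(self, action: str, amount: float = 0.0) -> Ruling:
        # Red lines are checked first so no permission can override them.
        if action in self.red_lines:
            return Ruling.DENY
        # Anything the charter does not mention goes to a human.
        if action not in self.permitted_actions:
            return Ruling.ESCALATE
        limit = self.escalation_limits.get(action)
        if limit is not None and amount > limit:
            return Ruling.ESCALATE
        return Ruling.ALLOW


# Made-up example: a charter for a customer refund agent.
charter = DelegationCharter(
    domain="customer refunds",
    version=1,
    accountable_owner="head-of-support",
    permitted_actions=frozenset({"issue_refund", "send_apology"}),
    red_lines=frozenset({"close_account"}),
    escalation_limits={"issue_refund": 200.0},
)
```

In this sketch, unknown actions escalate instead of being denied. That default is a design assumption: it treats gaps in the charter as questions for a human, not as prohibitions. The interpretive guidance and tradeoff rules described above do not reduce to lookups like these and are deliberately left out.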

The second movement is runtime evidence. Once an agent is operating under delegated authority, governance cannot depend on periodic audits or retrospective reviews alone. The methodology calls for evidence routines that watch decisions and actions the way organizations have always watched delegates, through patterns, consequences, and observable behavior. The point is not to inspect model internals or chase statistical metrics. The point is to produce evidence that a governance forum, made of real people with real judgment, can actually interpret and act on.
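As a rough illustration of what forum-legible evidence could look like, the sketch below condenses a log of agent decisions into a short summary a governance forum could read. The record fields, the three outcome labels, and the sample records are hypothetical placeholders, not findings or a prescribed format.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionRecord:
    """One observed agent decision; field names are illustrative."""
    action: str
    outcome: str  # assumed labels: "within_mandate", "escalated", "overridden_by_human"
    note: str = ""


def summarize_for_forum(records):
    """Condense raw decision records into a pattern summary for human review."""
    counts = Counter(r.outcome for r in records)
    total = len(records)
    # Anything not clearly within mandate is surfaced for human judgment.
    needs_review = [r for r in records if r.outcome != "within_mandate"]
    return {
        "total_decisions": total,
        "by_outcome": dict(counts),
        "share_needing_review": (len(needs_review) / total) if total else 0.0,
        "needs_review": needs_review,
    }


# Made-up records, for illustration only.
records = [
    DecisionRecord("issue_refund", "within_mandate"),
    DecisionRecord("issue_refund", "within_mandate"),
    DecisionRecord("waive_fee", "escalated", "amount above usual range"),
    DecisionRecord("issue_refund", "overridden_by_human", "customer already refunded"),
]
summary = summarize_for_forum(records)
```

The summary reports observable behavior and consequences only. Whether an overridden decision actually signals drift remains a judgment for the forum, not for the code.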

The third movement is authority calibration. This is where the methodology departs most clearly from the binary thinking that dominates most AI governance conversations, the assumption that an agent is either trusted or not, either supervised or autonomous, either approved or blocked. ALiEn treats authority as something that can be adjusted proportionally based on evidence. Where alignment with mandate holds steady over time, autonomy can expand. Where drift appears, authority contracts. The levers available for that calibration are practical ones: permissions, autonomy tiers, tool constraints, workflow gates, escalation thresholds, and adjustments to the delegation charter itself. The goal is not permanent oversight, which exhausts organizations, and not blind trust, which eventually betrays them, but a calibrated middle ground that stays legible and auditable under pressure.
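One way to picture proportional calibration is as movement along a ladder of autonomy tiers. The sketch below assumes four tiers and two thresholds on a "drift rate" (here, the share of reviewed decisions a forum judged off mandate). The tier names, the thresholds, and the one-step-at-a-time rule are all assumptions made for illustration.

```python
# Hypothetical autonomy ladder, from least to most authority.
TIERS = ["suggest_only", "act_with_approval", "act_then_report", "act_autonomously"]


def calibrate(current_tier, drift_rate, expand_below=0.02, contract_above=0.10):
    """Move at most one tier per review cycle, based on observed drift.

    drift_rate: share of reviewed decisions judged off mandate (0.0 to 1.0).
    The default thresholds are placeholders, not recommended values.
    """
    i = TIERS.index(current_tier)
    if drift_rate > contract_above:
        i = max(i - 1, 0)               # drift appears: authority contracts
    elif drift_rate < expand_below:
        i = min(i + 1, len(TIERS) - 1)  # alignment holds: autonomy expands
    # Between the two thresholds, authority stays where it is.
    return TIERS[i]
```

Moving one tier at a time keeps each adjustment small and easy to audit. In practice the decision to move would sit with a governance forum, with a rule like this serving only as a prompt for that discussion.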

The fourth movement is mandate evolution. This is perhaps the most important, and the most easily overlooked.

The mandate an organization starts with is never complete. It cannot be, because the whole reason delegation exists is that work is too uncertain to specify in advance. New edge cases will surface, ambiguities will be exposed, and the agent itself will sometimes force tensions into the open that the organization has been carrying quietly for years. A good governance methodology does not treat those moments as failures. It treats them as data, as fuel for refining the organization's own understanding of what legitimate authority looks like in that domain. The mandate gets versioned, updated, and improved, not because something went wrong, but because governance is a craft that improves with practice.
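Versioning a mandate can be sketched as an append-only history in which every amendment records why it was made. The class and method names below are hypothetical, and the rules are plain strings purely to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MandateVersion:
    number: int
    rules: tuple
    rationale: str  # what was learned that prompted this version


class MandateHistory:
    """Append-only record of how a mandate evolved (illustrative sketch)."""

    def __init__(self, initial_rules):
        self._versions = [MandateVersion(1, tuple(initial_rules), "initial charter")]

    @property
    def current(self):
        return self._versions[-1]

    def amend(self, add=(), remove=(), rationale=""):
        # Every change must explain itself, so the history stays auditable.
        if not rationale:
            raise ValueError("an amendment needs a rationale")
        kept = tuple(r for r in self.current.rules if r not in remove)
        version = MandateVersion(self.current.number + 1, kept + tuple(add), rationale)
        self._versions.append(version)
        return version

    def at(self, number):
        # Earlier versions are never overwritten, only superseded.
        return self._versions[number - 1]


# Made-up example of an edge case refining the mandate.
history = MandateHistory(["refunds up to 200 need no approval"])
history.amend(
    add=["refunds for duplicate charges need no approval at any amount"],
    rationale="edge case: agent escalated obvious duplicate charges",
)
```

Because earlier versions are kept, a forum reviewing a past decision can check it against the mandate as it stood at the time, not as it stands now.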

Design Science Research, and why it matters here

The way ALiEn will be built is as important as the methodology itself, and the research approach chosen for this work is called Design Science Research.

Design Science Research is a tradition within information systems and organizational research that is built for exactly this kind of problem: situations where the goal is not just to understand a phenomenon, but to create something useful that addresses it. It is the research equivalent of building a bridge while studying the river, where understanding the current and building the crossing are not separate activities but deeply connected ones.

The logic of Design Science Research moves through a deliberate sequence. It begins with a problem that is grounded in practice, not invented in a lab. It draws on existing theory to understand the nature of that problem and to define what a solution would need to do. It then produces an artifact, in this case a methodology, that is designed to address the problem in a principled way. And critically, it evaluates that artifact against the problem, refining it through iteration until it can demonstrate that it does what it claims to do.

What makes this approach especially appropriate for the ALiEn project is that it refuses to separate diagnosis from design. The qualitative research that comes first (interviews with practitioners, analysis of governance artifacts, thematic coding of detection and correction episodes) is not just an academic exercise that precedes the real work. It is the foundation from which design requirements are derived. The breakdowns and successes that surface in the field become the specification for what the methodology must be able to do. ALiEn does not arrive pre-formed and then get validated. It emerges from evidence, gets shaped by what practitioners actually experience, and is refined through expert review and scenario testing until it earns its place.

That is an important distinction, because it is the difference between a methodology that is invented and one that is discovered. The ambition here is the latter.

How the research will unfold

The research program follows a sequence that is tightly aligned to this logic.

The first phase is the qualitative diagnosis. It consists of semi-structured interviews with the people closest to the work: product owners, governance and risk professionals, engineers, and business owners who depend on what the agent produces. The focus is on episodes, real moments where an AI agent exercised delegated discretion and something happened. Someone noticed a deviation. Someone made a judgment call about whether the agent's behavior was acceptable. Someone pulled a correction lever, or chose not to. The goal is to understand how detection and correction actually work in the wild: what signals people look for, what evidence they trust, what forums they use to adjudicate, and what levers they reach for when something needs to change.

Alongside those interviews, governance artifacts will be collected as secondary data: agent charters, escalation playbooks, incident records, evaluation rubrics, policy excerpts, and audit traces. These are the documents that carry institutional intent, and they often reveal as much about how governance actually operates as the interviews themselves.

The second phase is where design requirements are derived. The patterns and breakdowns that emerge from the qualitative work are translated into explicit requirements for what an operational mandate drift methodology must be able to do. This is the bridge between understanding the problem and building the response.

The third phase is the design of ALiEn itself, drawing on those requirements to produce a methodology with concrete artifacts, evidence routines, calibration protocols, and mandate evolution practices. The methodology is not expected to arrive fully formed. It is expected to arrive defensibly, meaning every design choice can be traced back to a requirement, and every requirement can be traced back to evidence.
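The traceability claim can be expressed as a simple check: every design choice should cite at least one requirement, and every requirement should cite at least one piece of evidence. The sketch below assumes two plain dictionaries as input. The function name and the identifiers in the example are made-up placeholders, not actual requirements or interview data.

```python
def find_traceability_gaps(design_choices, requirements):
    """Return human-readable gaps in the chain choice -> requirement -> evidence.

    design_choices: maps a design choice to the requirement ids it cites.
    requirements:   maps a requirement id to the evidence ids behind it.
    """
    gaps = []
    for choice, req_ids in design_choices.items():
        if not req_ids:
            gaps.append(f"design choice '{choice}' cites no requirement")
        for req_id in req_ids:
            if req_id not in requirements:
                gaps.append(f"design choice '{choice}' cites unknown requirement '{req_id}'")
    for req_id, evidence_ids in requirements.items():
        if not evidence_ids:
            gaps.append(f"requirement '{req_id}' cites no evidence")
    return gaps


# Placeholder identifiers, for illustration only.
design_choices = {
    "delegation charter": ["R1"],
    "autonomy tiers": ["R2"],
    "weekly digest": [],
}
requirements = {"R1": ["interview-A"], "R2": []}
gaps = find_traceability_gaps(design_choices, requirements)
```

An empty list would mean the design is fully traced under this narrow definition. It would say nothing about whether the evidence is strong or the requirement well derived.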

The fourth phase is evaluation. Given the realities of access and timing, the most defensible near-term evaluation is scenario-based and expert-informed, using realistic governance scenarios to stress-test the methodology's components, supplemented by limited operational pilots where feasible. The central capability under test is drift detectability: whether the methodology can produce evidence that a governance forum can use to detect, interpret, and respond to mandate drift before it becomes invisible or irreversible.

What this methodology is not claiming

Intellectual honesty requires being clear about boundaries.

ALiEn is not claiming that organizational mandate can be fully reduced to rules. Mandate is inherently interpretive, and the methodology is designed to work with that reality rather than pretend it away. It is not claiming that monitoring alone solves governability, because monitoring without interpretation is just surveillance, and surveillance without judgment is just noise. It is not claiming that the first version of a delegation charter will be complete, because completeness is not a realistic starting point for governance of genuinely uncertain work.

What it is claiming is more modest and more durable: that to integrate AI agents into organizational life without destabilizing governance, enterprises need versioned, testable, and auditable mandate artifacts, combined with evidence routines that make drift visible, and correction mechanisms that allow authority to be calibrated proportionally over time.

The methodology will also carry boundary conditions. It will likely behave differently in highly regulated domains than in lightly regulated ones, in high-stakes decisions than in low-stakes ones, in organizations with mature governance documentation than in those where governance is still informal. The degree of tool autonomy, the quality of available telemetry, and the organization's appetite for transparency will all shape how the methodology lands in practice. Those boundaries are not weaknesses. They are the mark of a methodology that takes context seriously.

Why this is being shared now

Everything described in this chapter is preliminary. The qualitative research has not yet been conducted. The design requirements have not yet been derived. The methodology itself exists as a set of early constructs and design intuitions, grounded in years of practitioner experience and sharpened by academic supervision, but not yet validated by the kind of evidence that earns real confidence.

It is being shared now because the most useful feedback tends to arrive before the cement sets. Practitioners who have lived through the moments this research is trying to understand (the moment something felt off mandate, the governance decision that was harder than it should have been, the correction lever that worked in theory but failed in practice) can see things in this early design that the research has not yet imagined.

Researchers who work in adjacent areas, such as organizational control, information systems design, AI governance, and sociotechnical systems, can strengthen the theoretical spine and challenge the assumptions that might otherwise travel unchecked into the final design.

This methodology is not being presented as an answer. It is being presented as a carefully structured attempt to earn one, through evidence, through iteration, and through the kind of honest engagement with complexity that governance of artificial agency genuinely requires.