Roast & Rise

Riseplan from Roast & Rise

AI Agent Control Loop for Work

Build a durable operating loop for practical workplace AI agents.

Learn to move from chat assistants to actionable AI agents by setting up control loops, approval ladders, permission maps, and evaluation scorecards. This plan gives you the operating routines, outputs, and rollout strategy to pilot agents with confidence - and proof.

The agent control loop: from risky unknowns to a mapped, visible operating flow. Not a magic trick - a system you can actually see, touch, and steer.

Course thesis

A safe, effective transition to multi-tool AI agents requires a practical control loop: clear delegation, permission mapping, oversight, logging, and a 30-day pilot that makes the agent useful without losing control.

What you leave with

By the end, you will be able to diagnose good use cases, set permission boundaries, run pilots with controlled delegation, measure agent impact, and maintain oversight - so your team can use AI agents safely and productively from day one.

For

Founders, operators, AI product owners, managers, and cross-functional business/technical teams piloting agentic work automation.

Workflow

Piloting workplace AI agents for multi-step task automation across files, browsers, and business tools, using a defined control loop.

Change

From ad hoc chat assistant use to operating an agent loop: defining tasks, gating permissions, documenting runs, and auditing agent actions to prove both capability and control.

What you can do

Use these as checks while you move through the plan.

Distinguish agents from chatbots and select real agent use cases

Set up permission maps and approval paths for agent actions

Run and log agent tasks with auditable records and simple fallback plans

Evaluate and escalate agent outputs, intervening when needed

Plan, execute, and measure a 30-day AI agent pilot with clear safety and utility criteria

Chapters

01

Chapter 1: Diagnose - Where an Agent Makes Sense (and Where It Doesn't)

Define what a practical workplace AI agent is (and isn't). Use the Task Delegation Rubric to shortlist and score real, bounded tasks for agent delegation - avoiding risky guesswork, and giving yourself a focused, viable agent use case to pilot.

Sifting the real from the risky: every agent starts with a scored, explainable task - no wishlists, no blanks.

Why this matters in the workflow

If you skip diagnosis, you give your agent a wish list, not a job. Not everything that can be automated should be. The cost of a misstep isn't just wasted time - it's risk, noisy runs, and loss of trust in the team or tool. The right agent pilot starts with the right assignment: a task that is structured, bounded, and delivers real value without asking the agent to perform magic or take uncontrolled action.

OpenAI's GPT-5.5 is built for action, not just conversation. But as the Stanford AI Index 2026 makes clear, capability growth is outpacing governance. The work is in control, not speculation. Getting this diagnosis right stops you from delegating the wrong things or using a chatbot when you need a real agent. [1]

The working model

Quality checklist

Every rubric question is answered (no blanks or hand-waves)

Task is a real, live workflow - no hypothetical wishlists

Unclear criteria are explained (why Y/N)

Summary shows a real go/no-go decision

Output lets others review and challenge your choice

Common mistakes

Marking 'yes' without evidence - e.g., calling a task 'well-bounded' with vague steps

Choosing a made-up task rather than a real, running workflow

Skipping fallback/rollback thinking

Delegating sensitive or poorly understood tasks

Not differentiating between chat and action work

Checkpoint

Does your scored rubric and decision summary show a real, bounded agent task - ready for permission mapping?

Exercise

Task Delegation Rubric - Score Your Candidate Agent Task

You can do this in 15 minutes.

  1. Identify one real workflow your team runs today (e.g., onboarding, reporting, invoice processing).
  2. Write out the exact steps and systems involved.
  3. Using the rubric template, score the task on the following dimensions:
     • Multi-step process?
     • Actions span multiple tools/systems?
     • Well-structured with clear rules?
     • Well-bounded (not open-ended)?
     • Measurable outcome?
     • Clear rollback/fallback path?
     • Safe for pilot (no sensitive, legal, or high-risk data)?
  4. For each, mark Y/N and add a note for any that aren't a clear yes.
  5. If your task has more than one unclear/no, pick another. Attach your scored rubric and a 2-line summary of your decision.
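
The scoring rule in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the course's rubric template: the dimension keys, the `Answer` type, the `"?"` marker for unclear answers, and the invoice example are all assumptions made for the sketch. The only rules taken from the exercise are that every question gets an answer, anything short of a clear yes needs a note, and more than one unclear/no means picking another task.

```python
from dataclasses import dataclass

# The seven rubric dimensions from the exercise (key names are illustrative).
RUBRIC_DIMENSIONS = [
    "multi_step",
    "spans_multiple_tools",
    "well_structured",
    "well_bounded",
    "measurable_outcome",
    "rollback_path",
    "safe_for_pilot",
]


@dataclass
class Answer:
    verdict: str    # "Y", "N", or "?" for unclear
    note: str = ""  # required for anything that is not a clear yes


def score_task(answers: dict[str, Answer]) -> dict:
    """Apply the exercise rule: more than one unclear/no means pick another task."""
    missing = [d for d in RUBRIC_DIMENSIONS if d not in answers]
    if missing:
        raise ValueError(f"Rubric has blanks: {missing}")
    not_yes = [d for d in RUBRIC_DIMENSIONS if answers[d].verdict != "Y"]
    unexplained = [d for d in not_yes if not answers[d].note.strip()]
    if unexplained:
        raise ValueError(f"Add a note explaining: {unexplained}")
    return {
        "yes_count": len(RUBRIC_DIMENSIONS) - len(not_yes),
        "flagged": not_yes,
        "decision": "go" if len(not_yes) <= 1 else "no-go",
    }


# Hypothetical candidate task: invoice processing with one unclear answer.
invoice_task = {d: Answer("Y") for d in RUBRIC_DIMENSIONS}
invoice_task["rollback_path"] = Answer("?", "Refunds are manual; no tested undo step yet")
result = score_task(invoice_task)
print(result["decision"], result["flagged"])
```

Raising an error on blanks or unexplained answers mirrors the quality checklist: a rubric with gaps should not produce a decision at all.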

Use this at work tomorrow

Apply the rubric to one real process your team actually runs. Look for the honest 'no' answers first: under the rubric, more than one unclear or 'no' answer means the task is not agent-ready.

02

Chapter 2: Design - Permission Maps and Approval Ladders

Turn system access and workflow steps into a clear map: What your agent can do, who says yes, and where escalation kicks in.

Every control loop is a live map - permissions, approvals, and human stopgaps marked out step by step.

Why this matters in the workflow

Control is the make-or-break line between a useful agent pilot and business chaos. Modern agents act across files, apps, and browsers - sometimes in ways no single team member sees. The EU AI Act (2026) and live operator experience point the same way: document what the agent is allowed to do, who approves it, and how exceptions are caught. No permission map, no trust. No approval ladder, no rescue when things run off course.

Assume every action an agent takes could impact data, compliance, or money. That makes your permission and approval design non-optional - and your safety net when pilots go live.

The working model

Quality checklist

All tools and actions listed for the agent task

Minimum (least privilege) permissions set for each tool/file

Approval points clearly marked in the sequence

Escalation/rollback decision points are explicit, not vague

Checklist answers the practical 'what if' failures

Everything fits on one clear diagram/checklist - screenshot/table OK

Common mistakes

Granting agents too-broad permissions (full admin, blanket write)

Missing approval checkpoints early in workflow

Forgetting to add rollback/escape paths

Assuming all steps need equal approval - overloading humans

Treating the map as static and forgetting to update it when the workflow changes

Checkpoint

Can you show, on one page, exactly what your agent will do, who approves each step, and when it escalates?

Exercise

Draw Your Agent Permission Map and Approval Ladder

In 15 minutes, produce a live diagram and checklist for one candidate agent task.

Steps:

  1. Choose one concrete agent task from your workflow (e.g., updating billing data in accounting software, posting reports, or triaging support tickets).
  2. Draw your Permission Map: list all tools/files involved. For each, note the minimum permission (read, write, send, update, delete, etc.) the agent truly needs to complete the flow.
  3. Sketch your step sequence - what happens, start to finish?
  4. Mark approval points for each agent action. Who needs to review or decide - agent alone, a human, or on exception?
  5. Highlight your escalation/rollback triggers. When does the agent stop or hand over to a human?
  6. Save your combined diagram and checklist in a team folder.
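
One way to make the map and ladder from these steps checkable is to write them down as data. The sketch below is an illustration under stated assumptions, not a prescribed format: the tool names, the three approval levels, and the `check_action` helper are hypothetical. It shows least-privilege permissions per tool, an approval rule per action, and escalation as the default for anything the map does not cover.

```python
from enum import Enum


class Approval(Enum):
    AGENT_ALONE = "agent alone"
    ON_EXCEPTION = "human on exception"
    HUMAN_REQUIRED = "human before action"


# Permission map: tool -> minimum set of actions the agent needs (least privilege).
PERMISSION_MAP = {
    "accounting_app": {"read", "update"},
    "shared_drive": {"read"},
    "email": {"send"},
}

# Approval ladder: (tool, action) -> who decides.
APPROVAL_LADDER = {
    ("accounting_app", "read"): Approval.AGENT_ALONE,
    ("accounting_app", "update"): Approval.HUMAN_REQUIRED,
    ("shared_drive", "read"): Approval.AGENT_ALONE,
    ("email", "send"): Approval.ON_EXCEPTION,
}


def check_action(tool: str, action: str) -> str:
    """Return 'allow', 'needs_approval', or 'escalate' for a proposed agent action."""
    if action not in PERMISSION_MAP.get(tool, set()):
        return "escalate"  # outside the map: stop and hand over to a human
    rung = APPROVAL_LADDER.get((tool, action))
    if rung is None:
        return "escalate"  # permitted but no approval rule: treat as a map gap
    if rung is Approval.HUMAN_REQUIRED:
        return "needs_approval"
    # AGENT_ALONE and ON_EXCEPTION proceed; exceptions surface through escalation.
    return "allow"


def ladder_gaps() -> list[tuple[str, str]]:
    """List permitted actions that have no approval rule yet."""
    return [
        (tool, action)
        for tool, actions in PERMISSION_MAP.items()
        for action in sorted(actions)
        if (tool, action) not in APPROVAL_LADDER
    ]


print(check_action("accounting_app", "update"))
print(check_action("accounting_app", "delete"))
```

Defaulting to `escalate` for unmapped tools or actions keeps the failure mode conservative: a hole in the map stops the agent rather than letting it through.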

Template and review rubric below.

Use this at work tomorrow

Map one real agent workflow before launch - spot your permission holes and fix your approval ladder in 15 minutes, before it costs you.

03

Chapter 3: Practice - Run Logging and Oversight in Action

This chapter shows you how to implement live, transparent run logging and human oversight as you put your AI agent to work. You'll use a run log and escalation matrix to capture every agent action, decision, and intervention point. By the end, you'll have a complete, auditable record that proves control and sets up fast recovery if things go wrong.

Nothing hidden: live agent operations get logged, traced, and surfaced for human review at every step.

Why this matters in the workflow

You don't control what you don't log. Frontier agents take multi-step actions at computer speed, across files, emails, and business tools. If you can't trace what happened - step by step - you lose both trust and the ability to recover from surprises. EU AI Act obligations (2026) put this into hard law for many markets: logging, traceability, and oversight aren't optional; they are operational essentials.

A run log does more than tick compliance boxes. When your agent hits a strange case (a permissions bug, an output nobody expected) you need to see exactly what happened: which inputs, which actions, who got notified, whether escalation worked. Oversight isn't a ritual - it's an operating loop that lets your team stay in the driver's seat, even as the agent scales up.

The working model

Quality checklist

All agent actions recorded with timestamp and outcome

Escalation actions clearly logged with responsible human and responses

Human review and sign-off included after run

Run log present in shared, accessible location

Escalation matrix used for every unexpected or blocked event

Common mistakes

Forgetting to log minor or failed agent actions

Using vague escalation steps ("someone should check")

Not updating escalation contacts - leading to dead ends

Treating oversight as optional; skipping final review

Letting agent outputs go unreviewed before external actions

Checkpoint

Can you produce a fully logged and reviewed agent run - including every escalation and the human sign-off?

Exercise

Build and Use an Agent Run Log With Escalation

Objective: Run a real or dry-run agent task, log each step, and use the escalation matrix whenever there is a block, error, or ambiguity. The output is a complete run record.

Steps:
  1. Pick one agent task you mapped earlier (real, sandbox, or tabletop scenario).
  2. Copy the template below into your team doc or notepad.
  3. As the agent (or you, in simulation) works through each step, fill out the log in real time - do not skip error or escalation events.
  4. Consult the escalation matrix for each block/uncertainty; follow and record the protocol exactly.
  5. When done, have a human review and sign off on the log.
  6. Share for team review, or - if running a real agent - archive for compliance/audit.

Output: A complete, team-readable agent run log and escalation record, capturing the workflow, escalation, outcome, and final review.
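
For teams that prefer a structured log over a shared doc, the run record described above could look roughly like the following. This is a sketch, not the course's log template: the field names, the outcome values, the contacts in `ESCALATION_MATRIX`, and the example task are assumptions. It records every step with a timestamp and outcome, routes failed or blocked steps to a named contact, and refuses sign-off while an escalation has no recorded response.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class LogEntry:
    step: str
    outcome: str            # "ok", "failed", or "blocked"
    escalated_to: str = ""  # named contact, if the step was escalated
    response: str = ""      # what that person decided or did
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


# Hypothetical escalation matrix: outcome -> named contact.
ESCALATION_MATRIX = {"failed": "ops-lead", "blocked": "workflow-owner"}


class RunLog:
    def __init__(self, task: str):
        self.task = task
        self.entries: list[LogEntry] = []
        self.signed_off_by = ""

    def record(self, step: str, outcome: str, response: str = "") -> LogEntry:
        """Log one step; failed or blocked steps are routed via the matrix."""
        contact = ESCALATION_MATRIX.get(outcome, "")
        entry = LogEntry(step, outcome, escalated_to=contact, response=response)
        self.entries.append(entry)
        return entry

    def sign_off(self, reviewer: str) -> None:
        """Human review closes the run only when every escalation has a response."""
        open_items = [e for e in self.entries if e.escalated_to and not e.response]
        if open_items:
            raise ValueError("Unresolved escalations; record the response first")
        self.signed_off_by = reviewer

    def to_json(self) -> str:
        return json.dumps(
            {
                "task": self.task,
                "signed_off_by": self.signed_off_by,
                "entries": [asdict(e) for e in self.entries],
            },
            indent=2,
        )


# Tabletop run of a hypothetical task.
log = RunLog("weekly report posting")
log.record("fetch source data", "ok")
log.record("post report", "blocked", response="Owner granted access; step retried")
log.sign_off("reviewer-name")
print(log.to_json())
```

The JSON output is what would be shared for team review or archived; failed and blocked steps sit in the same list as successful ones so nothing drops out of the record.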

Use this at work tomorrow

Copy the log template, run your agent task, and document every step and escalation - see exactly where control matters.

04

Chapter 4: Evaluate - Proof and Measurement of Control

Score your agent's task performance, risk management, and recovery using an Eval Scorecard and Audit Checklist. Capture agent utility, error handling, and control evidence in an auditable form.

Post-run review: every agent action scored, risks surfaced, and recovery checked - before scaling responsibility.

Why this matters in the workflow

Every agent run is a risk, not just a sprint. Regulators, executives, and teammates all ask the same question: Did the agent do real work, or did it hide behind plausible nonsense?

When you don't evaluate - when you skip the logs, ignore escalations, or wave away errors - you lose proof of both control and progress. And you build nothing you can trust, scale, or defend.

The solution: a plain, fast, repeatable agent run review. Score what mattered. Check what the agent did vs. what was allowed. Document where the system needed rescue.

Quality checklist

Every scored item is filled honestly: not just wins.

All agent actions and permission links are explicit in the checklist.

Escalations, rollbacks, and human interventions are clearly described.

No unexplained or unlogged actions remain.

The review is saved/stored with the pilot record for future audits.

Common mistakes

Leaving run logs unreviewed, especially after failures or errors.

Only reviewing success cases, not documenting risk or escalation.

Confusing agent autonomy with agent reliability - missing approval step failures.

Relying on memory rather than documented run evidence.

Filling the checklist as a formality, without team discussion of the results.

Checkpoint

Can you show a scored agent run, with risks and human interventions, and link every action to a permission, escalation, or policy?

Exercise

Complete an Eval Scorecard and Audit Checklist on a Recent Agent Run

In the next 15 minutes:

  1. Pull an agent run log from your pilot (or use a recent test run if the pilot hasn't started).
  2. Fill in the Eval Scorecard: score task completion, risk events, and recovery/rollback quality.
  3. Walk through the Audit Checklist for that run: are the right permissions, escalations, and human reviews documented?
  4. Note any gaps, risks, or points where oversight failed - or proved essential.
  5. Save the review for the pilot record. Discuss the results with one teammate if possible.

This output becomes the proof of agent control, directly usable with leadership or audit teams.
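
As a rough illustration of the review above, the function below scores one logged run and checks each action against a permission map. The entry fields (`tool`, `action`, `outcome`, `recovered`, `reviewed_by`), the metric names, and the pass rule are assumptions for this sketch; the actual Eval Scorecard and Audit Checklist may weigh things differently.

```python
def audit_run(entries: list[dict], permission_map: dict[str, set[str]]) -> dict:
    """Score one logged run and flag actions that fall outside the permission map."""
    total = len(entries)
    completed = sum(1 for e in entries if e["outcome"] == "ok")
    risk_events = [e for e in entries if e["outcome"] != "ok"]
    recovered = sum(1 for e in risk_events if e.get("recovered"))
    out_of_scope = [
        e for e in entries
        if e["action"] not in permission_map.get(e["tool"], set())
    ]
    unreviewed = [e for e in risk_events if not e.get("reviewed_by")]
    return {
        "completion_rate": completed / total if total else 0.0,
        "risk_events": len(risk_events),
        "recovery_rate": recovered / len(risk_events) if risk_events else 1.0,
        "out_of_scope_actions": out_of_scope,
        "unreviewed_risk_events": unreviewed,
        # Hypothetical pass rule: nothing outside the map, nothing left unreviewed.
        "audit_pass": not out_of_scope and not unreviewed,
    }


# Hypothetical permission map and run log.
permission_map = {"accounting_app": {"read", "update"}, "email": {"send"}}
run = [
    {"tool": "accounting_app", "action": "read", "outcome": "ok"},
    {"tool": "accounting_app", "action": "update", "outcome": "failed",
     "recovered": True, "reviewed_by": "ops-lead"},
    {"tool": "accounting_app", "action": "update", "outcome": "ok"},
    {"tool": "email", "action": "send", "outcome": "ok"},
]
scorecard = audit_run(run, permission_map)
print(scorecard["completion_rate"], scorecard["audit_pass"])
```

Note that a failed step does not fail the audit by itself here; what fails it is an action with no permission behind it or a risk event nobody reviewed.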

Use this at work tomorrow

Pull your next agent run log and walk it through the scorecard. One real review is worth more than ten dashboards.

05

Chapter 5: Implement - Pilot and Scale with a 30-Day Plan

Run a controlled 30-day agent pilot that applies your control loop - logging, permissions, approvals, escalation - so you collect results, measure impact, and decide to scale, adapt, or retire the agent.

Why this matters in the workflow

A live agent pilot is the final exam. All the design - tasks, permissions, logs, approvals - means nothing if a pilot isn't run clean, logged tight, and reviewed with intent. Vendor slides fade. Run data lasts. A 30-day window is short enough to control, long enough to see if the agent is more than a demo trick.

Pilots clarify what the agent can do, where it breaks, and how your team actually steers oversight. EU AI Act rules on transparency, logging, and oversight aren't theory; they shape the real pilot: no logs, no audit, no license to keep scaling.

The working model

Quality checklist

Goals and scope of agent task are specific and mapped to a real workflow

All permissions, forbidden actions, and approvals are concrete and tied to outputs

Escalation and rollback procedures are clearly described and actionable

Weekly review is scheduled with named participants and exact check criteria

Success criteria and pause/stop triggers are measurable, realistic, and clear

Log structure matches compliance and traceability needs

Final review owner is named and accountable

Common mistakes

Skipping weekly reviews or pilot checkpoints - loses control

Setting airy goals like 'improve productivity' with no anchor

Letting agents operate with permissions not documented in plan

Not defining clear rollbacks for agent mistakes

Leaving the final review/decision owner undefined

Checkpoint

Does your 30-day pilot plan specify concrete agent tasks, permissions, approvals, routine reviews, and a go/no-go decision with a named owner?

Exercise

Draft Your 30-Day Agent Pilot Plan

Your task: In 15 minutes, use the template to draft your real 30-Day Agent Pilot Plan. Pick one workflow, set concrete goals, detail agent permissions, specify review points and escalation paths, and document how you'll decide the outcome.

Steps:

  1. Identify one workflow and the agent task to automate (use your Task Delegation Rubric).
  2. List the agent's explicit permissions and forbidden actions (reference your Permission Map).
  3. Describe approval steps and escalation/rollback routines (use your Approval Ladder).
  4. Set weekly review meetings: who attends, what's checked?
  5. State pilot success criteria and what triggers pause or rollback.
  6. Define log structure and how evidence will be collected.
  7. Decide: who owns the final go/no-go review at day 30?
  8. Write the plan in the fields below. Share it with your team for quick feedback.

You should finish with a complete, realistic plan ready to pilot in your work setting.
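
If it helps to sanity-check a draft, the plan fields from the steps above can be captured as a small data structure with a completeness check and a day-30 decision rule. Everything in this sketch is an assumption made for illustration: the field names, the metric names, the thresholds, and the pause/adapt/scale rule are hypothetical and should be replaced with your own criteria.

```python
from dataclasses import dataclass


@dataclass
class PilotPlan:
    workflow: str
    permissions: dict[str, set[str]]
    forbidden_actions: list[str]
    weekly_reviewers: list[str]
    success_criteria: dict[str, float]  # metric name -> minimum target
    pause_triggers: dict[str, float]    # metric name -> maximum tolerated value
    final_review_owner: str = ""
    duration_days: int = 30

    def gaps(self) -> list[str]:
        """List plan fields that are still empty."""
        problems = []
        if not self.workflow.strip():
            problems.append("workflow is not named")
        if not self.permissions:
            problems.append("no permissions listed")
        if not self.weekly_reviewers:
            problems.append("weekly review has no named participants")
        if not self.success_criteria:
            problems.append("no measurable success criteria")
        if not self.pause_triggers:
            problems.append("no pause/stop triggers")
        if not self.final_review_owner.strip():
            problems.append("final review owner is not named")
        return problems


def day_30_decision(plan: PilotPlan, observed: dict[str, float]) -> str:
    """Hypothetical rule: pause on any trigger, adapt on a missed target, else scale."""
    for metric, limit in plan.pause_triggers.items():
        if observed.get(metric, 0) > limit:
            return "pause"
    for metric, target in plan.success_criteria.items():
        if observed.get(metric, 0) < target:
            return "adapt"
    return "scale"


plan = PilotPlan(
    workflow="invoice processing",
    permissions={"accounting_app": {"read", "update"}},
    forbidden_actions=["delete records", "send external email"],
    weekly_reviewers=["ops-lead", "finance-manager"],
    success_criteria={"logged_run_rate": 1.0, "task_completion_rate": 0.9},
    pause_triggers={"out_of_scope_actions": 0},
    final_review_owner="ops-lead",
)
observed = {"logged_run_rate": 1.0, "task_completion_rate": 0.85, "out_of_scope_actions": 0}
print(plan.gaps(), day_30_decision(plan, observed))
```

Checking pause triggers before success criteria reflects the ordering the plan implies: control failures stop the pilot regardless of how useful the agent was.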

Use this at work tomorrow

Draft your 30-Day Agent Pilot Plan for one real workflow, run a dry review with your team, and be ready to pilot - in less than an hour from now.

30-day path

Week 1: Use the rubric to select your agent use case. Align the team on what is and isn't an agent.

Week 2: Map permissions and set up your approval ladder. Run a tabletop scenario.

Week 2-3: Build your run log template and train agents/humans to use it.

Week 3: Launch a real or sandboxed agent task. Use escalation and fallback if needed.

Week 4: Complete self-audit using scorecard/checklist. Capture lessons and define pilot outcomes (scale, adapt, stop, or try a new workflow).

Success signals

Agent task selection meets rubric criteria and doesn't include risky or poorly defined jobs.

All agent actions and permissions are mapped and have explicit human approval gates.

100% of agent runs during the pilot are logged and auditable.

Escalations and rollbacks are documented and handled quickly.

Pilot outcomes (successes/failures) are tracked, and next steps are clear and actioned within 30 days.

Reflection prompts

Where does this topic show up in real work?

What behavior should change first?

What evidence would prove this Riseplan worked?

Manager checklist

Choose one owner for the behavior change.

Use the exercise on live work.

Review the output before scaling the habit.

Decide what changes after 30 days.
