Kill switch

The kill switch is the emergency brake. When something has gone wrong — an attack in progress, an agent misbehaving, a regulatory inspection, an integration sending bad data — you need to stop everything immediately, with no questions asked. The kill switch does that.

When to use it

The kill switch is for situations where you can’t tolerate any more agent activity until the problem is understood. Examples:
  • Active security incident — credentials may have leaked, agents may be acting on attacker-injected prompts
  • Runaway behavior — agents are looping, spending money, or generating output faster than you can review
  • Regulatory or audit event — a regulator is on-site and you need to demonstrate that everything is paused
  • Integration failure — your data source returned bad data and you’d rather pause than have agents act on it
  • Configuration mistake — a policy change accidentally allowed something that shouldn’t be allowed; pause until you fix it
It’s not for routine policy enforcement. If you’re using the kill switch regularly, the underlying policy is probably wrong and should be tightened instead.

How to activate

On the dashboard home page, click the red “Halt All Agents” button. A confirmation modal opens requiring you to:
  1. Type the word HALT to confirm intent (prevents accidental clicks — the button stays disabled until you type the exact string)
  2. Provide a reason for the halt (stored on the org record)
Once confirmed, the kill switch activates immediately. No further agent activity proceeds. The button is only visible to users with the admin role. Approvers and developers can’t activate the kill switch.
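The confirmation rules above can be summarized as a small predicate. This is only an illustrative sketch, not ProofRail's dashboard code: the function name `can_confirm_halt` and the role strings are hypothetical, and treating an empty reason as invalid is an assumption (the page only says a reason must be provided).

```python
def can_confirm_halt(role: str, typed_text: str, reason: str) -> bool:
    """Return True when the halt confirmation should be accepted.

    Hypothetical helper that mirrors the rules described above.
    """
    # Only admins see the button; approvers and developers cannot activate.
    if role != "admin":
        return False
    # The typed text must match the exact string, including case.
    if typed_text != "HALT":
        return False
    # Assumption: a blank reason is rejected.
    return bool(reason.strip())
```

Requiring an exact, case-sensitive match is what makes the check useful against accidental clicks: a stray keystroke or autocomplete is unlikely to produce `HALT`.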

What happens to your agents

Once activated:

New chains: any chain opened with proofrail.Chain(...) receives a ProofRailKillSwitchError as soon as it tries to record its first event. Your application sees:
import proofrail
from proofrail.exceptions import ProofRailKillSwitchError


async def run_workflow():
    try:
        async with proofrail.Chain("any-workflow") as chain:
            await chain.record_agent_action(...)
    except ProofRailKillSwitchError as exc:
        # The org-wide kill switch is active: show a maintenance message
        print(f"Agents paused: {exc.reason}")
        print(f"Organization: {exc.organization_id}")
ProofRailKillSwitchError is distinct from ActionDeniedError so your application can handle it differently, typically with a “maintenance mode” message rather than a policy denial.

In-flight chains: any chain currently running stops at its next event. The next record_agent_action raises ProofRailKillSwitchError regardless of what the policy would otherwise have decided. The chain is sealed at the point of interruption, and a partial receipt is generated.

Pending approvals: approval requests already in the dashboard still resolve normally if an approver acts before they’re explicitly canceled. New approval requests can’t be created while the kill switch is active.
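One way an application might keep the two failure modes apart is sketched below. So that the snippet runs on its own, the two exception classes are local stand-ins that copy only the names and attributes described on this page; a real application would import them from proofrail.exceptions. `Response` and `handle_request` are hypothetical helpers, and the HTTP-style status codes are just one possible mapping.

```python
from dataclasses import dataclass


# Local stand-ins for the SDK exceptions (assumption: only the names and
# attributes documented on this page are modeled).
class ActionDeniedError(Exception):
    """Stand-in: a policy denied one specific action."""


class ProofRailKillSwitchError(Exception):
    """Stand-in: the org-wide kill switch is active."""

    def __init__(self, reason: str, organization_id: str):
        super().__init__(
            "All agent actions are denied: organisation kill switch is active"
        )
        self.reason = reason
        self.organization_id = organization_id


@dataclass
class Response:
    status: int
    body: str


def handle_request(workflow) -> Response:
    """Run a workflow callable and map each failure mode to its own response."""
    try:
        return Response(200, workflow())
    except ProofRailKillSwitchError as exc:
        # Org-wide halt: retrying will not help until an admin resumes.
        return Response(503, f"Agents are paused for maintenance: {exc.reason}")
    except ActionDeniedError as exc:
        # Policy denial: only this one action was blocked.
        return Response(403, f"Action denied by policy: {exc}")
```

Catching the kill-switch error before the policy denial keeps the handling correct whichever way the two classes are related in the real exception hierarchy.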

What the dashboard shows

While the kill switch is active, every dashboard page shows a banner across the top indicating agents are paused. Anyone with dashboard access knows immediately that the org is in halt state and can see who activated it and why. The settings page shows:
  • Current pause state and timestamp
  • Who activated it and the reason they gave
  • A Resume button (admin-only)

How to resume

To resume, an admin clicks the Resume button on the settings page; resuming requires the admin role. Once resumed, agents can immediately start new chains. There’s no cooldown period, so if the halt was a mistake you can undo it instantly.

SDK behavior detail

The kill switch check happens at the start of every policy evaluation on the backend. Specifically:
  1. Your SDK records an event
  2. The backend receives the event
  3. First step on the backend: check if the org’s kill switch is active
  4. If active, return a denial with kill-switch context
  5. The SDK translates this into ProofRailKillSwitchError
This check is fast (a single database lookup) and runs before any other policy. No backend policy can override the kill switch: it’s the highest-priority rule.
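The ordering in the steps above can be illustrated with a toy evaluator. This is a sketch of the described behavior, not the actual backend: `PAUSE_STATE`, `Decision`, and `evaluate` are hypothetical names, and an in-memory dictionary stands in for the database lookup.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Decision:
    allowed: bool
    kill_switch: bool = False
    reason: Optional[str] = None


# Hypothetical stand-in for the per-org pause state lookup:
# org_id -> halt reason, or None when the org is not halted.
PAUSE_STATE = {"org_123": None}


def evaluate(org_id: str, action: str,
             policies: List[Callable[[str], bool]]) -> Decision:
    # Step 3: the kill switch is checked before any policy runs.
    halt_reason = PAUSE_STATE.get(org_id)
    if halt_reason is not None:
        # Step 4: a denial that carries kill-switch context, which the
        # SDK would turn into ProofRailKillSwitchError (step 5).
        return Decision(allowed=False, kill_switch=True, reason=halt_reason)
    # Regular policies are consulted only when the org is not halted.
    if all(policy(action) for policy in policies):
        return Decision(allowed=True)
    return Decision(allowed=False, reason="denied by policy")
```

Because the function returns before the policy list is touched, even a policy that allows everything cannot let an action through while the org is halted.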
Fast-path local evaluation does not currently check the kill switch. If the SDK takes the local fast-path for an action (because it matches the safe-action criteria), the kill switch doesn’t apply to that action: the action proceeds without reaching the backend.

This is a known limitation. The fast-path is meant for low-risk reads, which are actions that don’t typically need to be halted. But if your threat model requires halting everything, including reads, set enable_local_fast_path=False in proofrail.init() so that every action goes through the backend.

Full kill-switch enforcement across the fast-path is on the roadmap (background polling of pause state from the SDK).
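The limitation is easier to see as a decision table. The function below is a simplified model of the routing described above, not the SDK's real code; `route_action` and its parameters are hypothetical, with `fast_path_enabled` playing the role of the `enable_local_fast_path` setting.

```python
def route_action(action_is_safe_read: bool,
                 fast_path_enabled: bool,
                 org_halted: bool) -> str:
    """Return what happens to a single action in this simplified model."""
    if fast_path_enabled and action_is_safe_read:
        # Evaluated locally: the backend, and therefore the kill switch,
        # is never consulted.
        return "allowed locally"
    # Everything else reaches the backend, which checks the kill switch first.
    return "halted" if org_halted else "evaluated by backend policies"
```

In this model, a safe read during a halt returns "allowed locally" when the fast-path is on and "halted" when it is off, which is why disabling the fast-path is the way to make the halt cover reads as well.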

ProofRailKillSwitchError attributes

When you catch the exception, the available data is:
Attribute          Description
message            Default: "All agent actions are denied: organisation kill switch is active"
organization_id    The org the kill switch applies to
reason             The reason the admin gave when activating
See Exceptions reference for the full exception hierarchy.

Audit trail

Every activation and deactivation is logged in your organization’s audit log. The audit log is admin-only and append-only — kill switch events cannot be deleted or modified after the fact.

Receipts during a halt

Receipts are still generated for chains that were interrupted by a kill switch. The receipt records:
  • All events that successfully completed before the halt
  • The exact event where the halt occurred
  • The kill switch reason
  • A summary indicating the chain was interrupted
This means: even in an emergency halt, you don’t lose the audit trail. Anything that did happen before the halt is sealed, signed, and verifiable.
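As a rough picture of what such a receipt carries, the sketch below models the four items listed above as a small data structure. All field names are hypothetical and chosen for illustration; the real receipt format is not described on this page, and sealing and signing are not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Receipt:
    """Illustrative shape of a receipt; not ProofRail's actual schema."""
    chain_name: str
    completed_events: List[str] = field(default_factory=list)
    interrupted_at: Optional[str] = None   # event where the halt occurred
    halt_reason: Optional[str] = None      # reason given by the admin

    @property
    def summary(self) -> str:
        count = len(self.completed_events)
        if self.interrupted_at is None:
            return f"completed: {count} events"
        return (f"interrupted by kill switch at '{self.interrupted_at}' "
                f"after {count} completed events")


# Example: a chain halted on its third event.
receipt = Receipt(
    chain_name="refund-workflow",
    completed_events=["lookup_order", "check_balance"],
    interrupted_at="issue_refund",
    halt_reason="security incident",
)
```

Keeping the completed events separate from the interrupted one makes it clear which actions actually took effect before the halt.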

When NOT to use the kill switch

A few cases where the kill switch is the wrong tool:
  • Routine policy enforcement — that’s what regular policies are for. If you’re halting because “this kind of thing keeps happening,” tighten the policy instead.
  • Pausing one agent — the kill switch is org-wide. To pause a single agent, set its risk tier to high (forces approval on all actions) or disable the agent in your code.
  • Pausing one workflow — the kill switch affects all chains. To pause a specific workflow, stop calling its entry point.
  • Slowing things down for testing — use shadow mode on your policies instead. Shadow mode lets you see what would have happened without disrupting your agents.
The kill switch is a blunt instrument by design. It’s meant for emergencies, not routine operations.

Where to go next

  • Human approval: the more common case, pausing one action for review.
  • Policies: tightening rules instead of halting after the fact.
  • Audit receipts: what gets recorded when a halt happens.
  • Configuration: disable the fast-path to make the kill switch absolute.