Shadow Mode
Shadow mode is the safety mechanism that ensures no agent acts on real data until its decision quality has been shown to match or exceed that of human operators.
How it works
When an agent runs in shadow mode:
- Observe — The agent gathers context from banking services (accounts, transactions, compliance data, etc.)
- Decide — The agent produces a structured decision with confidence score, risk level, and reasoning
- Log — The decision is recorded in the audit trail alongside the eventual human decision
- Skip act — The agent does not execute any action, regardless of confidence or risk level
The human operator handles the case as they normally would, unaware of the agent's shadow decision. Later, the platform compares the two.
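The four steps above can be sketched as a single function. This is a minimal illustration, not the platform's implementation: `run_agent`, the `Decision` dataclass, and the `observe`/`decide`/`log`/`act` callables are hypothetical names introduced here, and the risk-threshold gating that applies in active mode is left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    decision: str      # e.g. "approve", "reject", "escalate"
    confidence: float  # 0-100
    risk_level: str    # "LOW", "MEDIUM", "HIGH", or "CRITICAL"
    reasoning: str


def run_agent(
    case: dict,
    observe: Callable[[dict], dict],
    decide: Callable[[dict], Decision],
    log: Callable[[Decision], None],
    act: Callable[[Decision], None],
    mode: str = "shadow",
) -> Decision:
    """Run one observe/decide/log cycle (hypothetical sketch)."""
    context = observe(case)     # Observe: gather context for the case
    decision = decide(context)  # Decide: produce a structured decision
    log(decision)               # Log: always recorded, in every mode
    if mode != "shadow":
        # Skip act: in shadow mode this branch is never taken,
        # whatever the confidence or risk level.
        act(decision)
    return decision
```

Passing a list's `append` method as `log` and as `act` makes it easy to confirm that a shadow run writes to the audit log but never triggers an action.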
Decision comparison
Every shadow decision is paired with the human outcome:
| Field | Description |
|---|---|
| decision | What the agent decided (approve, reject, escalate, resolve, alert, skip) |
| confidence | Agent's confidence in its decision (0-100%) |
| risk_level | Risk classification (LOW, MEDIUM, HIGH, CRITICAL) |
| reasoning | Natural language explanation of why the agent decided this way |
| human_decision | What the human actually decided |
| human_agreed | Whether the human's decision matched the agent's |
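As a rough illustration, the record above could be modelled as follows. The field names come from the table; the class name `ShadowDecisionRecord`, the method `record_human_outcome`, and the rule that agreement means an exact match of decision labels are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShadowDecisionRecord:
    decision: str
    confidence: float
    risk_level: str
    reasoning: str
    # Both human fields stay empty until the operator resolves the case.
    human_decision: Optional[str] = None
    human_agreed: Optional[bool] = None

    def record_human_outcome(self, human_decision: str) -> None:
        """Attach the human outcome and derive human_agreed.

        Assumption: agreement is an exact match of the decision labels.
        """
        self.human_decision = human_decision
        self.human_agreed = human_decision == self.decision
```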
2-week observation period
The minimum observation period is 14 days. During this time:
- The agent must have processed a statistically significant number of cases (varies by agent type, typically 50+ decisions)
- The platform tracks the agreement rate between agent and human decisions
- Daily metrics are rolled up including total runs, auto-resolved count, escalated count, and agreement rates
The 2-week minimum is a platform default. Tenants can extend the observation period but cannot shorten it below 14 days.
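A daily roll-up of this kind might look like the sketch below. The function name `daily_rollup` and the input record shape (`day`, `decision`, `human_agreed`) are assumptions for illustration; the auto-resolved count is omitted because the source does not say how it is derived.

```python
from collections import defaultdict
from datetime import date


def daily_rollup(records):
    """Aggregate shadow decisions per day (hypothetical sketch).

    Each record is a dict with 'day' (date), 'decision' (str), and
    'human_agreed' (bool, or None if no human outcome exists yet).
    """
    days = defaultdict(
        lambda: {"total_runs": 0, "escalated": 0, "compared": 0, "agreed": 0}
    )
    for r in records:
        m = days[r["day"]]
        m["total_runs"] += 1
        if r["decision"] == "escalate":
            m["escalated"] += 1
        if r["human_agreed"] is not None:
            # Only decisions with a human outcome count toward agreement.
            m["compared"] += 1
            m["agreed"] += int(r["human_agreed"])

    rollup = {}
    for day, m in days.items():
        rate = m["agreed"] / m["compared"] if m["compared"] else None
        rollup[day] = {**m, "agreement_rate": rate}
    return rollup
```

Leaving the rate as `None` on days with no compared decisions avoids reporting a misleading 0% before any human outcomes have arrived.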
Activation threshold
An agent is considered ready for activation when:
- The observation period (minimum 14 days) has elapsed
- The agent has made enough decisions for statistical significance
- The agreement rate meets the threshold (default: 85%)
- There are no CRITICAL-risk disagreements (cases where the agent would have made a dangerously wrong decision)
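The four conditions above combine into a single boolean check. The sketch below uses the defaults stated in this section (14 days, 50 decisions, 85%); `ShadowStats` and `ready_for_activation` as a local function are hypothetical names, and the platform's own computation may differ.

```python
from dataclasses import dataclass


@dataclass
class ShadowStats:
    days_observed: int
    decisions_compared: int      # decisions that have a human outcome
    agreements: int              # of those, how many the human agreed with
    critical_disagreements: int  # CRITICAL-risk cases where they differed


def ready_for_activation(
    stats: ShadowStats,
    min_days: int = 14,
    min_decisions: int = 50,
    min_agreement: float = 0.85,
) -> bool:
    """Return True only if every activation condition holds (sketch)."""
    if stats.days_observed < min_days:
        return False
    if stats.decisions_compared == 0 or stats.decisions_compared < min_decisions:
        return False
    if stats.critical_disagreements > 0:
        return False
    return stats.agreements / stats.decisions_compared >= min_agreement
```

For example, an agent with 100 compared decisions and 90 agreements over 14 days passes, while the same agent with a single CRITICAL-risk disagreement does not.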
You can check activation readiness via the metrics endpoint:
curl https://api.korastratum.com/ai/api/v1/ai/agents/reconciliation_agent/metrics \
  -H "Authorization: Bearer $KORA_API_KEY" \
  -H "X-Tenant-ID: $KORA_TENANT_ID"
The response includes a ready_for_activation boolean and a shadow_comparison object with detailed agreement statistics.
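The same check can be made from a script. The sketch below uses only Python's standard library and mirrors the URL and headers of the curl call; the helper names are hypothetical, and it assumes `ready_for_activation` is a top-level key in the JSON response, which this page does not confirm.

```python
import json
import urllib.request

BASE_URL = "https://api.korastratum.com/ai/api/v1/ai"


def build_metrics_request(agent_id: str, api_key: str, tenant_id: str):
    """Build (but do not send) the GET request for an agent's metrics."""
    return urllib.request.Request(
        f"{BASE_URL}/agents/{agent_id}/metrics",
        headers={
            "Authorization": f"Bearer {api_key}",
            "X-Tenant-ID": tenant_id,
        },
        method="GET",
    )


def is_ready(agent_id: str, api_key: str, tenant_id: str) -> bool:
    """Fetch metrics and read the readiness flag.

    Assumption: the flag is a top-level key in the response body.
    """
    req = build_metrics_request(agent_id, api_key, tenant_id)
    with urllib.request.urlopen(req, timeout=10) as resp:
        payload = json.load(resp)
    return bool(payload.get("ready_for_activation", False))
```

In practice the key and tenant ID would be read from the `KORA_API_KEY` and `KORA_TENANT_ID` environment variables, as in the curl examples.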
Promoting to active mode
Once an agent is ready, promote it by updating its configuration:
curl -X PUT https://api.korastratum.com/ai/api/v1/ai/agents/reconciliation_agent/config \
  -H "Authorization: Bearer $KORA_API_KEY" \
  -H "X-Tenant-ID: $KORA_TENANT_ID" \
  -H "Content-Type: application/json" \
  -d '{
    "mode": "active",
    "auto_act_threshold": "MEDIUM"
  }'
Promoting an agent to active mode means it will execute actions autonomously for decisions at or below the configured risk threshold. Decisions above the threshold are escalated to humans via the workflow system.
Risk thresholds in active mode
When an agent is active, the auto_act_threshold controls which decisions it can execute autonomously:
| Threshold | Agent can auto-act on |
|---|---|
| LOW | Only LOW-risk decisions |
| MEDIUM | LOW and MEDIUM-risk decisions |
| HIGH | LOW, MEDIUM, and HIGH-risk decisions |
| CRITICAL | All decisions (not recommended) |
Decisions above the threshold create a human approval request in the workflow system.
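The threshold rule amounts to an ordered comparison of risk levels. A minimal sketch follows, assuming the ordering LOW < MEDIUM < HIGH < CRITICAL implied by the table; `Risk`, `can_auto_act`, `route`, and the route labels are hypothetical names, not platform identifiers.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def can_auto_act(decision_risk: str, auto_act_threshold: str) -> bool:
    """True if the decision's risk is at or below the threshold."""
    return Risk[decision_risk] <= Risk[auto_act_threshold]


def route(decision_risk: str, auto_act_threshold: str) -> str:
    """Pick a path for an active-mode decision (labels are illustrative)."""
    if can_auto_act(decision_risk, auto_act_threshold):
        return "execute"
    return "request_human_approval"
```

With a threshold of MEDIUM, `route("LOW", "MEDIUM")` and `route("MEDIUM", "MEDIUM")` both execute, while `route("HIGH", "MEDIUM")` falls through to human approval.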
Reverting to shadow mode
You can always revert an agent to shadow mode:
curl -X PUT https://api.korastratum.com/ai/api/v1/ai/agents/reconciliation_agent/config \
  -H "Authorization: Bearer $KORA_API_KEY" \
  -H "X-Tenant-ID: $KORA_TENANT_ID" \
  -H "Content-Type: application/json" \
  -d '{"mode": "shadow"}'
This is a non-destructive operation. The agent immediately stops acting but continues observing and deciding for comparison.