# Amodal

> Documentation for the Amodal Agent Runtime — build domain-specific AI agents from your repo.

## @amodal/chat-widget

A standalone, embeddable chat widget for adding agent-powered chat to any web application. Lighter than `@amodal/react` — just the chat interface with no extra components.

### Installation

```bash
npm install @amodal/chat-widget
```

### Quick Start

```tsx
import { ChatWidget } from '@amodal/chat-widget'
import '@amodal/chat-widget/style.css'

function App() {
  return (
    <ChatWidget position="floating" />
  )
}
```

### Positions

| Position   | Behavior                                       |
| ---------- | ---------------------------------------------- |
| `inline`   | Renders in-place within your layout            |
| `floating` | Floating button that expands into a chat panel |
| `right`    | Fixed panel on the right side                  |
| `bottom`   | Fixed panel at the bottom                      |

### Callbacks

```tsx
<ChatWidget
  onToolCall={(tool, args) => {
    console.log(`Agent called: ${tool}`, args)
  }}
  onKBProposal={(proposal) => {
    console.log('Knowledge proposal:', proposal)
  }}
/>
```

### SSE Events

The widget handles these event types from the runtime:

| Event                  | Description                    |
| ---------------------- | ------------------------------ |
| `text`                 | Streaming text output          |
| `tool_call`            | Tool invocation                |
| `skill_activated`      | Skill activation               |
| `kb_proposal`          | Knowledge base update proposal |
| `ConfirmationRequired` | Write operation needs approval |

### Theming

CSS custom properties — no Tailwind dependency:

```css
.pcw-widget {
  --pcw-primary: #6e56cf;
  --pcw-background: #ffffff;
  --pcw-text: #1a1a1a;
  --pcw-border: #e5e5e5;
  --pcw-radius: 12px;
}
```

### Bundle Size

| Format | Size  |
| ------ | ----- |
| ESM    | ~18KB |
| UMD    | ~13KB |
| CSS    | ~6KB  |

## React SDK

Embed Amodal agents in your product with two React packages:

* **[@amodal/react](/sdk/react)** — High-level components: `AmodalProvider`, `AmodalChat`, `AmodalAction`, and hooks for common agent interactions.
* **[@amodal/chat-widget](/sdk/chat-widget)** — Standalone chat widget with SSE streaming, theming, and callbacks.
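For non-widget clients consuming the runtime's SSE stream directly, the events can be parsed with a small helper. This is a hedged sketch: the event names come from the events table, but the wire framing (standard `event:`/`data:` SSE lines with JSON payloads) is an assumption, not a documented contract.

```typescript
// Hedged sketch: parse a raw SSE buffer into typed runtime events.
// Event names come from the docs; the `event:`/`data:` framing with
// JSON payloads is an assumed standard-SSE encoding.
type RuntimeEvent = {
  type: "text" | "tool_call" | "skill_activated" | "kb_proposal" | "ConfirmationRequired";
  data: unknown;
};

function parseSSE(buffer: string): RuntimeEvent[] {
  const events: RuntimeEvent[] = [];
  // SSE messages are separated by a blank line.
  for (const block of buffer.split("\n\n")) {
    let type = "";
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) type = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
    }
    if (!type || dataLines.length === 0) continue;
    events.push({ type, data: JSON.parse(dataLines.join("\n")) } as RuntimeEvent);
  }
  return events;
}
```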
### Quick Start

```bash
npm install @amodal/react
```

```tsx
import { AmodalProvider, AmodalChat } from '@amodal/react'

function App() {
  return (
    <AmodalProvider>
      <AmodalChat />
    </AmodalProvider>
  )
}
```

### Core SDK

For server-side or non-React usage, the `@amodal/core` package provides the full runtime:

```typescript
import { AgentSDK } from '@amodal/core'

const sdk = new AgentSDK({
  platformApiUrl: process.env.PLATFORM_API_URL,
  platformApiKey: process.env.PLATFORM_API_KEY,
  orgId: 'org_123',
})

await sdk.initialize()
const config = sdk.getConfig()
```

#### Runtime Capabilities

The SDK provides out of the box:

* **ReAct loop** with configurable max turns and timeout
* **Smart compaction** — structured state snapshots across context compression
* **Loop detection** — pattern matching + LLM-based detection
* **Tool output masking** — backward-scan FIFO that prunes bulky outputs
* **Task dispatch** — parallel sub-agent execution with depth limiting
* **Knowledge loading** — on-demand KB documents
* **Role-based filtering** — tools and skills scoped by role
* **Audit logging** — every tool call and session logged
* **SSE streaming** — real-time events for web clients
* **MCP support** — Model Context Protocol client

## @amodal/react

High-level React components for embedding Amodal agents in your product. Provides a provider, chat UI, action components, and hooks.
### Installation

```bash
npm install @amodal/react
```

### Components

#### AmodalProvider

Wraps your app with agent context:

```tsx
import { AmodalProvider } from '@amodal/react'

<AmodalProvider>
  {children}
</AmodalProvider>
```

#### AmodalChat

Full chat interface:

```tsx
import { AmodalChat } from '@amodal/react'

<AmodalChat />
```

#### AmodalAction

Trigger agent actions from buttons or other UI elements:

```tsx
import { AmodalAction } from '@amodal/react'

<AmodalAction>
  Get Summary
</AmodalAction>
```

### Hooks

| Hook               | Description                            |
| ------------------ | -------------------------------------- |
| `useAmodalBrief`   | Get a quick agent summary on a topic   |
| `useAmodalInsight` | Request a detailed analysis            |
| `useAmodalTask`    | Start and track background tasks       |
| `useAmodalQuery`   | Run a query and get structured results |

### Confirmation & Review

Built-in confirmation and review UIs for write operations. When the agent needs user approval, these components render automatically within the chat flow.

### Bundle Size

~26.7KB ESM

## Quick Start

### 1. Initialize a project

```bash
npx amodal init
```

This scaffolds an `amodal.json` config file, a sample skill, and a sample knowledge document in your repo. The init command is interactive — it asks for your product type and sets up appropriate templates.

### 2. Configure your provider

Set your LLM provider credentials. Amodal auto-detects from environment variables:

```bash
# Pick one:
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GOOGLE_API_KEY=...
```

Or configure explicitly in `amodal.json`:

```json
{
  "name": "My Agent",
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514"
}
```

### 3. Add a connection

Install a pre-built plugin or create a custom connection:

```bash
# Install a plugin (e.g., Slack, GitHub, Datadog)
amodal connect slack

# Or sync from an OpenAPI spec
amodal sync --from https://api.example.com/openapi.json
```

Connections live in `connections/` and give the agent both API access and documentation.

### 4. Start the dev server

```bash
amodal dev
```

This starts the runtime server on `localhost:3847` with:

* Hot reload — edit any config file and the agent updates instantly
* File watching with 300ms debounce
* Session management with TTL

### 5. Chat with your agent

```bash
amodal chat
```

This opens a terminal chat UI (a React-based TUI built with Ink). You can also connect to a remote server:

```bash
amodal chat --url http://localhost:3847
```

Or resume a previous session:

```bash
amodal chat --resume latest
```

### 6. Validate and inspect

```bash
amodal validate   # Check for missing connections, config issues
amodal inspect    # Show compiled context with token counts
```

### What's Next

* **[Project Structure](/quickstart/project-structure)** — Understand the repo layout
* **[CLI Reference](/cli)** — All 30+ commands
* **[Add skills](/guide/skills)** — Author reasoning frameworks
* **[Deploy](/cli/deploy)** — Ship to production

## Introduction

Amodal is a **git-repo-centric agent runtime**. Your agent's configuration — connections, skills, knowledge, tools, automations — lives as files in your repository. The runtime reads these files, compiles them into an optimized context, and runs a reasoning loop against any supported LLM provider.
### How It Works

```
Your Repo
├── amodal.json      ← agent identity, provider config
├── connections/     ← API credentials + docs (or plugins)
├── skills/          ← Markdown reasoning frameworks
├── knowledge/       ← Domain knowledge documents
├── tools/           ← Custom HTTP/chain/function tools
├── automations/     ← Scheduled or trigger-based runs
└── evals/           ← Test cases for agent quality
        ↓  amodal dev
Runtime Server (localhost:3847)
├── Context Compiler ← builds optimized prompts
├── Token Allocator  ← manages context window budget
├── Security Layer   ← field scrubbing, output guards, action gates
├── Provider Adapter ← Anthropic / OpenAI / Gemini / Bedrock / Azure
└── Session Manager  ← TTL-based sessions with hot reload
```

### The Core Loop

Every agent runs the same fundamental cycle:

```
Explore → query connected systems, load knowledge, gather context
Plan    → reason about findings, decide next steps
Execute → call APIs, dispatch sub-agents, present results, learn
```

Simple questions skip planning. Complex questions get the full loop with multi-agent dispatch and skill activation. The runtime matches depth to the question automatically.

### Key Capabilities

| Capability                  | What It Does                                                                       |
| --------------------------- | ---------------------------------------------------------------------------------- |
| **Multi-provider**          | Anthropic Claude, OpenAI, Google Gemini, AWS Bedrock, Azure OpenAI — with failover |
| **Git-native config**       | Everything is a file. Version, diff, review, and deploy your agent like code       |
| **20+ connection plugins**  | Slack, GitHub, Stripe, Datadog, Jira, PagerDuty, Salesforce, and more              |
| **Security infrastructure** | Field scrubbing, output guards, action gates, scope checking, leak detection       |
| **Evaluation framework**    | LLM-judged evals, experiments, cost tracking, multi-model comparison               |
| **Hot reload**              | File watcher on your repo — edit config, agent updates instantly                   |
| **React SDK**               | `@amodal/react` components and `@amodal/chat-widget` for embedding                 |
| **Snapshot deployment**     | Build → snapshot → deploy to platform or self-host                                 |

### Next Steps

* **[Quick Start](/quickstart/create-agent)** — Build your first agent
* **[Project Structure](/quickstart/project-structure)** — What goes where in your repo
* **[CLI Overview](/cli)** — All available commands

## Project Structure

Everything that defines your agent lives in your repo root. The runtime reads these files at startup and watches for changes during development.
```
my-agent/
├── amodal.json             ← root config: name, models, platform, sandbox, MCP
├── connections/            ← API connections (plugins or custom)
│   ├── slack/
│   │   ├── spec.json       ← API source, auth, entities, sync config
│   │   ├── access.json     ← field restrictions, action tiers, scoping
│   │   ├── surface.md      ← (optional) endpoint documentation
│   │   ├── entities.md     ← (optional) entity definitions
│   │   └── rules.md        ← (optional) business rules
│   └── internal-api/
│       ├── spec.json
│       └── access.json
├── skills/                 ← Markdown reasoning frameworks
│   └── triage/
│       └── SKILL.md
├── knowledge/              ← Domain knowledge documents
│   ├── environment.md
│   └── baselines.md
├── tools/                  ← Custom tool handlers
│   └── create_ticket/
│       ├── tool.json       ← (optional) metadata, parameters, confirmation
│       └── handler.ts      ← handler code
├── stores/                 ← Persistent data store schemas
│   └── active-alerts.json
├── pages/                  ← React UI views (dashboards, briefs)
│   ├── ops-dashboard.tsx
│   └── morning-brief.tsx
├── automations/            ← Scheduled or trigger-based runs
│   └── daily-digest.json
├── evals/                  ← Test cases for agent quality
│   └── triage-accuracy.md
├── agents/                 ← subagents + built-in overrides
│   ├── explore/            ← override: replaces default explore agent
│   │   └── AGENT.md
│   ├── compliance-checker/ ← custom subagent
│   │   └── AGENT.md
│   └── vendor-lookup/      ← custom subagent
│       └── AGENT.md
└── .amodal/
    └── store-data/         ← PGLite data directory (gitignored)
```

### Config Reference

| File/Directory       | What It Does                                           | Docs                                    |
| -------------------- | ------------------------------------------------------ | --------------------------------------- |
| `amodal.json`        | Agent identity, models, platform, sandbox, MCP servers | [amodal.json](/guide/config)            |
| `connections/`       | API credentials + docs + access rules                  | [Connections](/guide/connections)       |
| `skills/`            | Expert reasoning frameworks                            | [Skills](/guide/skills)                 |
| `knowledge/`         | Domain knowledge documents                             | [Knowledge Base](/guide/knowledge-base) |
| `tools/`             | Custom tool handlers with code                         | [Tools](/guide/tools)                   |
| `stores/`            | Persistent data store schemas                          | [Stores](/guide/stores)                 |
| `pages/`             | React UI views (dashboards, briefs, detail screens)    | [Pages](/guide/pages)                   |
| `automations/`       | Scheduled/triggered agent runs                         | [Automations](/guide/automations)       |
| `evals/`             | Test cases and assertions                              | [Evals](/guide/evals)                   |
| `agents/`            | Custom subagents + built-in agent overrides            | [Agents](/guide/agents)                 |
| MCP in `amodal.json` | External tool servers via Model Context Protocol       | [MCP Servers](/guide/mcp)               |

### Hot Reload

During `amodal dev`, the runtime watches all config files for changes with a 300ms debounce. Edit any file and the agent picks up changes instantly — no restart needed.

## Learn

> Architecture, patterns, and use cases

Amodal is a git-repo-centric agent runtime built on the explore-plan-execute loop. This section covers how the architecture works, why decisions were made, and how different industries use the platform.

### Architecture

* **[The Core Loop](/learn/architecture/core-loop)** — Explore, plan, execute — and how the runtime matches depth to the question.
* **[Agent Architecture](/learn/architecture/agents)** — Primary agents, task agents, context isolation, and multi-agent coordination.
* **[Context Management](/learn/architecture/context)** — Smart compaction, tool output masking, and how agents stay under context limits.

### Use Cases

* **[Security Operations](/learn/use-cases/security)** — Triage alerts, hunt threats, investigate incidents, and hand off across shifts.
* **[Financial Analysis](/learn/use-cases/finance)** — Reconcile records, explain anomalies, and monitor financial metrics.
* **[IT Operations](/learn/use-cases/it-ops)** — Incident response, environment scanning, and proactive monitoring.
* **[Customer Support](/learn/use-cases/support)** — Route tickets, gather context, and accelerate resolution.
### Detailed Reference

For implementation details, see the Reference section covering the runtime server, platform API, and admin UI.

## Financial Analysis

Finance teams use Amodal to **reconcile records across systems**, **explain metric anomalies**, and **monitor financial health**. The agent connects to accounting and payment systems, loads financial analysis skills, and reasons about discrepancies with precision.

### Reconciliation

```
User: "Reconcile yesterday's QuickBooks entries against Stripe"

Agent activates: Financial Reconciliation skill
→ Dispatches parallel task agents:
  1. Pull QuickBooks transactions for yesterday
  2. Pull Stripe charges and payouts for yesterday
  3. Load matching rules from KB
→ Matches records by amount, timestamp, and reference
→ Identifies 3 unmatched entries with explanations
→ Presents comparison table with discrepancy details
```

### Anomaly Explanation

```
User: "Why is this month's revenue 15% below forecast?"

Agent activates: Anomaly Explanation skill
→ Gathers revenue data from QuickBooks
→ Pulls customer churn data from CRM
→ Checks for pricing changes, seasonal patterns
→ Correlates with marketing spend changes
→ Explains: "Two enterprise customers churned (accounting for 12%),
  combined with a seasonal dip (3%)"
```

### Key Connections

| System                | What It Provides                           |
| --------------------- | ------------------------------------------ |
| QuickBooks            | Accounting data, invoices, journal entries |
| Stripe                | Payment transactions, payouts, disputes    |
| Shopify               | Commerce data, orders, refunds             |
| Excel / Google Sheets | Custom reports, forecasts                  |

### Relevant Skills

* **Financial Reconciliation** — match records across systems, explain discrepancies
* **Anomaly Explanation** — identify why a metric deviated from baseline
* **Triage** — prioritize financial alerts and exceptions

## IT Operations

IT operations teams use Amodal to **respond to incidents**, **scan environments for changes**, and **monitor infrastructure health**.
The agent connects to cloud providers, monitoring tools, and incident management systems.

### Incident Response

```
User: "The checkout API is returning 500s"

Agent activates: Incident Response skill
→ Dispatches parallel task agents:
  1. Query Datadog for checkout-api error metrics
  2. Check recent deployments to checkout-api
  3. Query dependent service health (database, cache, payment gateway)
  4. Pull PagerDuty alert history
→ Identifies: database connection pool exhaustion after a traffic spike
→ Recommends: increase pool size, add connection retry logic
→ Offers to create a Jira ticket with findings
```

### Proactive Monitoring

Set up automations to catch issues before users notice:

```typescript
await client.automations.create({
  name: 'Infrastructure Health Check',
  prompt: `Scan all monitored services for:
- Error rates above baseline
- Latency degradation
- Resource utilization above 80%
- Certificate expirations within 30 days
Report findings by severity.`,
  schedule: '0 */4 * * *', // Every 4 hours
  output: { channel: 'slack', target: '#ops-alerts' },
  skills: ['triage'],
})
```

### Key Connections

| System               | What It Provides                           |
| -------------------- | ------------------------------------------ |
| AWS / GCP / Azure    | Cloud resource state, logs, metrics        |
| Datadog / New Relic  | APM, infrastructure metrics, log analytics |
| PagerDuty / OpsGenie | Alert management, oncall routing           |
| Jira / Linear        | Ticket creation, incident tracking         |

### Relevant Skills

* **Incident Response** — gather context, assess impact, coordinate response
* **Triage** — scan, prioritize, filter noise
* **Deep Dive** — exhaustive service profiling

## Security Operations

Security teams use Amodal to **triage alerts**, **hunt threats**, **investigate incidents**, and **hand off across shifts**. The agent connects to security tools (SIEM, EDR, vulnerability scanners), loads security-specific skills, and reasons about findings with domain expertise.
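The noise-filtering step the Triage skill performs can be sketched as a simple pass over an alert batch. This is a hedged, illustrative sketch — the alert shape, the KB-backed false-positive list, and the severity ranking are all assumptions, not the runtime's data model.

```typescript
// Hedged sketch: drop alerts whose rule matches a known false positive
// (as the KB would record), then surface the most severe first.
// All shapes here are illustrative assumptions.
type Alert = {
  id: string;
  rule: string;
  severity: "low" | "medium" | "high" | "critical";
};

const severityRank = { low: 0, medium: 1, high: 2, critical: 3 };

function triage(alerts: Alert[], knownFalsePositives: Set<string>): Alert[] {
  return alerts
    .filter((a) => !knownFalsePositives.has(a.rule)) // drop known noise
    .sort((a, b) => severityRank[b.severity] - severityRank[a.severity]);
}
```

In the runtime, the false-positive list would come from the knowledge base rather than a hard-coded set, which is what lets the flywheel shrink 47 alerts down to the few worth investigating.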
### Alert Triage

```
User: "Review the last hour of alerts"

Agent activates: Triage skill
→ Dispatches 3 parallel task agents:
  1. Query SIEM for alerts in the last hour
  2. Check known false positives in KB
  3. Pull recent deployment/change context
→ Filters noise: 47 alerts → 3 worth investigating
→ Presents findings with severity cards and timeline
```

### Threat Hunting

```
User: "Hunt for lateral movement from 10.0.3.42"

Agent activates: Threat Hunt skill
→ Dispatches task agents to query:
  - Network flow logs for 10.0.3.42
  - Authentication logs for the associated user
  - Process execution logs on the host
  - DNS query logs for unusual domains
→ Correlates findings across data sources
→ Builds a timeline of suspicious activity
→ Presents scope map showing affected systems
```

### Key Connections

| System                    | What It Provides                        |
| ------------------------- | --------------------------------------- |
| Datadog / Splunk          | SIEM data, log queries, metric analysis |
| CrowdStrike / SentinelOne | EDR telemetry, process trees, IOCs      |
| PagerDuty                 | Alert management, oncall routing        |
| Jira / ServiceNow         | Ticket creation, incident tracking      |
| Slack                     | Team communication, status updates      |

### Relevant Skills

* **Triage** — scan, prioritize, filter noise
* **Deep Dive** — exhaustive entity profiling
* **Threat Hunt** — proactive targeted search
* **Incident Response** — context gathering, impact assessment
* **Shift Handoff** — summarize findings for the next shift

## Customer Support

Support teams use Amodal to **gather context before responding**, **route tickets intelligently**, and **accelerate resolution** by pulling relevant information from internal systems automatically.
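The classification step behind intelligent routing can be illustrated with a naive keyword classifier. This is a hedged stand-in for the agent's LLM-based classification — the keyword lists and the default queue are illustrative assumptions, not product behavior.

```typescript
// Hedged sketch: keyword-based ticket classification as a stand-in for
// the agent's routing step. Keyword lists are illustrative assumptions.
type Category = "billing" | "technical" | "account" | "feature-request";

const keywords: Record<Category, string[]> = {
  billing: ["invoice", "charge", "refund", "payment"],
  technical: ["error", "crash", "500", "bug"],
  account: ["login", "password", "access"],
  "feature-request": ["feature", "request", "would be nice"],
};

function classify(ticketText: string): Category {
  const text = ticketText.toLowerCase();
  for (const [category, words] of Object.entries(keywords) as [Category, string[]][]) {
    if (words.some((w) => text.includes(w))) return category;
  }
  return "technical"; // assumed default queue
}
```

A real routing automation would let the agent reason over the full ticket plus account context rather than match keywords, but the input/output contract — ticket text in, category out — is the same.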
### Context Gathering

```
User: "Customer reports they can't complete checkout — ticket #4521"

Agent:
→ Pulls ticket details from Zendesk
→ Looks up customer account in the CRM
→ Checks for active incidents affecting checkout
→ Queries recent error logs for the customer's session
→ Checks if the customer's payment method has known issues
→ Presents: "Customer is on iOS 17.2 Safari, which has a known bug with
  our payment form (KB: known-issues/safari-17.2). The fix shipped in
  v2.3.1 but their CDN is still serving v2.3.0."
```

### Intelligent Routing

Set up automations to classify and route tickets:

```typescript
await client.automations.create({
  name: 'Ticket Router',
  prompt: `Classify the incoming ticket by:
1. Category (billing, technical, account, feature-request)
2. Severity (critical, high, medium, low)
3. Required expertise (frontend, backend, payments, security)
Route to the appropriate team queue.`,
  trigger: {
    source: 'webhook',
    filter: { 'event.type': 'ticket.created' },
  },
  output: { channel: 'webhook', target: 'https://api.zendesk.com/routing' },
})
```

### Key Connections

| System                    | What It Provides                    |
| ------------------------- | ----------------------------------- |
| Zendesk / Intercom        | Ticket details, customer history    |
| CRM (Salesforce, HubSpot) | Account info, subscription status   |
| Internal APIs             | Product data, feature flags, config |
| Monitoring (Datadog)      | Error logs, performance data        |

### KB Growth

Support agents contribute significantly to the learning flywheel. Every resolved ticket can produce:

* **Known issues**: bugs and workarounds
* **Resolution patterns**: common fixes for recurring problems
* **False positives**: alerts that look like customer issues but aren't

## Agent Architecture

Amodal uses a multi-agent architecture where a **primary agent** delegates data-intensive work to ephemeral **task agents**, keeping its own context clean for reasoning.

### Agent Types

#### Primary Agent

One per chat session. Handles the conversation, reasons about findings, decides next steps, and presents results. Never loads API docs or parses raw JSON directly.

#### Task Agents

Ephemeral workers dispatched by the primary agent (or by other task agents). Each gets its own fresh context, loads KB docs, queries systems, interprets raw responses, and returns a clean summary (~200-500 tokens). Context is discarded after.

#### Automations

Scheduled or triggered agent runs from installed marketplace automations plus custom automations. Admin configures parameters (frequency, channels), not logic.

### Context Isolation

This is the key architectural insight. Without task agents, raw data accumulates in the primary agent's context and never leaves:

```
Without task agents:                   With task agents:
[System prompt: 2K]                    [System prompt: 2K]
[User: "Why is cash flow negative?"]   [User: "Why is cash flow negative?"]
[API docs: 8K]      ← stuck forever    [Task result: 200 tokens]
[Raw JSON: 3K]      ← stuck forever    [Task result: 250 tokens]
[Pattern docs: 4K]  ← stuck forever    [Task result: 150 tokens]

Context: 28K+ and growing              Context: 3K — clean, focused
```

Task agents can dispatch sub-task agents (depth 2). Depth 3 is rare. Depth 4+ is blocked.

### Task Agent Capabilities

Task agents get a focused set of tools:

| Tool                   | Available           |
| ---------------------- | ------------------- |
| `request` (read-only)  | Yes                 |
| `shell_exec`           | Yes                 |
| `load_knowledge`       | Yes                 |
| `dispatch` (sub-tasks) | Yes (depth limited) |
| `present` (widgets)    | No                  |
| `propose_knowledge`    | No                  |
| Skills                 | No                  |

Task agents produce **200-500 token summaries** from 8-15K of processed data. The primary agent reasons about these summaries, not the raw data.

### Dispatch Example

```
User: "What happened to the payment service in the last hour?"

Primary Agent dispatches 3 task agents in parallel:
  1. "Query Datadog for payment-service metrics (error rate, latency, throughput)"
  2. "Check PagerDuty for alerts on payment-service in the last hour"
  3. "Query deployment logs for payment-service changes"

Each task agent:
  - Loads relevant KB docs (API documentation)
  - Makes API calls via `request`
  - Processes raw JSON responses
  - Returns a clean summary

Primary Agent receives 3 summaries (~600 tokens total)
→ Correlates: deployment at 2:15 PM matches error spike at 2:17 PM
→ Presents findings with timeline widget
```

## Context Management

The SDK provides several mechanisms to keep agent context clean and within limits, even during complex multi-step investigations.

### Smart Compaction

When context approaches the limit, the SDK performs **structured state snapshots** that preserve key findings while discarding intermediate reasoning and raw data.

What's preserved:

* Key findings and conclusions
* Active hypotheses
* Entities and relationships discovered
* User preferences from the session

What's discarded:

* Intermediate reasoning chains
* Raw tool outputs already summarized
* Superseded hypotheses

### Tool Output Masking

A **backward-scan FIFO** mechanism prunes bulky tool outputs while protecting recent context. Older tool responses are truncated or removed when newer, more relevant data arrives.

The mask operates on a priority system:

1. Recent tool outputs are protected
2. Older outputs with summaries already incorporated are candidates for removal
3. Large raw JSON responses are first to be pruned

### On-Demand Knowledge Loading

The agent starts with a **compact KB index** (~400 tokens for ~20 docs) rather than loading all knowledge upfront. Full documents are loaded via `load_knowledge` only when needed.

```
Prompt: [KB Index — 400 tokens]
  - system_docs/datadog: "Datadog API endpoints and authentication"
  - methodology/error-rates: "Error rate interpretation thresholds"
  - patterns/maintenance-windows: "Known maintenance window schedules"
  ...

Agent: "I need the Datadog API docs"
  → load_knowledge("system_docs/datadog")
  → Full document loaded into context (2K tokens)
```

Task agents load their own KB docs independently — the primary agent's context is unaffected.

### Loop Detection

The SDK detects when an agent enters unproductive loops:

* **Pattern matching** — repeated identical tool calls or reasoning patterns
* **LLM-based detection** — the model evaluates whether it's making progress

When a loop is detected, the agent is nudged to try a different approach or escalate to the user.

## The Core Loop

Every Amodal agent runs the same fundamental loop:

```
Explore → what's going on?     query systems, load context, gather data
Plan    → what should happen?  reason about findings, decide next steps
Execute → do it.               call APIs, dispatch agents, present results, learn
```

### Adaptive Depth

Not every question needs the full loop. The runtime matches depth to the question automatically:

| Question                           | Loop Behavior                                                           |
| ---------------------------------- | ----------------------------------------------------------------------- |
| "What's the current error rate?"   | Explore only — query and answer                                         |
| "Why did latency spike at 3 PM?"   | Explore + Plan — gather data, correlate, explain                        |
| "Investigate the payment failures" | Full loop — multi-agent dispatch, iterative reasoning, skill activation |

### The Compounding Effect

The loop compounds through the knowledge base. Every execution can feed knowledge back via `propose_knowledge`, so the next explore phase starts smarter.

```
Session 1:
  Explore → slow, everything is new
  Plan    → generic reasoning
  Execute → discover false positive, propose KB update

Session 50:
  Explore → fast, KB has patterns and baselines
  Plan    → informed reasoning with historical context
  Execute → focused on novel signals, skip known patterns
```

This is the flywheel — the system learns from use. See [Knowledge Base](/guide/knowledge-base) for details.
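The pattern-matching half of loop detection described earlier can be sketched as a check over the session's recent tool calls. This is a hedged illustration — the threshold, the call shape, and the identical-signature heuristic are assumptions, not the SDK's actual detector.

```typescript
// Hedged sketch of pattern-based loop detection: flag a session when the
// last N tool calls are identical (same tool, same serialized args).
// Threshold and shapes are illustrative assumptions.
type ToolCall = { tool: string; args: unknown };

function isLooping(calls: ToolCall[], threshold = 3): boolean {
  if (calls.length < threshold) return false;
  const tail = calls
    .slice(-threshold)
    .map((c) => `${c.tool}:${JSON.stringify(c.args)}`);
  // Looping if every recent call has the same signature.
  return tail.every((sig) => sig === tail[0]);
}
```

An LLM-based check would complement this by catching semantic loops — differently phrased calls that make no new progress — which signature matching cannot see.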
### ReAct Loop

Under the hood, the core loop is implemented as a **ReAct loop** (Reason + Act). The agent alternates between reasoning about what it knows and taking actions (tool calls) to learn more.

The SDK provides configurable limits:

* **Max turns** — prevent infinite loops
* **Timeout** — hard time limit on sessions
* **Loop detection** — pattern matching + LLM-based detection of unproductive states

## Agents

The `agents/` directory defines **custom subagents** and lets you **override built-in agents**. Each subdirectory is an agent with an `AGENT.md` file.

```
agents/
├── explore/             ← override: replaces default explore agent
│   └── AGENT.md
├── plan/                ← override: replaces default plan agent
│   └── AGENT.md
├── compliance-checker/  ← subagent: custom task agent
│   └── AGENT.md
└── vendor-lookup/       ← subagent: custom task agent
    └── AGENT.md
```

### Reserved Names (Overrides)

These directory names override built-in agents:

| Name      | What It Overrides                                                                 |
| --------- | --------------------------------------------------------------------------------- |
| `explore` | The explore sub-agent that gathers data from connected systems                    |
| `plan`    | The plan agent that reasons before executing                                      |
| `main`    | The primary agent prompt (backward compat: `agents/main.md` flat file also works) |

Override agents use the raw `AGENT.md` content as the system prompt.

### Custom Subagents

Any directory that isn't a reserved name defines a **custom subagent** — a reusable task agent the primary agent can dispatch by name for specialized work.

#### AGENT.md Format (heading-based)

```markdown
# Agent: Compliance Checker

Checks regulatory compliance across transactions and flags violations.

## Config

tools: [shell_exec, load_knowledge, dispatch]
maxDepth: 2
maxToolCalls: 15
timeout: 60
modelTier: advanced
targetOutputMin: 200
targetOutputMax: 500

## Prompt

You are a compliance specialist. When dispatched:

1. Load the relevant compliance KB documents for the regulation in question
2. Query the transaction system for the entities specified
3. Check each entity against the compliance rules
4. Return a structured report:
   - Compliant items (brief)
   - Violations (detailed, with rule references)
   - Recommendations
```

#### AGENT.md Format (frontmatter)

```markdown
---
displayName: Vendor Lookup
description: Enriches vendor profiles from CRM and public data
tools: [shell_exec, load_knowledge]
maxDepth: 1
maxToolCalls: 10
timeout: 30
modelTier: simple
---

Query the vendor management system for the requested vendor.
Cross-reference with public data sources.

Return a standardized vendor profile with:
- Company info, industry, size
- Contract history
- Risk indicators
```

### Configuration Fields

| Field             | Type                            | Default                        | Description                        |
| ----------------- | ------------------------------- | ------------------------------ | ---------------------------------- |
| `displayName`     | string                          | directory name                 | Human-readable name                |
| `description`     | string                          | displayName                    | Short description                  |
| `tools`           | string[]                        | `[shell_exec, load_knowledge]` | Tools this agent can use           |
| `maxDepth`        | number (1-4)                    | `1`                            | Dispatch depth (1 = no sub-agents) |
| `maxToolCalls`    | number (1-100)                  | `10`                           | Max tool calls per execution       |
| `timeout`         | number (5-600)                  | `20`                           | Timeout in seconds                 |
| `targetOutputMin` | number (50-2000)                | `200`                          | Min output tokens                  |
| `targetOutputMax` | number (50-2000)                | `400`                          | Max output tokens                  |
| `modelTier`       | `simple \| default \| advanced` | —                              | Model selection tier               |

#### Model Tiers

| Tier       | Use Case                                           |
| ---------- | -------------------------------------------------- |
| `simple`   | Data gathering, API queries, structured extraction |
| `default`  | Standard reasoning                                 |
| `advanced` | Complex analysis, multi-step reasoning             |

#### Available Tools

| Tool             | Description                                   |
| ---------------- | --------------------------------------------- |
| `shell_exec`     | Run scripts and commands                      |
| `load_knowledge` | Pull KB documents into context                |
| `dispatch`       | Delegate to sub-task agents (increases depth) |

### How Subagents Are Dispatched

The primary agent uses the `dispatch` tool to invoke subagents by name:

```
Primary Agent: "I need to check compliance for these transactions"

dispatch({
  agent: "compliance-checker",
  query: "Check SOX compliance for transactions TXN-001 through TXN-050"
})

→ Compliance checker agent runs in isolated context
→ Returns 200-500 token summary
→ Primary agent continues reasoning with the result
```

Each subagent gets its own context window. Results are returned as clean summaries, keeping the primary agent's context focused on reasoning.

### Context Isolation

```
Without subagents:                With subagents:
[System prompt: 2K]               [System prompt: 2K]
[User: "Check compliance"]        [User: "Check compliance"]
[Compliance rules: 8K] ← stuck    [Subagent result: 300 tokens]
[Raw transactions: 5K] ← stuck    [Subagent result: 250 tokens]

Context: 20K+ and growing         Context: 3K — clean, focused
```

### Backward Compatibility

Flat files `agents/main.md` and `agents/explore.md` still work as override files. If both a flat file and a subdirectory exist, the subdirectory takes precedence.

## Automations

Automations run your agent on a schedule or in response to webhooks. Define them in `automations/` as JSON (or Markdown).

### JSON Format (recommended)

```json
{
  "title": "Daily Revenue Digest",
  "schedule": "0 9 * * 1-5",
  "prompt": "Pull yesterday's revenue data and summarize by region. Highlight any anomalies compared to the weekly baseline."
}
```

| Field      | Type                              | Description                                            |
| ---------- | --------------------------------- | ------------------------------------------------------ |
| `title`    | string                            | Display name                                           |
| `prompt`   | string                            | The message sent to the agent when the automation runs |
| `schedule` | string                            | Cron expression (triggers `cron` mode)                 |
| `trigger`  | `"cron" \| "webhook" \| "manual"` | Trigger type (auto-inferred from `schedule`)           |

### Trigger Types

| Type        | How It Runs                                                 |
| ----------- | ----------------------------------------------------------- |
| **cron**    | On a schedule — inferred when `schedule` is present         |
| **webhook** | In response to an HTTP POST to the automation's webhook URL |
| **manual**  | On-demand via CLI or API                                    |

### Markdown Format (legacy)

```markdown
# Automation: Morning Brief

Schedule: 0 7 * * *

## Check

Pull all active deals and recent activities from the CRM.
Summarize wins, losses, and pipeline changes.
```

### How Runs Work

Each run is **stateless** — it queries systems fresh with `since=lastRunTimestamp`. The agent gets `lastRunSummary` for continuity.

1. Scheduler or webhook triggers the automation
2. Fresh session created with the automation's prompt
3. Agent runs the explore-plan-execute loop
4. Results routed to the configured output channel

### Managing Automations

```bash
amodal automations list
amodal automations pause <name>
amodal automations trigger <name>   # manual trigger
```

### Guardrails

* Automations **cannot write by default**
* Each run is stateless — no accumulated side effects
* Output is routed to channels, not acted on

## amodal.json

The root configuration file at the repo root. Defines the agent's identity, model configuration, and platform integration.
### Full Schema ```json { "name": "ops-agent", "version": "1.0.0", "description": "Infrastructure monitoring and incident response", "models": { "main": { "provider": "anthropic", "model": "claude-sonnet-4-20250514", "fallback": { "provider": "openai", "model": "gpt-4o" } }, "explore": { "provider": "anthropic", "model": "claude-haiku-4-5-20251001" } }, "userContext": "You are an operations agent for Acme Corp...", "platform": { "projectId": "proj_abc123", "apiKey": "env:PLATFORM_API_KEY" }, "sandbox": { "shellExec": false, "template": "daytona-template-id", "maxTimeout": 30000 }, "stores": { "dataDir": ".amodal/store-data", "backend": "pglite", "postgresUrl": "env:DATABASE_URL" }, "proactive": { "webhook": "https://api.example.com/webhook" }, "mcp": { "servers": { "github": { "transport": "stdio", "command": "uvx", "args": ["mcp-server-github"], "env": { "GITHUB_TOKEN": "env:GITHUB_TOKEN" } } } } } ``` ### Fields #### Required | Field | Type | Description | | ------------- | ----------- | ------------------------------ | | `name` | string | Agent name (min 1 char) | | `version` | string | Semantic version | | `models.main` | ModelConfig | Primary agent model (required) | #### Optional | Field | Type | Description | | ---------------- | ----------- | ---------------------------------------------------- | | `description` | string | Agent description | | `userContext` | string | Injected at the top of every session prompt | | `models.explore` | ModelConfig | Model for explore/gather sub-agents (cheaper/faster) | | `platform` | object | Platform API integration | | `sandbox` | object | Sandbox execution config (Daytona) | | `stores` | object | Data store backend config | | `proactive` | object | Webhook for external triggers | | `mcp` | object | MCP server connections | #### ModelConfig ```typescript { "provider": "anthropic" | "openai" | "google" | "bedrock" | "azure", "model": "claude-sonnet-4-20250514", "region": "us-east-1", // optional, for Bedrock/Azure 
"baseUrl": "https://...", // optional, custom endpoint "credentials": { // optional, explicit keys "api_key": "env:ANTHROPIC_API_KEY" }, "fallback": { ... } // optional, another ModelConfig } ``` #### Environment Variable References Any string value can reference an environment variable with the `env:` prefix: ```json { "platform": { "apiKey": "env:PLATFORM_API_KEY" } } ``` Resolved at parse time. Missing env vars throw `ENV_NOT_SET` error. ## Connections Each connection is a directory in `connections/` containing a spec, access rules, and optional documentation. Pre-built plugins come with everything maintained — you just provide credentials. ```bash amodal connect slack # install a plugin amodal sync --from # sync from OpenAPI/GraphQL ``` ### Directory Structure ``` connections/my-api/ ├── spec.json ← API source, auth, entities, sync config ├── access.json ← field restrictions, action tiers, row scoping ├── surface.md ← (optional) endpoint documentation ├── entities.md ← (optional) entity definitions └── rules.md ← (optional) business rules ``` ### spec.json ```json { "source": "My API", "baseUrl": "https://api.example.com", "format": "openapi", "auth": { "type": "bearer", "token": "env:MY_API_TOKEN", "header": "Authorization", "prefix": "Bearer " }, "sync": { "auto": true, "frequency": "on_push", "notify_drift": true }, "filter": { "tags": ["public"], "include_paths": ["/api/v2/**"], "exclude_paths": ["/api/v2/internal/**"] } } ``` | Field | Description | | ----------- | ---------------------------------------------------------- | | `source` | API name/label | | `baseUrl` | API base URL | | `format` | `"openapi"`, `"graphql"`, or `"grpc"` | | `auth.type` | `"bearer"`, `"api_key"`, `"oauth2"`, `"basic"`, `"header"` | | `sync` | Auto-sync settings and drift notification | | `filter` | Include/exclude endpoints by tag or path glob | ### access.json Controls what the agent can see and do: ```json { "endpoints": { "GET /api/deals/{id}": { "returns": ["Deal"], 
"confirm": false }, "POST /api/deals": { "returns": ["Deal"], "confirm": true, "reason": "Creates a new deal", "thresholds": [ { "field": "body.amount", "above": 10000, "escalate": "review" } ] }, "DELETE /api/deals/{id}": { "returns": ["Deal"], "confirm": "never", "reason": "Deletion not allowed via agent" } }, "fieldRestrictions": [ { "entity": "Contact", "field": "ssn", "policy": "never_retrieve", "sensitivity": "pii_identifier", "reason": "PII — never exposed" }, { "entity": "Contact", "field": "email", "policy": "retrieve_but_redact", "sensitivity": "pii_name" }, { "entity": "Deal", "field": "internal_notes", "policy": "role_gated", "sensitivity": "internal", "allowedRoles": ["supervisor"] } ], "rowScoping": { "Deal": { "owner_id": { "type": "field_match", "userContextField": "userId", "label": "your deals" } } }, "delegations": { "enabled": true, "maxDurationDays": 7, "escalateConfirm": true }, "alternativeLookups": [ { "restrictedField": "Contact.ssn", "alternativeEndpoint": "GET /api/contacts/{id}/verification-status", "description": "Use verification status instead of raw SSN" } ] } ``` #### Action Tiers | Tier | Behavior | | -------------------- | -------------------------------------- | | `false` / omitted | Allow without confirmation | | `true` / `"confirm"` | Ask user for approval before executing | | `"review"` | Show the full plan before executing | | `"never"` | Block the operation entirely | #### Field Restriction Policies | Policy | Effect | | --------------------- | ----------------------------------------------------- | | `never_retrieve` | Field completely removed from API responses | | `retrieve_but_redact` | Kept in data, replaced with `[REDACTED]` in output | | `role_gated` | Removed if user lacks `allowedRoles`, else redactable | #### Threshold Escalation Endpoints can escalate their confirmation tier based on request parameters: ```json { "field": "body.amount", "above": 10000, "escalate": "review" } ``` If `body.amount > 10000`, the tier 
escalates from `confirm` to `review`. ### Drift Detection ```bash amodal sync --check # report drift without updating (CI-friendly) amodal sync # update local specs from remote ``` ### Available Plugins See [Plugins](/guide/plugins) for 20+ pre-built connections. ## Evals Evals live in `evals/` as Markdown files. Each eval defines a query, setup context, and assertions that measure agent quality. ### Eval File Format ```markdown # Eval: Revenue Drop Investigation Tests the agent's ability to investigate a revenue anomaly. ## Setup Tenant: test-tenant Context: Revenue dropped 30% yesterday compared to the weekly average. ## Query "Revenue was down 30% yesterday. What happened?" ## Assertions - Should query Stripe charges for the relevant time period - Should compare against baseline or previous period - Should check for known issues (billing cycle, deployments, timezone effects) - Should provide specific numbers (dollar amounts, percentages) - Should NOT fabricate data or guess without querying - Should NOT blame external factors without evidence ``` ### Parsed Fields | Field | Source | Description | | --------------- | -------------------------------------- | ------------------------ | | `name` | Filename without `.md` | Eval identifier | | `title` | `# Eval: Title` heading | Display name | | `description` | Text between heading and first `##` | What the eval tests | | `setup.tenant` | `Tenant:` line in `## Setup` | Tenant to use | | `setup.context` | `Context:` line in `## Setup` | Background context | | `query` | Content of `## Query` (without quotes) | The user message to test | | `assertions` | `## Assertions` list items | Quality criteria | ### Assertions Lines starting with `- Should` are **positive assertions** — things the agent must do. Lines starting with `- Should NOT` are **negated assertions** — things the agent must avoid. 
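The positive/negated split described above can be sketched as a small parser. This is a hypothetical helper for illustration, not part of the SDK; it assumes the `## Assertions` list items have already been extracted as raw lines:

```typescript
interface Assertion {
  text: string
  negated: boolean // true for "- Should NOT ..." lines
}

// Split "## Assertions" list items into positive and negated assertions.
function parseAssertions(section: string): Assertion[] {
  return section
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith('- Should'))
    .map((line) => ({
      text: line.slice(2).trim(),
      negated: /^- Should NOT\b/.test(line),
    }))
}
```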
### Running Evals ```bash amodal eval # run all evals amodal eval --file revenue-drop.md # run one eval amodal eval --providers anthropic,openai # compare providers ``` ### Evaluation Methods | Method | Description | | ----------------- | ------------------------------------------------------------ | | **LLM Judge** | A separate LLM evaluates the response against each assertion | | **Tool usage** | Verify expected tools were called | | **Cost tracking** | Token usage and cost per eval case | ### Experiments Compare configurations side-by-side: ```bash amodal experiment ``` Test different models, skills, knowledge docs, or prompts. Results include quality scores, costs, and latency. ### Multi-Model Comparison ```bash amodal eval --providers anthropic,openai,google ``` Runs the same suite against each provider for cost/quality/latency comparison. ### Platform Integration Results can be sent to the platform API for trend tracking, baseline comparison, and run history. ## Knowledge Base Knowledge documents live in `knowledge/` as Markdown files. They teach the agent about your domain — environment, baselines, procedures, patterns. ``` knowledge/ ├── environment.md ├── baselines.md ├── team.md └── response-procedures.md ``` ### Document Format ```markdown # Knowledge: Normal Traffic Patterns - Weekday: 2,000-4,000 RPS (peak 12-2 PM EST) - Weekend: 800-1,500 RPS - Error rate: < 0.05% on api-gateway - Deployment windows: 10-11 AM, 3-4 PM (brief spikes expected) - Black Friday: 15,000-25,000 RPS (sustained 12 hours) ``` The first `# Heading` becomes the title. The filename (without `.md`) becomes the document name/ID. Everything after the heading is the body. 
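The parsing convention above (first `#` heading becomes the title, filename becomes the ID, the rest becomes the body) amounts to a few lines of code. This is an illustrative sketch, not the runtime's actual parser:

```typescript
interface KnowledgeDoc {
  name: string  // filename without .md
  title: string // first "# Heading"
  body: string  // everything after the heading
}

// Parse a knowledge document per the convention described above.
function parseKnowledgeDoc(filename: string, content: string): KnowledgeDoc {
  const lines = content.split('\n')
  const headingIdx = lines.findIndex((l) => l.startsWith('# '))
  return {
    name: filename.replace(/\.md$/, ''),
    title: headingIdx >= 0 ? lines[headingIdx].slice(2).trim() : '',
    body: lines.slice(headingIdx + 1).join('\n').trim(),
  }
}
```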
### Categories Knowledge documents cover these categories: | Category | What | Typical Source | | ------------------------ | ---------------------------------------- | -------------------------------- | | **system\_docs** | API endpoints, auth, response formats | Auto from connections | | **methodology** | What the data means, how to interpret it | Author writes | | **patterns** | Known patterns worth detecting | Author seeds, agent discovers | | **false\_positives** | Known benign anomalies | Agent discovers, author approves | | **response\_procedures** | SOPs, escalation paths | Author writes | | **environment** | Infrastructure layout, inventory | Author writes | | **baselines** | What "normal" looks like | Author seeds, agent refines | | **team** | Contacts, preferences, escalation paths | Author maintains | | **incident\_history** | Past sessions, resolutions | Agent proposes | | **working\_memory** | Agent's persistent context | Agent maintains | ### On-Demand Loading The agent starts each session with a **compact KB index** (\~400 tokens for \~20 docs). Full documents are loaded via `load_knowledge` only when needed. Task agents load their own docs independently. ### Learning Flywheel The agent proposes knowledge updates via `propose_knowledge`: 1. Agent discovers something the KB doesn't know 2. Proposes an update (new doc or edit) 3. Goes through approval policy (manual review or auto-approve per category) 4. Next session starts smarter Configure approval policies in the Admin UI. ## MCP Servers Amodal supports the [Model Context Protocol](https://modelcontextprotocol.io) for connecting external tool servers. MCP tools appear alongside your custom tools and platform tools — the agent uses them transparently. 
### Configuration Define MCP servers in `amodal.json`: ```json { "mcp": { "servers": { "github": { "transport": "stdio", "command": "uvx", "args": ["mcp-server-github"], "env": { "GITHUB_TOKEN": "env:GITHUB_TOKEN" } }, "filesystem": { "transport": "sse", "url": "http://localhost:8001" }, "docs": { "transport": "http", "url": "https://docs.example.com/api/mcp", "trust": true } } } } ``` ### Transports | Transport | Config | Use Case | | --------- | ------------------------ | ----------------------------- | | `stdio` | `command`, `args`, `env` | Local tools via child process | | `sse` | `url` | Server-Sent Events over HTTP | | `http` | `url`, `trust` | Streamable HTTP transport | ### Tool Discovery MCP tools are automatically discovered when servers connect. Tools are namespaced with the server name to avoid collisions: ``` github__create_issue github__list_repos filesystem__read_file ``` The separator is `__` (double underscore). ### CLI ```bash # MCP servers start automatically with amodal dev amodal dev # The agent sees MCP tools alongside custom and platform tools amodal inspect # shows all discovered tools including MCP ``` ### Behavior * **Non-fatal startup** — if an MCP server fails to connect, other servers and the agent still work * **Auto-discovery** — tools are discovered after connection, no manual registration needed * **Graceful shutdown** — servers are cleaned up when the runtime stops * **Environment variables** — use `env:VAR_NAME` pattern for credentials ### Example: GitHub MCP ```json { "mcp": { "servers": { "github": { "transport": "stdio", "command": "uvx", "args": ["mcp-server-github"], "env": { "GITHUB_TOKEN": "env:GITHUB_TOKEN" } } } } } ``` This gives the agent access to GitHub tools (create issues, list PRs, read files, etc.) via the GitHub MCP server. ## Pages Pages are **React components** that render custom UI views in the runtime app. They live in `pages/` and use SDK hooks to read from stores, invoke skills, and display structured data. 
This is how you build dashboards, morning briefs, investigation views, and other composed screens. ``` pages/ ├── ops-dashboard.tsx ├── morning-brief.tsx └── deal-detail.tsx ``` ### Page Format Each page file exports a `page` config object and a default React component: ```tsx import { useStoreList } from '@amodal/react' export const page = { name: 'ops-dashboard', icon: 'shield', description: 'Facility overview — zones, alerts, and device status', } export default function OpsDashboard() { const { data: alerts } = useStoreList('classified-alerts', { sort: { field: 'timestamp', order: 'desc' }, limit: 20, }) const { data: zones } = useStoreList('zone-status') return (

    <div>
      <h1>Operations Dashboard</h1>
      {/* illustrative markup — render `alerts` and `zones` here */}
    </div>

) } ``` ### Page Config | Field | Type | Default | Description | | ------------- | ------------------------ | -------- | --------------------------------------------------------------- | | `name` | string | filename | Page identifier and URL slug | | `icon` | string | — | Lucide icon name (e.g., `'shield'`, `'monitor'`, `'bar-chart'`) | | `description` | string | — | Shown in sidebar tooltip | | `context` | `Record<string, string>` | — | Route params (e.g., `{ dealId: 'string' }`) | | `hidden` | boolean | `false` | If true, excluded from sidebar | ### SDK Hooks Pages use hooks from `@amodal/react` to access agent data: | Hook | Description | | -------------------------------- | ---------------------------------------------------------------- | | `useStoreList(store, options)` | Fetch multiple documents with filtering, sorting, and pagination | | `useStore(store, key)` | Fetch a single document by key | | `useSkillAction(skill, options)` | Invoke a skill from the page | #### useStoreList ```tsx const { data, loading, error, refresh } = useStoreList('classified-alerts', { filter: { severity: 'P1' }, sort: { field: 'timestamp', order: 'desc' }, limit: 50, refreshInterval: 10000, // auto-refresh every 10s }) ``` #### useStore ```tsx const { data: alert } = useStore('classified-alerts', alertId) ``` #### useSkillAction ```tsx const { invoke, loading } = useSkillAction('triage') const handleTriage = () => { invoke({ query: 'Triage the latest alerts' }) } ``` ### Sidebar Integration Pages appear in the runtime app sidebar under a **Pages** section. The name is auto-formatted from the filename (`ops-dashboard` → "Ops Dashboard"). Click a page to navigate to `/pages/{pageName}`. Hidden pages (`hidden: true`) are excluded from the sidebar but still accessible via direct URL — useful for detail pages navigated to from other pages.
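The sidebar name formatting (`ops-dashboard` → "Ops Dashboard") amounts to splitting on hyphens and capitalizing each word. A hypothetical sketch of that transformation:

```typescript
// Format a page slug like "ops-dashboard" into a sidebar label.
function formatPageName(slug: string): string {
  return slug
    .split('-')
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join(' ')
}
```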
### Context Pages Pages with `context` params are detail views that receive route parameters: ```tsx export const page = { name: 'deal-detail', icon: 'file-text', context: { dealId: 'string' }, hidden: true, // navigated to, not shown in sidebar } export default function DealDetail({ dealId }: { dealId: string }) { const { data: deal } = useStore('deals', dealId) // ... } ``` ### Entity Pages vs. Composed Pages * **Entity pages** are auto-generated from store definitions — list and detail views come for free without writing page files * **Composed pages** are what you define in `pages/` — custom views that combine data from multiple stores with custom layout and logic Only composed pages need explicit page files. If you just need a list/detail view of a single store, the runtime generates that automatically. ### Hot Reload During `amodal dev`, changes to page files trigger hot module replacement (HMR) via the Vite plugin. Edit a page and see changes instantly in the browser. ### Example: Surveillance Dashboard ```tsx import { useStoreList } from '@amodal/react' export const page = { name: 'ops-dashboard', icon: 'shield', description: 'Facility surveillance overview', } export default function OpsDashboard() { const { data: alerts } = useStoreList('classified-alerts', { sort: { field: 'confidence', order: 'desc' }, limit: 10, }) const { data: zones } = useStoreList('zone-status') const { data: devices } = useStoreList('device-profiles') return (

    <div>
      <h2>Active Alerts</h2>
      {alerts?.map((a) => (
        // hypothetical components — the original markup was not preserved
        <AlertCard key={a.event_id} alert={a} />
      ))}
      <h2>Zone Status</h2>
      {zones?.map((z) => (
        <ZoneCard key={z.zone} zone={z} />
      ))}
    </div>
) } ``` ## Plugins Plugins are pre-built connection packages for popular APIs. Each plugin includes API documentation, auth configuration, entity definitions, and access rules — maintained by Amodal so you just provide credentials. ### Install ```bash amodal connect ``` ### Available Plugins #### Communication | Plugin | Description | | ------------ | ------------------------------------ | | **Slack** | Channels, messages, users, reactions | | **SendGrid** | Email sending, templates, contacts | | **Twilio** | SMS, voice, messaging | | **Intercom** | Conversations, contacts, articles | #### Developer Tools | Plugin | Description | | -------------- | ------------------------------------ | | **GitHub** | Repos, issues, PRs, actions, commits | | **Jira** | Issues, projects, sprints, boards | | **Linear** | Issues, projects, cycles, teams | | **Confluence** | Pages, spaces, search | | **Notion** | Pages, databases, blocks | #### Monitoring & Ops | Plugin | Description | | ------------------ | ------------------------------------- | | **Datadog** | Monitors, events, metrics, logs | | **PagerDuty** | Incidents, services, oncall schedules | | **OpsGenie** | Alerts, teams, escalations | | **AWS CloudWatch** | Metrics, alarms, logs, dashboards | #### Commerce & Finance | Plugin | Description | | -------------- | -------------------------------------------- | | **Stripe** | Payments, customers, subscriptions, invoices | | **Shopify** | Orders, products, customers, inventory | | **QuickBooks** | Accounts, invoices, payments, reports | #### CRM & Sales | Plugin | Description | | -------------- | ---------------------------------------- | | **Salesforce** | Contacts, opportunities, cases, accounts | | **HubSpot** | Contacts, deals, tickets, companies | #### Support | Plugin | Description | | ----------------- | ------------------------------------- | | **Zendesk** | Tickets, users, organizations, search | | **Google Sheets** | Read/write spreadsheet data | ### Plugin Structure 
Each plugin is a package containing: ``` plugin-name/ ├── package.json ← name, description, icon, category ├── spec.json ← API config: source, auth, entities, sync └── access.json ← field restrictions, action tiers, scoping ``` ### Custom Connections For APIs not in the plugin list, create custom connections with OpenAPI or GraphQL specs. See [Connections](/guide/connections). ## Providers Amodal supports multiple LLM providers with a unified interface. Switch providers by changing an environment variable — no code changes needed. ### Supported Providers | Provider | Models | Auth | | ---------------- | -------------------------- | --------------------------------- | | **Anthropic** | Claude Opus, Sonnet, Haiku | `ANTHROPIC_API_KEY` | | **OpenAI** | GPT-4o, GPT-4, GPT-3.5 | `OPENAI_API_KEY` | | **Google** | Gemini Pro, Flash | `GOOGLE_API_KEY` | | **AWS Bedrock** | Claude, Titan, Llama | AWS credentials | | **Azure OpenAI** | GPT-4o, GPT-4 | `AZURE_OPENAI_API_KEY` + endpoint | ### Configuration #### Auto-detection Set the relevant environment variable and Amodal auto-detects the provider: ```bash export ANTHROPIC_API_KEY=sk-ant-... amodal dev # uses Anthropic automatically ``` #### Explicit config Specify in `amodal.json`: ```json { "provider": "anthropic", "model": "claude-sonnet-4-20250514" } ``` ### Failover The `FailoverProvider` cascades between providers with retry logic and linear backoff: ```json { "provider": "failover", "providers": ["anthropic", "openai"], "retries": 2, "backoffMs": 1000 } ``` If the primary provider fails, the runtime automatically tries the next one. ### Streaming All providers support SSE streaming via `chatStream()`. The streaming interface is unified — your client code works identically regardless of which provider is active. 
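The failover behavior described above, trying each provider in turn with retries and linear backoff, might look roughly like the following. This is a sketch of the cascade logic only; the real `FailoverProvider` API, names, and signatures may differ:

```typescript
type ChatFn = (prompt: string) => Promise<string>

// Try each provider in order; retry each up to `retries` extra times,
// waiting backoffMs, then 2 * backoffMs, ... between attempts (linear backoff).
async function chatWithFailover(
  providers: ChatFn[],
  prompt: string,
  retries = 2,
  backoffMs = 1000,
): Promise<string> {
  let lastError: unknown
  for (const provider of providers) {
    for (let attempt = 0; attempt <= retries; attempt++) {
      try {
        return await provider(prompt)
      } catch (err) {
        lastError = err
        await new Promise((r) => setTimeout(r, backoffMs * (attempt + 1)))
      }
    }
  }
  throw lastError
}
```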
### Multi-Model Comparison Use `amodal eval` or `amodal experiment` to compare providers: ```bash amodal eval --providers anthropic,openai,google ``` This runs the same eval suite against each provider and reports quality, latency, and cost differences. ## Security & Guardrails Security is built into the runtime at every layer. Six components work together to protect sensitive data and control agent behavior. ### Security Pipeline ``` API Response → Field Scrubber (strip restricted fields) → Agent processes data → Output Guard (4-stage filtering before user sees output) 1. Field Redaction (replace scrubbed values with [REDACTED]) 2. Pattern Scanner (detect SSN, credit cards, bank accounts) 3. Leak Detector (compare output against tracked scrubbed values) 4. Scope Checker (flag unqualified aggregate claims) → Action Gate (confirm/review/block write operations) ``` ### Field Scrubber Strips restricted fields from API responses before the LLM sees them. Configured per connection in `access.json`: | Policy | Effect | | --------------------- | ----------------------------------------------------- | | `never_retrieve` | Field completely removed from response | | `retrieve_but_redact` | Kept in data, replaced with `[REDACTED]` in output | | `role_gated` | Removed if user lacks `allowedRoles`, else redactable | ### Output Guard Four-stage filter on every agent response: **1. Field Redaction** — replaces `retrieve_but_redact` and denied `role_gated` values with `[REDACTED]`. **2. Pattern Scanner** — regex detection of: * SSN (`XXX-XX-XXXX`) * Credit cards (13-19 digits, Luhn-validated) * Bank accounts (8-17 digits near keywords like "account", "routing") **3. Leak Detector** — compares agent output against all scrubbed values tracked in the session. `pii_identifier` values are always flagged. `pii_name` values only if entity context is nearby. **4. Scope Checker** — detects unqualified aggregate claims ("all devices", "every contact") about scoped entities. 
Flags when the agent says "all X" but only has access to a subset. Critical findings block output entirely. ### Action Gate Controls write operation confirmations based on `access.json`: | Tier | Behavior | | --------- | ------------------------------- | | `allow` | Execute without confirmation | | `confirm` | Ask user for approval | | `review` | Show full plan before executing | | `never` | Block entirely | **Threshold escalation**: endpoint tiers can escalate based on request parameters: ```json { "field": "body.amount", "above": 10000, "escalate": "review" } ``` **Delegation escalation**: if `isDelegated` (sub-agent acting on behalf), `confirm` escalates to `review`. ### Role-Based Access Control Roles filter tools and skills **at the SDK layer** — the LLM never sees tools it doesn't have access to: ```json { "name": "analyst", "tools": ["request", "load_knowledge", "dispatch"], "skills": ["triage", "deep-dive"], "automations": { "can_view": true, "can_create": false }, "constraints": {} } ``` Use `"*"` as wildcard to allow all tools/skills. ### Plan Mode When active, all writes are blocked until the user approves a plan: 1. Agent enters plan mode (triggered by security rules or manually) 2. Agent proposes a plan 3. User approves → plan is injected into context, writes re-enabled 4. 
Agent executes the approved plan ### Session Limits | Limit | Default | Description | | -------------- | ------------ | ---------------------------- | | Max turns | 15 | Prevents infinite loops | | Timeout | configurable | Hard time limit | | Loop detection | always on | Pattern matching + LLM-based | ### Audit Logging Every action is logged with immutable hash chains: | Event | Logged | | ------------------------------- | ----------------------------- | | `tool_call` | Tool name, params, duration | | `write_op` | Write operations specifically | | `session_start` / `session_end` | Session lifecycle | | `version_load` | Config version loaded | | `kb_proposal` | Knowledge update proposals | Three sinks: Console (dev), File (JSON), Remote (batch POST to platform API). Each entry includes a SHA-256 hash of the previous entry, creating a tamper-evident chain. ## Skills Skills are reasoning frameworks defined as Markdown. Each skill is a directory in `skills/` containing a `SKILL.md` file. ``` skills/ ├── triage/ │ └── SKILL.md ├── deep-dive/ │ └── SKILL.md └── incident-response/ └── SKILL.md ``` ### SKILL.md Format Two formats are supported: #### Heading-based (recommended) ```markdown # Skill: Incident Response Gather context, assess impact, and coordinate response for active incidents. Trigger: When the user reports an outage, service degradation, or active incident. ## Behavior 1. Identify the affected service and symptoms 2. Assess blast radius — which systems and users are impacted? 3. Gather context: - Recent deployments or config changes - Monitoring metrics (error rates, latency) - Dependent service health 4. Correlate: what changed right before the incident? 5. 
Recommend immediate mitigation and investigation path ## Constraints - Do not restart services without explicit user confirmation - Do not dismiss alerts as false positives without evidence ``` #### Frontmatter-based ```markdown --- name: incident-response description: Gather context, assess impact, coordinate response trigger: When the user reports an outage or active incident --- ## Methodology ...body content... ``` ### Parsed Fields | Field | Source | Description | | ------------- | -------------------------------------------- | ------------------- | | `name` | `# Skill: Name` or frontmatter `name` | Skill identifier | | `description` | First paragraph or frontmatter `description` | What the skill does | | `trigger` | `Trigger:` line or frontmatter `trigger` | When to activate | | `body` | Everything after name/description | The methodology | ### Skill Activation The agent sees all installed skill names and triggers. It activates the most relevant skill based on the user's question. Skills chain naturally — the agent transitions between frameworks as findings evolve. ### Best Practices * **Be specific about reasoning steps.** Not "investigate the issue" but "query deployment logs for changes within 30 minutes of the anomaly." * **Include decision points.** "If no deployments found, check for scaling events." * **Specify dispatching.** "Dispatch a task agent to query Datadog" uses context isolation. * **Define when to stop.** "If confidence is below 60%, report as inconclusive." ## Stores Stores give your agent persistent, typed data storage. Define a schema in `stores/` and the runtime auto-generates CRUD tools the agent can use. 
### Store Definition Each store is a JSON file in `stores/`: ```json { "name": "active-alerts", "entity": { "name": "ClassifiedAlert", "key": "{event_id}", "schema": { "event_id": { "type": "string" }, "title": { "type": "string" }, "severity": { "type": "enum", "values": ["P1", "P2", "P3", "P4"] }, "confidence": { "type": "number", "min": 0, "max": 1 }, "timestamp": { "type": "datetime" }, "metadata": { "type": "object", "fields": { "category": { "type": "string" }, "tags": { "type": "array", "item": { "type": "string" } } } }, "relatedAlert": { "type": "ref", "store": "active-alerts" }, "notes": { "type": "string", "nullable": true } } }, "ttl": 86400, "ttl_conditional": { "default": 86400, "override": [ { "condition": "severity IN ['P1', 'P2']", "ttl": 300 } ] }, "failure": { "mode": "partial", "retries": 3, "backoff": "exponential", "deadLetter": true }, "history": { "versions": 3 }, "trace": true } ``` ### Field Types | Type | Description | Extra Fields | | ---------- | -------------------------- | ----------------------------------------- | | `string` | Text | — | | `number` | Numeric | `min`, `max` | | `boolean` | True/false | — | | `datetime` | ISO 8601 timestamp | — | | `enum` | One of predefined values | `values: string[]` | | `array` | List of items | `item: FieldDefinition` | | `object` | Nested structure | `fields: Record<string, FieldDefinition>` | | `ref` | Reference to another store | `store: string` | Any field can set `nullable: true`.
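The field-type table maps naturally onto a discriminated union. The following is a hypothetical TypeScript model of the schema for illustration, not the SDK's actual types:

```typescript
// Hypothetical model of a store field definition, mirroring the table above.
type FieldDefinition =
  | { type: 'string'; nullable?: boolean }
  | { type: 'number'; min?: number; max?: number; nullable?: boolean }
  | { type: 'boolean'; nullable?: boolean }
  | { type: 'datetime'; nullable?: boolean }
  | { type: 'enum'; values: string[]; nullable?: boolean }
  | { type: 'array'; item: FieldDefinition; nullable?: boolean }
  | { type: 'object'; fields: Record<string, FieldDefinition>; nullable?: boolean }
  | { type: 'ref'; store: string; nullable?: boolean }

// Example: the severity field from the store definition above.
const severity: FieldDefinition = {
  type: 'enum',
  values: ['P1', 'P2', 'P3', 'P4'],
}
```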
### TTL Simple TTL (seconds): ```json { "ttl": 86400 } ``` Conditional TTL: ```json { "ttl_conditional": { "default": 86400, "override": [ { "condition": "severity IN ['P1', 'P2']", "ttl": 300 } ] } } ``` ### Failure Handling | Mode | Behavior | | ---------------- | -------------------------------------- | | `partial` | Continue on individual write failures | | `all-or-nothing` | Rollback entire batch on first failure | | `skip` | Skip failed writes silently | ### Auto-Generated Tools Store names (kebab-case) become tool names (snake\_case with `store_` prefix): * `active-alerts` → `store_active_alerts` * `deal-health` → `store_deal_health` The agent uses these tools to get, put, list, delete, and query history on store documents. ### Storage Backend Configured in `amodal.json`: ```json { "stores": { "backend": "pglite", "dataDir": ".amodal/store-data" } } ``` | Backend | Description | | ---------- | ----------------------------------------------------- | | `pglite` | Embedded Postgres, in-memory or file-backed (default) | | `postgres` | PostgreSQL via `postgresUrl` | ### History & Tracing * `"history": { "versions": 3 }` — retain 3 previous versions of each document * `"trace": true` — store reasoning traces alongside documents ## Tools Custom tools live in `tools/` as directories with a handler file and optional metadata. ``` tools/ └── create_ticket/ ├── tool.json ← (optional) metadata, parameters, confirmation ├── handler.ts ← handler code ├── package.json ← (optional) npm dependencies └── requirements.txt ← (optional) Python dependencies ``` ### Platform Tools (built-in) These are always available — you don't define them: | Tool | What It Does | | ---------------------- | ------------------------------------------------------------------------------------------- | | **request** | HTTP calls to connected systems with automatic auth. Declares `intent: 'read' \| 'write'`. | | **shell\_exec** | Run scripts (sandboxed in hosted environments via Daytona).
| | **load\_knowledge** | Pull KB documents into context on demand. | | **propose\_knowledge** | Propose knowledge base updates. | | **present** | Render widgets: entity-card, timeline, data-table, score-breakdown, metric, info-card, etc. | | **dispatch** | Delegate work to task agents with isolated context. | | **explore** | Query connected systems and gather context. | | **store\_\*** | Auto-generated from [store definitions](/guide/stores). | ### Custom Tool Definition #### Option A: tool.json + handler.ts **tool.json:** ```json { "name": "create_ticket", "description": "Create a Jira issue in the ops project", "parameters": { "type": "object", "properties": { "summary": { "type": "string" }, "priority": { "type": "string", "enum": ["P1", "P2", "P3", "P4"] } }, "required": ["summary"] }, "confirm": "review", "timeout": 30000, "env": ["JIRA_API_TOKEN"] } ``` | Field | Type | Default | Description | | ------------------ | -------------------------------------- | -------------- | ----------------------------------- | | `name` | string | directory name | Tool name (snake\_case) | | `description` | string | **required** | Shown to the LLM | | `parameters` | JSON Schema | `{}` | Input parameters | | `confirm` | `false \| true \| "review" \| "never"` | `false` | Confirmation tier | | `timeout` | number | `30000` | Timeout in ms | | `env` | string\[] | `[]` | Allowed env var names | | `responseShaping` | object | — | Transform response before returning | | `sandbox.language` | string | `"typescript"` | Handler language | **handler.ts:** ```typescript export default async (params, ctx) => { const result = await ctx.request('jira', '/rest/api/3/issue', { method: 'POST', data: { fields: { project: { key: 'OPS' }, summary: params.summary, priority: { name: params.priority }, issuetype: { name: 'Task' }, }, }, }) return { ticketId: result.key, url: result.self } } ``` #### Option B: defineToolHandler (single file) ```typescript import { defineToolHandler } from 
'@amodal/core' export default defineToolHandler({ description: 'Calculate weighted pipeline value', parameters: { type: 'object', properties: { deal_ids: { type: 'array', items: { type: 'string' } }, }, required: ['deal_ids'], }, confirm: 'review', timeout: 60000, env: ['STRIPE_API_KEY'], handler: async (params, ctx) => { const deals = await ctx.request('crm', '/deals', { params: { ids: params.deal_ids.join(',') }, }) return { total: deals.reduce((sum, d) => sum + d.amount, 0) } }, }) ``` ### Handler Context The `ctx` object available in every handler: | Method | Description | | --------------------------------------------- | -------------------------------- | | `ctx.request(connection, endpoint, options?)` | Make an authenticated API call | | `ctx.exec(command, options?)` | Run a shell command | | `ctx.env(name)` | Read an allowed env var | | `ctx.log(message)` | Log a message | | `ctx.user` | User info: `{ roles: string[] }` | | `ctx.signal` | AbortSignal for cancellation | ### Naming Convention Tool names must be **snake\_case**: lowercase letters, digits, and underscores, starting with a letter. Example: `create_ticket`, `fetch_deals`, `calculate_risk`. ## amodal chat Open an interactive terminal chat with your agent. The chat UI is a React-based TUI built with Ink. 
```bash
amodal chat
```

### Modes

#### Local mode (default)

Boots a local runtime server from your repo and connects to it:

```bash
amodal chat
```

#### Remote mode

Connect to an already-running server:

```bash
amodal chat --url http://localhost:3847
amodal chat --url https://my-agent.amodal.ai
```

#### Snapshot mode

Load from a snapshot file (built with `amodal build`):

```bash
amodal chat --config snapshot.json
```

### Options

| Flag                    | Description                   |
| ----------------------- | ----------------------------- |
| `--url <url>`           | Connect to remote server      |
| `--config <file>`       | Load from snapshot            |
| `--tenant-id <id>`      | Tenant identifier             |
| `--port <port>`         | Local server port             |
| `--resume <session-id>` | Resume a previous session     |
| `--fullscreen`          | Use alternate terminal buffer |

### Features

* **Streaming responses** — see the agent think in real-time
* **Tool call display** — watch tool invocations as they happen
* **Skill activation** — see which reasoning framework is active
* **Session resume** — pick up where you left off
* **Session browser** — navigate previous conversations
* **Markdown rendering** — formatted output in the terminal
* **Responsive layout** — adapts to terminal size

## amodal connect & sync

### connect

Add a connection to your agent. Connections give the agent API access and documentation for external systems.

```bash
# Install a pre-built plugin
amodal connect slack
amodal connect datadog
amodal connect github

# List available plugins
amodal search --type connection
```

This creates a directory in `connections/` with the plugin's spec and access configuration.

### sync

Sync API specifications from remote sources. Useful for custom APIs or keeping specs up to date.
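Conceptually, `sync` keeps a local copy of the remote API description, and the drift check reports where the two diverge. A minimal sketch of that comparison — the `diffSpecs` helper below is hypothetical, not the CLI's actual implementation:

```typescript
// Hypothetical sketch: report which top-level spec sections drifted
// between the locally stored copy and a freshly fetched remote spec.
type Spec = Record<string, unknown>

function diffSpecs(local: Spec, remote: Spec): string[] {
  const keys = new Set([...Object.keys(local), ...Object.keys(remote)])
  const drifted: string[] = []
  for (const key of keys) {
    // Compare each section structurally, not by reference.
    if (JSON.stringify(local[key]) !== JSON.stringify(remote[key])) {
      drifted.push(key)
    }
  }
  return drifted
}

// Example: the remote added an endpoint and bumped the version.
const local = { info: { version: '1.0' }, paths: { '/deals': {} } }
const remote = { info: { version: '1.1' }, paths: { '/deals': {}, '/alerts': {} } }
console.log(diffSpecs(local, remote)) // → ['info', 'paths']
```

A non-empty result is what a CI pipeline would treat as drift and turn into a failing check.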
```bash
# Sync from an OpenAPI spec
amodal sync --from https://api.example.com/openapi.json

# Sync from a GraphQL schema
amodal sync --from https://api.example.com/graphql

# Check for drift without updating (useful in CI)
amodal sync --check
```

#### Drift Detection

The `--check` flag compares your local specs against the remote source and reports differences without making changes. This is useful in CI pipelines to catch API changes early.

### Connection Structure

Each connection directory contains:

```
connections/slack/
├── spec.json    ← endpoints, auth config, entity list
├── access.json  ← field restrictions, action tiers, scoping rules
└── credentials  ← (gitignored) API keys, tokens
```

#### spec.json

Machine-readable configuration:

* **source** — API base URL and type
* **auth** — Authentication method (bearer, OAuth, API key)
* **entities** — Available API entities and endpoints
* **sync** — Sync configuration and filters

#### access.json

Security rules:

* **Field restrictions** — which fields are readable/writable
* **Action tiers** — confirm/review/never for different operations
* **Scoping rules** — tenant-level access control

### Available Plugins

See [Plugins](/guide/plugins) for the full list of 20+ pre-built connection plugins.

## amodal deploy

Deploy your agent to the Amodal platform or build for self-hosting.

### Platform Deployment

```bash
amodal login   # authenticate first
amodal link    # link to a platform project
amodal deploy  # deploy current config
```

#### Deployment Lifecycle

```bash
amodal status       # check deployment status
amodal deployments  # list all deployments
amodal rollback     # revert to previous version
amodal promote      # promote a snapshot to production
```

### Snapshot-Based Deployment

Amodal uses snapshot-based deployment. The `build` command captures the entire agent configuration into an immutable snapshot:

```bash
amodal build                         # create a snapshot
amodal serve --config snapshot.json  # serve from snapshot
```

### Docker

Build a Docker image for self-hosting:

```bash
amodal docker
```

### Secrets

Manage encrypted secrets for deployment:

```bash
amodal secrets set DATADOG_API_KEY=xxx
amodal secrets list
```

Secrets are encrypted at rest and only decrypted in the runtime's memory during a session.

### Hosted Runtime

The platform offers a hosted runtime with Daytona sandbox execution for isolated agent environments. This provides:

* Sandboxed `shell_exec` tool execution
* Session persistence
* Automatic scaling

## amodal dev

Start a local runtime server for development. The server watches your repo config files and hot-reloads on every change.

```bash
amodal dev
```

### Options

| Flag     | Default     | Description |
| -------- | ----------- | ----------- |
| `--port` | `3847`      | Server port |
| `--host` | `localhost` | Server host |

### What It Does

1. Loads your repo configuration from the project root
2. Starts an Express server with SSE streaming
3. Watches all config files with 300ms debounce
4. Manages sessions with TTL-based cleanup

### Endpoints

The dev server exposes:

| Method | Path               | Description                             |
| ------ | ------------------ | --------------------------------------- |
| `POST` | `/chat`            | Send a message, receive SSE stream      |
| `POST` | `/task`            | Start a background task                 |
| `GET`  | `/task/:id`        | Get task status                         |
| `GET`  | `/task/:id/stream` | Stream task output                      |
| `GET`  | `/inspect/context` | View compiled context with token counts |
| `GET`  | `/health`          | Health check                            |

### SSE Events

The chat endpoint returns a Server-Sent Events stream with these event types:

| Event                         | Description                    |
| ----------------------------- | ------------------------------ |
| `text`                        | Assistant text output          |
| `tool_call`                   | Tool invocation                |
| `ExploreStart` / `ExploreEnd` | System exploration phase       |
| `PlanMode`                    | Planning phase                 |
| `skill_activated`             | Skill activation               |
| `FieldScrub`                  | Sensitive field redaction      |
| `ConfirmationRequired`        | Write operation needs approval |
| `kb_proposal`                 | Knowledge base update proposal |

### Hot Reload

Edit any config file and the runtime picks up changes immediately:

* Connection specs, skills, knowledge, tools, automations
* Config changes (provider, model, etc.)
* No server restart needed

## amodal eval

Run evaluation suites against your agent to measure quality, compare models, and track regressions.
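One of the evaluation methods in this section, tool-usage verification, reduces to a set-containment check between the tools an eval case expects and the tools the agent actually called. A hypothetical sketch of that check (not the real eval runner):

```typescript
// Hypothetical sketch of the tool-usage check: did the agent call
// every tool the eval case lists under expected_tools?
function checkExpectedTools(
  expected: string[],
  called: string[],
): { pass: boolean; missing: string[] } {
  const calledSet = new Set(called)
  const missing = expected.filter((tool) => !calledSet.has(tool))
  return { pass: missing.length === 0, missing }
}

// Example: the transcript called `request` but never `load_knowledge`.
const result = checkExpectedTools(
  ['request', 'load_knowledge'],
  ['request', 'present'],
)
console.log(result) // → { pass: false, missing: ['load_knowledge'] }
```

Unlike the LLM-judge method, this check is deterministic, which makes it a cheap first gate before rubric scoring.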
```bash
amodal eval
```

### Eval Files

Evals live in `evals/` as YAML files:

```yaml
name: triage-accuracy
description: Test alert triage quality
cases:
  - input: "Review recent security alerts"
    rubric:
      - "Correctly identifies critical alerts"
      - "Filters known false positives"
      - "Provides severity ranking"
    expected_tools:
      - request
      - load_knowledge
```

### Evaluation Methods

| Method            | Description                                              |
| ----------------- | -------------------------------------------------------- |
| **LLM Judge**     | An LLM evaluates the agent's response against the rubric |
| **Tool usage**    | Verify expected tools were called                        |
| **Cost tracking** | Track token usage and cost per eval                      |

### Experiments

Compare different configurations side-by-side:

```bash
amodal experiment
```

Experiments let you test:

* Different LLM providers or models
* Different skill configurations
* Different prompt variations
* Different knowledge documents

Results include cost comparison, quality scores, and latency metrics.

### Multi-Model Comparison

Run the same eval suite against multiple providers to find the best model for your use case:

```bash
amodal eval --providers anthropic,openai,google
```

### Platform Integration

Eval results can be sent to the platform API for tracking trends, baselines, and comparisons over time.

## CLI

The `amodal` CLI is the primary interface for building, running, and deploying agents.
Install it globally or use via `npx`:

```bash
npm install -g @amodal/cli
# or
npx amodal
```

### Commands

#### Project

| Command                | Description                             |
| ---------------------- | --------------------------------------- |
| [`init`](/cli/init)    | Scaffold a new agent project            |
| [`dev`](/cli/dev)      | Start local dev server with hot reload  |
| `validate`             | Check config for errors                 |
| `inspect`              | Show compiled context with token counts |
| `build-manifest-types` | Generate TypeScript types from manifest |

#### Connections & Packages

| Command                   | Description                         |
| ------------------------- | ----------------------------------- |
| [`connect`](/cli/connect) | Add a connection (plugin or custom) |
| [`sync`](/cli/connect)    | Sync API specs from remote sources  |
| `install`                 | Install marketplace packages        |
| `uninstall`               | Remove packages                     |
| `list`                    | List installed items                |
| `update`                  | Update packages                     |
| `diff`                    | Show package changes                |
| `search`                  | Search the marketplace              |
| `publish`                 | Publish to the registry             |

#### Runtime

| Command             | Description                               |
| ------------------- | ----------------------------------------- |
| [`chat`](/cli/chat) | Interactive terminal chat with your agent |
| `serve`             | Run from a snapshot file                  |
| `test-query`        | Fire a one-off query against the agent    |

#### Platform & Deployment

| Command                 | Description                      |
| ----------------------- | -------------------------------- |
| [`deploy`](/cli/deploy) | Deploy to the platform           |
| `login`                 | Authenticate with the platform   |
| `link`                  | Link local project to platform   |
| `status`                | Show deployment status           |
| `rollback`              | Revert to a previous deployment  |
| `deployments`           | List deployments                 |
| `promote`               | Promote a snapshot to production |
| `secrets`               | Manage encrypted secrets         |
| `docker`                | Build a Docker image             |
| `audit`                 | View audit logs                  |
| `automations`           | Manage automation schedules      |

#### Testing & Evaluation

| Command             | Description                         |
| ------------------- | ----------------------------------- |
| [`eval`](/cli/eval) | Run evaluation suites               |
| `experiment`        | Compare models, prompts, or configs |
| `test-query`        | Test a single query                 |

### Authentication

The CLI resolves credentials in order:

1. Command-line flags
2. `amodal.json` `platform` field (from `amodal link`)
3. `~/.amodalrc` (from `amodal login`)
4. Environment variables (`PLATFORM_API_URL`, `PLATFORM_API_KEY`)

## amodal init

Initialize a new Amodal agent project. Creates the config file and directory structure with starter files based on your product type.

```bash
amodal init
```

### What It Creates

```
my-agent/
├── amodal.json  ← agent name, provider, model
├── skills/      ← starter skill template
├── knowledge/   ← sample knowledge document
├── connections/ ← empty, ready for connections
├── tools/       ← empty, ready for custom tools
└── evals/       ← empty, ready for test cases
```

### Interactive Mode

The init command uses interactive prompts to configure your project:

* Product type (operations, finance, security, custom)
* Agent name and description
* LLM provider preference

Templates are customized based on the product type — a security agent gets different starter skills than a finance agent.

### Non-Interactive

For CI or scripting:

```bash
amodal init --name "Ops Agent" --type operations
```
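The four-step credential resolution order in the CLI's Authentication section amounts to a first-match-wins chain. A minimal sketch of that precedence rule, with hypothetical types (the real CLI's internals may differ):

```typescript
// Hypothetical sketch of first-match-wins credential resolution:
// flags > amodal.json "platform" field > ~/.amodalrc > environment.
interface Credentials {
  apiUrl: string
  apiKey: string
}

function resolveCredentials(
  sources: Array<Credentials | undefined>,
): Credentials | undefined {
  // Earlier sources take precedence; the first defined entry wins.
  return sources.find((source) => source !== undefined)
}

// Example: no flags were passed, but `amodal link` wrote a platform
// config, so it shadows the `amodal login` credentials in ~/.amodalrc.
const fromFlags = undefined
const fromAmodalJson = { apiUrl: 'https://api.amodal.ai', apiKey: 'key_link' }
const fromRcFile = { apiUrl: 'https://api.amodal.ai', apiKey: 'key_login' }
const creds = resolveCredentials([fromFlags, fromAmodalJson, fromRcFile])
console.log(creds?.apiKey) // → 'key_link'
```

The ordering matters: putting environment variables last lets a linked project or an explicit flag override machine-wide settings.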