
IT Operations

IT operations teams use Amodal to respond to incidents, scan environments for changes, and monitor infrastructure health. The agent connects to cloud providers, monitoring tools, and incident management systems.

Incident Response

User: "The checkout API is returning 500s"
 
Agent activates: Incident Response skill
  → Dispatches parallel task agents:
    1. Query Datadog for checkout-api error metrics
    2. Check recent deployments to checkout-api
    3. Query dependent service health (database, cache, payment gateway)
    4. Pull PagerDuty alert history
 
  → Identifies: database connection pool exhaustion after a traffic spike
  → Recommends: increase pool size, add connection retry logic
  → Offers to create a Jira ticket with findings
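
The parallel dispatch step above can be sketched as a concurrent fan-out in which each lookup runs independently and a failed lookup is recorded rather than aborting the investigation. This is only an illustration of the pattern, not Amodal's implementation: `Probe`, `ProbeResult`, `runProbes`, and the stub probes (including their placeholder result strings) are hypothetical names, and a real setup would call Datadog, the deployment system, and PagerDuty instead.

```typescript
// Hypothetical sketch: none of these names come from the Amodal SDK.
type Probe = { name: string; run: () => Promise<string> };

type ProbeResult =
  | { name: string; ok: true; finding: string }
  | { name: string; ok: false; error: string };

// Run every probe concurrently. A failing probe is captured as a result
// instead of rejecting the whole batch.
async function runProbes(probes: Probe[]): Promise<ProbeResult[]> {
  return Promise.all(
    probes.map(async (p): Promise<ProbeResult> => {
      try {
        return { name: p.name, ok: true, finding: await p.run() };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return { name: p.name, ok: false, error: message };
      }
    }),
  );
}

// Stub probes standing in for the four lookups in the example above.
// The returned strings are placeholders, not real query output.
const exampleProbes: Probe[] = [
  { name: 'error-metrics', run: async () => 'placeholder: error metrics summary' },
  { name: 'recent-deployments', run: async () => 'placeholder: deployment list' },
  { name: 'dependency-health', run: async () => 'placeholder: dependency status' },
  {
    name: 'alert-history',
    run: async () => {
      throw new Error('placeholder: alert API timed out');
    },
  },
];
```

Calling `runProbes(exampleProbes)` resolves to four results in the same order as the input, three marked `ok: true` and one carrying the error message, so a slow or unavailable tool does not block the remaining findings.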

Proactive Monitoring

Set up automations to catch issues before users notice:

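// Assumes `client` is an already-initialized Amodal client and that this
// call runs inside an async context (client setup is not shown here).
await client.automations.create({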
  name: 'Infrastructure Health Check',
  prompt: `Scan all monitored services for:
    - Error rates above baseline
    - Latency degradation
    - Resource utilization above 80%
    - Certificate expirations within 30 days
    Report findings by severity.`,
  schedule: '0 */4 * * *', // Every 4 hours
  output: { channel: 'slack', target: '#ops-alerts' },
  skills: ['triage'],
})
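
The `schedule` value above is a standard five-field cron expression (minute, hour, day of month, month, day of week). To make the firing times concrete, here is a small sketch that expands a single cron field into the values it matches. `expandCronField` is a hypothetical helper written for this page, not part of Amodal, and it only handles `*`, `*/n`, and a single number. Which time zone the schedule is evaluated in is not stated here, so treat the hours as relative to whatever zone the scheduler uses.

```typescript
// Expand a cron field of the form "*", "*/n", or a single number into the
// concrete values it matches within [min, max]. Ranges ("1-5") and lists
// ("1,3") are deliberately not handled in this sketch.
function expandCronField(field: string, min: number, max: number): number[] {
  const values: number[] = [];
  if (field === '*') {
    for (let v = min; v <= max; v++) values.push(v);
    return values;
  }
  const step = /^\*\/(\d+)$/.exec(field);
  if (step) {
    const n = Number(step[1]);
    if (n <= 0) throw new Error(`invalid step in cron field: ${field}`);
    for (let v = min; v <= max; v += n) values.push(v);
    return values;
  }
  if (/^\d+$/.test(field)) {
    const v = Number(field);
    if (v < min || v > max) throw new Error(`value out of range: ${field}`);
    return [v];
  }
  throw new Error(`unsupported cron field: ${field}`);
}

const [minuteField, hourField] = '0 */4 * * *'.split(' ');
const fireMinutes = expandCronField(minuteField, 0, 59); // [0]
const fireHours = expandCronField(hourField, 0, 23); // [0, 4, 8, 12, 16, 20]
```

In other words, the health check in the example runs six times a day, on the hour, at 00:00, 04:00, 08:00, 12:00, 16:00, and 20:00.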

Key Connections

| System | What It Provides |
| --- | --- |
| AWS / GCP / Azure | Cloud resource state, logs, metrics |
| Datadog / New Relic | APM, infrastructure metrics, log analytics |
| PagerDuty / OpsGenie | Alert management, oncall routing |
| Jira / Linear | Ticket creation, incident tracking |

Relevant Skills

  • Incident Response — gather context, assess impact, coordinate response
  • Triage — scan, prioritize, filter noise
  • Deep Dive — exhaustive service profiling