
Tutorial: Research Assistant

The problem

Ask ChatGPT "what's trending on Hacker News?" and it'll apologize — it can't browse the web. Ask it the same question tomorrow and it won't remember you asked yesterday. Ask it to email you a daily digest and it'll tell you it can't do that either.

A vanilla LLM is smart but isolated. It has no live data, no memory, no way to act on a schedule, and no custom UI. That's what an agent runtime fixes.

This tutorial uses a toy problem — a Hacker News research assistant — but the pattern scales to real systems. An internal support agent that queries your ticketing API, follows your escalation playbook, and logs resolutions to a store. A compliance agent that monitors Slack channels, applies your organization's policy rules, and files structured reports. The moving parts are the same: connections to live systems, skills that encode how to reason about domain problems, stores that persist structured output, and memory that carries context across sessions.

The key insight is that configuration and intelligence are orthogonal. The LLM provides reasoning; Amodal provides the rails — which APIs it can call, what data it can access, how it should format output, what it's allowed to do. Upgrading the model makes the agent smarter. Editing the config changes what it knows and what it can reach. You tune them independently.

What you're building

A research assistant that:

  • Pulls live data from Hacker News and Wikipedia (no API keys needed)
  • Remembers your interests across sessions ("I care about Rust and systems programming")
  • Saves research notes to a persistent store you can query later
  • Runs a daily digest every morning — fetching top HN stories on autopilot
  • Has a custom dashboard showing all your saved notes in one place

By the end, you'll have an agent that does things no chatbot can: call real APIs, remember who you are, persist data, run on a schedule, and render a custom UI. You'll build most of it by chatting with the admin agent — not by hand-editing config files.

Features covered: connections, skills, knowledge, stores, memory, evals, arena, automations, pages, and the Studio UI.

Prerequisites

  • Node.js 20+ and npm
  • An LLM API key — Anthropic, OpenAI, or Google (see Providers)
  • PostgreSQL running locally (for memory and stores):
docker run -d --name amodal-pg -p 5433:5432 -e POSTGRES_PASSWORD=postgres -e POSTGRES_DB=research_agent postgres:16

1. Scaffold the project

npm install -g @amodalai/amodal
mkdir research-agent && cd research-agent
amodal init

This creates:

research-agent/
├── amodal.json
├── connections/
├── skills/
├── knowledge/
├── automations/
├── evals/
├── package.json
├── .env
└── .gitignore

2. Configure

The admin agent can create connections, skills, stores, and more — but it can't edit amodal.json or .env (security boundary). You'll set these up by hand.

Open amodal.json — the scaffolded config just has a name and version. Add memory:

{
  "name": "research-agent",
  "version": "1.0.0",
  "memory": {
    "enabled": true,
    "maxEntries": 50,
    "nudgeInterval": 5
  }
}

That's it. No model config needed — the runtime auto-detects your provider from whichever API key is set in .env and picks an inexpensive default model. You can pin a specific model later if you want (see Providers).

| Field | Purpose |
| --- | --- |
| memory | Agent remembers facts across sessions. nudgeInterval = how many turns between save prompts. |

Open the .env file that init created. Fill in an API key for any provider and the database URL:

# .env (generated by amodal init)
# Uncomment one provider and add your key:
# ANTHROPIC_API_KEY=
# OPENAI_API_KEY=
GOOGLE_API_KEY=your-key-here
 
# Required — Postgres connection for memory and stores
DATABASE_URL=postgresql://postgres:postgres@localhost:5433/research_agent
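If you want to double-check the connection string's parts, Node's built-in WHATWG URL parser handles the postgresql:// scheme; this snippet is just a sanity check, not part of the project:

```typescript
// Parse the DATABASE_URL from the .env example and inspect its parts.
const db = new URL("postgresql://postgres:postgres@localhost:5433/research_agent");

console.log(db.hostname);          // host the runtime will connect to
console.log(db.port);              // 5433, the port mapped by the docker run above
console.log(db.pathname.slice(1)); // database name: research_agent
```

If the port or database name here doesn't match what you passed to `docker run`, memory and stores will fail to connect.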

3. Start the dev server

amodal dev

You'll see two URLs:

  • http://localhost:3847 — the chat UI
  • http://localhost:3848 — Studio

Leave this running. Everything hot-reloads when files change.

4. Meet Studio

Open http://localhost:3848. This is Studio — the control panel for your agent.

The sidebar has sections you'll use throughout this tutorial:

| Screen | What it shows |
| --- | --- |
| Overview | Agent name, models, provider status, runtime info |
| Files | Browse and edit your project files in the browser |
| Memory | View/edit what the agent remembers about users |
| Stores | Browse persisted data the agent has saved |
| Prompt | The full system prompt with token budget breakdown |
| Evals | Run and review test assertions |
| Arena | Compare models side-by-side on the same evals |
| Secrets | Environment variable status (set/missing) |

On the right side of the screen is the admin agent — a chat panel that can read and write files in your project. This is how you'll build the agent.

5. Add connections via the admin agent

Connections teach the agent how to call external APIs. Each connection needs a spec.json (protocol and URL), access.json (allowed endpoints), and surface.md (plain-English docs the agent reads).

Instead of creating these files by hand, tell the admin agent. In the Studio chat panel, send:

Add a connection called "hackernews" for the Hacker News API. Base URL is https://hacker-news.firebaseio.com/v0, protocol is REST, no auth needed. It should have these endpoints:

  • GET /topstories.json — returns an array of up to 500 top story IDs, ordered by rank
  • GET /newstories.json — returns an array of up to 500 newest story IDs
  • GET /item/{id}.json — returns a single item (story, comment, job, or poll) with title, url, score, by, time, descendants, and kids fields
  • GET /user/{id}.json — returns a user profile with id, created, karma, about, and submitted fields

Item IDs are integers. To get the top 10 stories, fetch /topstories.json, take the first 10 IDs, then fetch each with /item/{id}.json.

The admin agent creates connections/hackernews/ with all three files. You can verify in the Files screen or click any file to inspect it.
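Outside the runtime, the fetch pattern described in the prompt can be sketched in plain TypeScript against the public Hacker News endpoints (the helper names here are illustrative, not part of Amodal):

```typescript
// Sketch: fetch the top N Hacker News stories directly, per the
// pattern in the connection prompt: topstories.json, then item/{id}.json.
const BASE = "https://hacker-news.firebaseio.com/v0";

interface Item {
  id: number;
  title?: string;
  url?: string;
  score?: number;
  by?: string;
  descendants?: number; // comment count
}

// Pure helper: take the first n IDs from the ranked list.
export function pickTop(ids: number[], n: number): number[] {
  return ids.slice(0, Math.max(0, n));
}

export async function topStories(n = 10): Promise<Item[]> {
  const res = await fetch(`${BASE}/topstories.json`);
  const ids: number[] = await res.json();
  return Promise.all(
    pickTop(ids, n).map(async (id) => {
      const r = await fetch(`${BASE}/item/${id}.json`);
      return (await r.json()) as Item;
    }),
  );
}
```

This is exactly the sequence of tool calls you'll see the agent make in step 10.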

Now add Wikipedia:

Add a connection called "wikipedia" for the Wikipedia REST API. Base URL is https://en.wikipedia.org/api/rest_v1, protocol is REST, no auth needed. Endpoints:

  • GET /page/summary/{title} — returns article summary with title, extract (plain text), description, thumbnail, and content_urls. The title parameter uses underscores for spaces (e.g., Neural_network).
  • GET /page/related/{title} — returns related articles, each with the same summary fields.

Article titles are case-sensitive. The extract field is a concise paragraph suitable for direct use.

Check the Files screen — you should now see both connection directories. Each has a spec.json, access.json, and surface.md.
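The underscore convention from the Wikipedia prompt can likewise be sketched as a small standalone helper (names are illustrative; only the endpoint is from the API above):

```typescript
const WIKI = "https://en.wikipedia.org/api/rest_v1";

// Wikipedia REST titles use underscores in place of spaces,
// e.g. "Neural network" becomes "Neural_network".
export function toWikiTitle(title: string): string {
  return title.trim().replace(/ /g, "_");
}

export async function pageSummary(
  title: string,
): Promise<{ title: string; extract: string }> {
  const res = await fetch(`${WIKI}/page/summary/${toWikiTitle(title)}`);
  if (!res.ok) throw new Error(`Wikipedia returned ${res.status}`);
  return res.json();
}
```

The surface.md the admin agent wrote carries this same convention in prose, which is how the agent knows to apply it.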

What just happened?

The admin agent used its write_repo_file tool to create files in the allowlisted connections/ directory. The runtime detected the new files and hot-reloaded. Your agent can now call both APIs.

6. Add skills

Skills are markdown reasoning frameworks that guide the agent's behavior. Tell the admin agent:

Create a skill called "research". It should activate when the user asks to research a topic, investigate something, or wants to know what's trending. The process is:

  1. Check Hacker News — search top stories for relevant discussions, fetch the top 10-20 and filter by relevance
  2. Check Wikipedia — look up background info via page summary, follow related pages if needed
  3. Synthesize — combine findings, lead with what's current (HN), then background context (Wikipedia)

Output format: one-sentence summary first, markdown table for HN stories (title, score, comments, link), bullet points for Wikipedia background, always cite sources with links.

Then:

Create a skill called "summarize". It should activate when the user asks to summarize or condense something. Process: identify key claims, group related points, produce a summary no longer than 1/3 of the original. Output: lead with the most important takeaway, bullet points for supporting details, end with a Sources section. Never introduce information not in the source.

Check Files — you'll see skills/research/SKILL.md and skills/summarize/SKILL.md. Click into them to read what the admin agent wrote.

You can also see these in the sidebar under Skills. Click one to see its content and how many tokens it uses.
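The admin agent's exact output varies, but the SKILL.md it writes for "research" will capture roughly this structure (a sketch, not the canonical format):

```markdown
# Research

Activates when the user asks to research a topic, investigate something,
or wants to know what's trending.

## Process
1. Check Hacker News: search top stories for relevant discussions,
   fetch the top 10-20 and filter by relevance.
2. Check Wikipedia: look up background via page summary, follow
   related pages if needed.
3. Synthesize: lead with what's current (HN), then background (Wikipedia).

## Output
- One-sentence summary first
- Markdown table for HN stories (title, score, comments, link)
- Bullet points for Wikipedia background
- Always cite sources with links
```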

7. Add knowledge

Knowledge files provide static context — facts and rules the agent always has access to. Unlike skills (which guide how the agent reasons through a task), knowledge defines what the agent knows.

Create a knowledge file called "formatting-rules" with these rules for presenting information:

  • Use markdown tables for structured data (stories, comparisons, lists with multiple fields)
  • Use bullet points for unstructured lists and summaries
  • Always cite sources with links — HN story URLs and Wikipedia article URLs
  • Use bold for key terms on first mention
  • Keep paragraphs to 2-3 sentences max
  • Use tables with clear column headers when comparing items

Check the Knowledge section in the sidebar to see it loaded.

8. Add a store

Stores let the agent persist structured data. Tell the admin agent:

Create a store called "notes" for saving research notes. The entity name is "ResearchNote", keyed by "{topic}". Schema fields:

  • topic: string
  • summary: string
  • sources: array of strings
  • saved_at: datetime

This creates stores/notes.json. The runtime automatically generates three tools from it:

  • store_notes — save a research note
  • get_notes — retrieve a note by topic
  • list_notes — list all saved notes

The key is {topic}, so saving a note with the same topic overwrites the previous one.
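As a rough illustration (the exact field names of the store schema belong to the runtime, so treat these keys as assumptions), stores/notes.json might look something like:

```json
{
  "entity": "ResearchNote",
  "key": "{topic}",
  "schema": {
    "topic": "string",
    "summary": "string",
    "sources": "string[]",
    "saved_at": "datetime"
  }
}
```

Open the real file in the Files screen to see the schema the admin agent actually generated.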

9. Check the prompt

Before chatting, open the Prompt screen in Studio. This shows the exact system prompt the LLM receives, with a token breakdown per section.

You can see how many tokens each skill, knowledge doc, and connection surface contributes to the context budget. This is useful for debugging ("why didn't the agent use Wikipedia?") and optimizing ("this connection surface is 3K tokens — can I trim it?").

10. Chat with your agent

Open http://localhost:3847 (the chat UI). Try these conversations in order:

a. Test the Hacker News connection

What's trending on Hacker News right now?

The agent activates the research skill, calls GET /topstories.json, fetches details for the top stories with GET /item/{id}.json, and presents a table. Watch the tool calls in the response — you'll see each API request.

b. Test Wikipedia + synthesis

Tell me about the history of neural networks

The agent checks both Wikipedia (GET /page/summary/Neural_network) and Hacker News for related discussions, then synthesizes background with current community conversation.

c. Test memory

Remember that I'm particularly interested in Rust and systems programming

The agent calls update_memory to persist this. Go to the Memory screen in Studio — you'll see the new entry. You can edit or delete it from here.

d. Test the store

Save a research note about today's top HN stories

The agent compiles a summary and calls store_notes. Go to the Stores screen in Studio, click into notes, and you'll see the persisted document with topic, summary, sources, and timestamp.

e. Test memory persistence

Start a new session (refresh the page), then:

What am I interested in?

The agent loads memory from the previous session and recalls your interest in Rust and systems programming — without you telling it again.

f. Test store reads

List my saved research notes

The agent calls list_notes and displays all persisted notes.

11. Write an eval

Evals are test assertions for your agent. They verify that the agent behaves correctly — which APIs it calls, what it includes in responses, what it avoids.

Tell the admin agent:

Create an eval called "hn-trending" that tests the agent's ability to fetch Hacker News trends. The query is: "What's trending on Hacker News?" and the assertions are:

  • Should call the Hacker News API to fetch top stories
  • Should present results in a table format
  • Should include story titles and scores
  • Should cite sources with links
  • Should NOT fabricate story titles or scores
  • Should NOT use Wikipedia for this query

Then create a second eval:

Create an eval called "memory-recall" that tests memory persistence. Setup context: "The user previously told the agent they are interested in Rust and systems programming." The query is: "What topics am I interested in?" and the assertions are:

  • Should reference Rust
  • Should reference systems programming
  • Should NOT ask the user to clarify their interests
  • Should NOT say it doesn't know the user's interests

Check the Files screen — you'll see evals/hn-trending.md and evals/memory-recall.md.

12. Run evals

From the CLI

Stop the dev server (Ctrl+C) and run:

amodal eval

This starts a temporary runtime, runs each eval, judges the response against every assertion using the LLM, and prints results:

  Eval Results
  ──────────────────────────────────────────────
  hn-trending      PASS  (6/6 assertions)  $0.02
  memory-recall    PASS  (4/4 assertions)  $0.01
  ──────────────────────────────────────────────
  2 passed, 0 failed                       $0.03

Save a baseline for future comparison:

amodal eval --save baseline

Later, after making changes, compare against it:

amodal eval --diff baseline

From Studio (Arena)

Restart the dev server (amodal dev), open Studio, and go to the Evals screen. You can run the same eval suite from the browser and see detailed results — assertion-level pass/fail, tool calls, full agent responses.

Now open Arena. Arena lets you compare how different models perform on the same evals:

  1. Select two or more models (e.g., Claude Sonnet, GPT-4o, Gemini Flash)
  2. Click Run
  3. See results side-by-side: pass rate, latency, cost per eval

This is how you decide which model to use — or validate that a cheaper model still passes your assertions before swapping.

13. Add an automation

Automations run your agent on a schedule or in response to webhooks. Tell the admin agent:

Create an automation called "daily-hn-digest". It should run every weekday at 9am (cron: 0 9 * * 1-5). The prompt is: "Fetch the current top 10 Hacker News stories. For each story, include the title, score, comment count, and URL. Format as a markdown table sorted by score descending. End with a one-sentence summary of today's themes."

This creates automations/daily-hn-digest.json. Check the Automations screen in Studio — you'll see it listed with its schedule.
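The generated file captures the schedule and prompt; as an illustration (the key names are assumptions, not the canonical schema) it might look like:

```json
{
  "name": "daily-hn-digest",
  "schedule": "0 9 * * 1-5",
  "prompt": "Fetch the current top 10 Hacker News stories. For each story, include the title, score, comment count, and URL. Format as a markdown table sorted by score descending. End with a one-sentence summary of today's themes."
}
```

The cron expression reads: minute 0, hour 9, any day of month, any month, Monday through Friday.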

You can trigger it immediately from Studio to test without waiting for the schedule. The automation runs a fresh agent session with the configured prompt and produces a result.

Note: For delivery to Slack or other services, add a delivery.targets array with webhook URLs. See Automations for details.

14. Build a page

Pages are React components that give your agent a custom UI. They use hooks like useStoreList to read from stores and display data. Tell the admin agent:

Create a page called "research-dashboard" that displays all saved research notes. It should:

  • Import useStoreList from @amodalai/react
  • Export a page config with description "Research Notes", stores: ["notes"], automations: ["daily-hn-digest"], and icon "book-open"
  • List documents from the "notes" store sorted by saved_at descending, limit 20
  • Show each note as a card with the topic as heading, summary as body text, source links, and the saved_at timestamp
  • Use Tailwind classes: text-foreground, text-muted-foreground, bg-card, border-border, text-primary for links
  • Handle loading and error states

The admin agent writes pages/research-dashboard.tsx with the React component. The dev server hot-reloads and Research Notes appears in the sidebar of the chat UI at http://localhost:3847. Click it to see your persisted research notes rendered in a card layout.

If the page doesn't look right, open it in the Files screen in Studio and ask the admin agent to fix it — "the source links aren't showing up" or "add a count of total notes at the top." It can iterate on the code just like any other file.

The page reads from the notes store using useStoreList. Every time the agent saves a new research note (or the daily automation runs), the page updates automatically.
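Based on the requirements above, the generated component will look roughly like this sketch (the useStoreList options and return shape are assumptions inferred from the prompt, not the documented API):

```tsx
import { useStoreList } from "@amodalai/react";

export const config = {
  description: "Research Notes",
  stores: ["notes"],
  automations: ["daily-hn-digest"],
  icon: "book-open",
};

export default function ResearchDashboard() {
  // Hook options (sort, limit) and return fields are assumed.
  const { documents, loading, error } = useStoreList("notes", {
    sort: { saved_at: "desc" },
    limit: 20,
  });

  if (loading) return <p className="text-muted-foreground">Loading notes…</p>;
  if (error) return <p className="text-muted-foreground">Failed to load notes.</p>;

  return (
    <div className="space-y-4">
      {documents.map((note) => (
        <div key={note.topic} className="bg-card border border-border rounded p-4">
          <h2 className="text-foreground font-semibold">{note.topic}</h2>
          <p className="text-muted-foreground">{note.summary}</p>
          {note.sources.map((src: string) => (
            <a key={src} href={src} className="text-primary block">{src}</a>
          ))}
          <time className="text-muted-foreground text-sm">{note.saved_at}</time>
        </div>
      ))}
    </div>
  );
}
```

Compare this against the real pages/research-dashboard.tsx in the Files screen; the structure should be similar even where names differ.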

15. Explore Studio

Now that everything is wired up, take a tour of the Studio screens:

Overview — Shows your agent's name, configured models, and provider status. Check that your API key shows as "set" under Secrets.

Prompt — See the full system prompt with token counts per section. Your two skills, one knowledge doc, and two connection surfaces all contribute to the budget. This is where you optimize if the context gets too large.

Memory — Every fact the agent saved with update_memory. Edit entries inline or delete stale ones.

Stores → notes — Browse all research notes. Click a document to see the full payload, metadata (when it was written, which model, token cost), and version history.

Evals — Run assertions, review results, drill into failed assertions to see what the agent actually said vs. what was expected.

Arena — Model comparison. If you're considering switching from Claude to GPT-4o, run your evals against both and compare quality, speed, and cost.

Automations — See scheduled runs, trigger them manually, check results from previous executions.

Files — Full file browser. Edit any skill, knowledge doc, or connection surface directly from the browser.

Next steps

You now have a working agent with connections, skills, knowledge, stores, memory, evals, an automation, and a custom page. Here's where to go deeper:

  • Connections — Add more APIs (GitHub, Stripe, any REST or MCP server)
  • Skills — Advanced skill patterns (multi-step workflows, conditional activation)
  • Stores — TTL, history tracking, failure modes, cross-store refs
  • Automations — Webhook triggers, delivery targets, failure alerts
  • Pages — Available hooks (useStore, useSkillAction, useAmodalChat) and widget components
  • Tools — Custom tools with TypeScript handlers
  • MCP Servers — Connect to MCP-compatible tool servers
  • Evals — Advanced assertions, CI integration (--ci flag), baselines
  • Security — Field restrictions, output guards, row scoping
  • Admin Agent — What it can and can't do, customizing it