Agent

Agent as a Tool

The agent is split into two parts: the agent skill (policy) and an agent provider (LLM loop). The kernel has no special agent integration. agent-execute reads prompt/config/allowedTools/returns from the served context and calls ctx.manager.invoke('agent', { prompt, config, allowedTools, returns }). The agent skill discovers available tools, builds a single invoke tool, creates serializable callback refs for invocation and lifecycle hooks, and calls the agent provider with those refs.

The agent provider name is configurable via model.agent in the cascade metadata (default: agent-strands). Alternative providers shadow it via the search path.
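
For example, selecting a custom provider in skill metadata might look like this (a sketch that assumes the metadata.model layout shown in the yaml examples later on this page):

```yaml
metadata:
  model:
    agent: my-agent-provider
```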

The agent enforces configurable limits from model config:

  • maxTurns (default: 30): Maximum LLM round-trips before the agent stops. Set to Infinity in persistent mode.
  • maxSteps (default: 30): Maximum invoke tool calls across all turns. When exhausted, the invoke callback returns an error and the next turn-end throws.
  • persistent (default: false): When true, the agent loops indefinitely. See Persistent Mode.
  • contextEdit (default: false): When true, the agent provider can edit its own conversation context to manage long sessions.
  • agent (default: agent-strands): Agent provider skill name.
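
These limits are declared alongside the rest of the model configuration; for example, a skill might cap its agent like this (a sketch assuming the keys sit under metadata.model, as with the persistent flag):

```yaml
metadata:
  model:
    maxTurns: 10
    maxSteps: 15
```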

Agent Interface

The agent skill receives these args (a superset of what the provider sees):

  • prompt (string): System prompt for the LLM.
  • config (JsonObject): Model configuration (model ID, region, temperature, etc.).
  • allowedTools (string): Space-delimited glob patterns for tool filtering. Callers can constrain this arg via allowed-tools specifiers (e.g., agent(allowedTools=file-read)) to prevent the agent from granting a subagent broader access than intended.
  • returns (JsonObject): JSON Schema for the structured return value.
  • events (string): Topic to emit agent events to.
  • name (string): Display name for this agent invocation (used in logs and the reporter).
  • memory (Record<string, unknown>): Serializable data to seed into the sandbox's memory global before the first turn.

The agent provider receives fully serializable args — no functions cross the boundary:

  • prompt (string): System prompt for the LLM.
  • config (JsonObject): Model configuration.
  • invokeRef (string): Callback ref for the invoke tool, a programmatic skill the provider calls for code execution.
  • hookRef (string): Callback ref for the lifecycle hook, a programmatic skill the provider calls for events.
  • userMessage (string): Initial user message for the LLM.
  • skillName (string): Name of the skill whose pipeline triggered this agent invocation.
  • agentSignal (AbortSignal): Signal aborted on fail() or pipeline cancellation; providers should stop the LLM.

Both invokeRef and hookRef are created via createCallbackRef (a utility that registers ephemeral programmatic tools) and are invokable like any skill. This makes the agent interface fully serializable — it works over MCP and across process boundaries.

Code Execution Model

The agent has a single tool: invoke. This tool accepts JavaScript code as a string and executes it as an anonymous skill through the full pipeline. The code has ctx in scope and uses the standard programmatic API:

// The agent writes code like this:
const result = await ctx.manager.invoke("skill-name", { key: "value" });
ctx.manager.finish(result);  // return a result
throw new Error("reason");   // signal failure

This design means:

  • Every call goes through the real pipeline. Schema validation, middleware, directives, trust — all apply.
  • The agent composes freely. Loops, conditionals, async/await, try/catch, parallel invocations — all work naturally because the tool IS JavaScript.
  • No bridge tools. The context-manager skill exists for programmatic use, but the agent doesn't call it directly. It calls ctx.manager.finish(), ctx.manager.invoke(), etc. through code.
  • Middleware mode works. When the agent serves a middleware skill, the code can call target.manager.next() to advance the served pipeline.
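
This composition can be pictured with a runnable sketch. The mock ctx.manager below stands in for the real sandbox-provided object, and the tool name and file paths are hypothetical:

```javascript
// Mock stand-in for the sandbox-provided ctx.manager (illustrative only).
const ctx = {
  manager: {
    invoke: async (name, args) => `${name}:${args.path}`,
    finish: (result) => result,
  },
};

async function agentCode() {
  // Parallel invocations compose naturally because the tool is JavaScript.
  const [a, b] = await Promise.all([
    ctx.manager.invoke('file-read', { path: 'a.txt' }),
    ctx.manager.invoke('file-read', { path: 'b.txt' }),
  ]);
  return ctx.manager.finish({ a, b });
}

agentCode().then((r) => console.log(r));
// → { a: 'file-read:a.txt', b: 'file-read:b.txt' }
```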

The invoke tool's callback wraps the code in an anonymous skill definition and invokes it through ctx.manager.invoke(anonDef), which runs the full pipeline including validate-allowed-tools enforcement.

Sandbox Ref Restriction

Agent code runs in an isolated-vm V8 isolate with no Node.js APIs. All host interaction goes through an RPC channel to ContextManager. However, ctx.manager.invoke(ref) accepts any tool reference — including inline://code,... and file:// URIs, which cause the host to load and execute new code via import() outside the sandbox.

To prevent this escape, the agent invoke callback sets locals.internals.untrustedRef = true on the agent-code context. ContextManager.invoke() checks this flag and rejects any ref that is not a bare tool name (validated by isPathRef). URIs (inline://, file://), filesystem paths, and any string that doesn't match the [a-z0-9-] tool name pattern are blocked.
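
The bare-name check described above might look like the following sketch; the regex is an assumed approximation of the stated [a-z0-9-] pattern, not the framework's actual isPathRef:

```typescript
// Assumed approximation of the bare tool-name rule: lowercase
// alphanumerics and hyphens only, so URIs and paths are rejected.
function isBareToolName(ref: string): boolean {
  return /^[a-z0-9]+(-[a-z0-9]+)*$/.test(ref);
}

console.log(isBareToolName('file-read'));           // true
console.log(isBareToolName('inline://code,x'));     // false
console.log(isBareToolName('./skills/x.skill.ts')); // false
```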

The flag lives on locals.internals — per-invocation state that does not propagate to child contexts. When agent code calls ctx.manager.invoke("file-read"), file-read gets its own context without the flag. If file-read internally uses an inline:// delegate, that works normally. The restriction applies only to the immediate invoke from untrusted code.

Tools that accept refs as arguments (schedule-create, skill-register, event-subscribe) are not restricted by this mechanism — the ref is in the args, not the invoke target. Skill authors control these via allowed-tools arg constraints (e.g., schedule-create(ref=my-*)).

Tool Discovery

The agent skill discovers available tools to document them in the system prompt. The LLM sees tool names, descriptions, and params — but calls them through the invoke tool's code, not as individual tool calls.

Discovery uses a shared filter (resolveVisibleTools) that applies consistently across the agent and the toolchain:

  • visibility: hidden tools excluded unless explicitly named in allowed-tools
  • Filtered by allowed-tools if declared
  • Self-recursion excluded unless explicitly allowed
  • The agent skill and agent provider are always excluded
  • Tools without params accept no arguments but are still discoverable

Discovery operates in two modes based on the allowed-tools spec:

Constrained mode (explicit tool list, no bare *): All matching tools are inlined in the system prompt with full params. The agent knows exactly what's available.

Discovery mode (no allowed-tools, or spec contains bare *): Only explicitly named tools are inlined. The agent also receives skill-list and skill-describe tools so it can explore available capabilities on demand. This prevents prompt bloat when hundreds of tools are available.
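
The mode selection reduces to a small predicate. This sketch is an assumed restatement of the rule above, not framework code:

```typescript
// Discovery mode applies when there is no allowed-tools spec at all,
// or when the space-delimited spec contains a bare "*" entry.
function usesDiscoveryMode(allowedTools?: string): boolean {
  if (!allowedTools) return true;
  return allowedTools.split(/\s+/).includes('*');
}

console.log(usesDiscoveryMode(undefined));         // true
console.log(usesDiscoveryMode('file-read agent')); // false
console.log(usesDiscoveryMode('file-read *'));     // true
console.log(usesDiscoveryMode('file-*'));          // false (a glob, not a bare *)
```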

Middleware tools (role: middleware with params) are documented separately with instructions for activating them via ctx.manager.invoke() with frontmatter options.

Persistent Agent Scope

The agent establishes a persistent scope on served.nonlocals.agent that survives across turns:

  • ctx.nonlocals.agent.events — The event topic for agent progress events.
  • ctx.nonlocals.agent.contextId — The served context's ID.
  • ctx.nonlocals.agent.ancestorContextId — The parent agent's context ID (for nested agent trees).

Cross-turn state is managed by the sandbox's memory global, a plain object inside the V8 isolate that persists across turns. Values set in one invoke call are available in the next. It can hold any JS value: objects, closures, functions, timer IDs. The memory arg on the agent skill seeds this global before the first turn via Object.assign.

// In agent code:
memory.count = (memory.count ?? 0) + 1;

Self-Recursion Prevention

When a skill does not declare allowed-tools, agent-execute excludes the current tool from the tool documentation. The LLM cannot call the skill that invoked it (e.g., greeter → agent → greeter → ∞).

To enable self-recursion, explicitly include the skill's own name in allowed-tools.

Structured Returns

When a tool declares metadata.returns with a JSON Schema, agent-execute passes it to the agent. The agent's hook callback enforces the return — if the LLM doesn't call ctx.manager.finish() with a value matching the schema, the hook re-invokes with a reminder. The invoke tool validates the result against the schema before accepting it.

metadata:
  returns:
    type: object
    properties:
      count: { type: number }
    required: [count]

Persistent Mode

Persistent mode is activated by setting the persistent flag in model configuration:

metadata:
  model:
    persistent: true

With persistent mode enabled, the hook callback always returns { continue: true } at turn-end, so the agent never terminates. Errors are retried with exponential backoff (capped at 30 seconds). Persistent mode is incompatible with metadata.returns.
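
The retry cadence can be sketched as standard exponential backoff. Only the 30-second cap comes from this page; the base delay and doubling formula are assumptions:

```typescript
// Exponential backoff capped at 30s (the 1s base delay is an assumption).
function retryDelayMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  return Math.min(baseMs * 2 ** (attempt - 1), capMs);
}

console.log(retryDelayMs(1)); // 1000
console.log(retryDelayMs(3)); // 4000
console.log(retryDelayMs(8)); // 30000 (capped)
```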

Bundled Agent (Strands/Bedrock)

The library includes a Strands-based agent provider at skills/extensions/agent/strands/agent-strands.skill.ts. It uses Amazon Bedrock via @strands-agents/sdk and is discovered automatically via the search path.

The implementation is minimal: build a BedrockModel, create a Strands FunctionTool for the invoke callback ref, run the loop, consult the hook callback ref. No business logic — all tool filtering, returns enforcement, and middleware handling live in the agent skill.

Writing a Custom Agent Provider

To use a different LLM provider, create a skill with any name and configure it via model.agent in the cascade metadata, or shadow agent-strands by placing a replacement earlier in the search path.

A custom agent provider must:

  1. Accept AgentArgs as args: { prompt, config, invokeRef, hookRef, userMessage, skillName, agentSignal }.
  2. Run an LLM loop: send the prompt, dispatch tool callbacks, collect results.
  3. Use invokeRef to execute agent code: await ctx.manager.invoke(invokeRef, { code }).
  4. Use hookRef to report lifecycle events and receive directives:
    • await ctx.manager.invoke(hookRef, { type: 'turn-start', turnNumber }) → report turn boundary.
    • await ctx.manager.invoke(hookRef, { type: 'message', text, delta? }) → report model text output. When delta is true, the text is a streaming chunk.
    • await ctx.manager.invoke(hookRef, { type: 'tool-call', tool, args }) → check for { deny } before executing.
    • await ctx.manager.invoke(hookRef, { type: 'tool-result', tool, args, result }) → report tool callback result.
    • await ctx.manager.invoke(hookRef, { type: 'turn-end', result, turnNumber? }) → follow the directive (stop/continue).
    • await ctx.manager.invoke(hookRef, { type: 'error', error, attempt }) → retry or stop.
  5. Return the result from the hook's stop directive, or the LLM's text response.

Hook directives: { stop: true, result? } to end, { continue: true, message? } to re-invoke, { deny: string } to block a tool call, { allow: true } to permit, or undefined for default behavior.
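
Modeled in TypeScript, the directive shapes might look like this union (the type name and helper are illustrative, not framework API):

```typescript
type HookDirective =
  | { stop: true; result?: unknown }
  | { continue: true; message?: string }
  | { deny: string }
  | { allow: true }
  | undefined;

// undefined means "default behavior" per the description above.
function shouldStop(d: HookDirective): d is { stop: true; result?: unknown } {
  return d !== undefined && 'stop' in d && d.stop === true;
}

console.log(shouldStop({ stop: true, result: 42 })); // true
console.log(shouldStop({ continue: true }));         // false
console.log(shouldStop(undefined));                  // false
```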

Example skeleton:

import type { Context, JsonObject } from 'agent-apps';

export const frontmatter = {
  name: 'my-agent-provider',
  description: 'Custom agent provider',
  metadata: { tags: ['meta-internal'], visibility: 'hidden' },
};

export default async function(ctx: Context, args: unknown) {
  const { prompt, config, invokeRef, hookRef, userMessage } = args as {
    prompt: string; config?: JsonObject;
    invokeRef?: string; hookRef?: string; userMessage?: string;
  };
  const model = createModel(config ?? {});  // your LLM client

  let message = userMessage ?? 'Execute the task.';

  while (true) {
    const response = await model.invoke(prompt, message);

    // Report model text
    if (hookRef) await ctx.manager.invoke(hookRef, { type: 'message', text: response.text });

    // Dispatch tool calls
    for (const call of response.toolCalls) {
      if (hookRef) {
        const d = await ctx.manager.invoke(hookRef, { type: 'tool-call', tool: call.name, args: call.input });
        if (d && typeof d === 'object' && 'deny' in d) { /* feed denial back to LLM */ continue; }
      }
      if (invokeRef) {
        const result = await ctx.manager.invoke(invokeRef, { code: call.input.code });
        if (hookRef) await ctx.manager.invoke(hookRef, { type: 'tool-result', tool: call.name, args: call.input, result });
      }
    }

    // Consult lifecycle hook
    if (hookRef) {
      const d = await ctx.manager.invoke(hookRef, { type: 'turn-end', result: response.text }) as Record<string, unknown> | undefined;
      if (d && 'stop' in d && d.stop) return d.result ?? response.text;
      if (d && 'continue' in d && d.continue) { message = (d.message as string) ?? 'Continue.'; continue; }
    }
    return response.text;
  }
}

Agent Providers

Agent providers can be separate npm packages. An extension package contains a skill that shadows the bundled implementation via search path priority or model.agent selection. The agent skill (policy layer) handles tool building, hook callback creation, and returns enforcement. The agent provider (LLM loop) is the only thing a provider needs to replace.

Two providers ship with the framework:

  • Strands/Bedrock (skill agent-strands): Amazon Bedrock via @strands-agents/sdk. The default provider.
  • Kiro (skill agent-kiro): Kiro ACP protocol. An alternative provider using the Kiro CLI as the backend; tools are served via an in-process HTTP MCP server.

Agent Trace

The agent skill captures all hook callback events to target.locals.agent.trace — a timestamped array of every event that occurred during agent execution. Each entry is { type, timestamp, ...eventFields }.

The trace includes:

  • turn-start — turn boundary with turn number (fired before each model invocation)
  • message — model text output (one per turn without streaming, many with streaming when delta: true)
  • reasoning — chain-of-thought text (when the model supports extended thinking)
  • tool-call — tool invocation with args (before execution)
  • tool-result — tool callback result (after execution)
  • turn-end — turn boundary with aggregate model text and turn number
  • error — model errors with attempt count
  • retry — transient error retry with attempt count
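
A trace entry might be modeled as follows; this type is an illustration of the { type, timestamp, ...eventFields } shape, not an exported framework type:

```typescript
// Every entry carries a type and timestamp; remaining fields vary per event.
type TraceEntry = { type: string; timestamp: number } & Record<string, unknown>;

const trace: TraceEntry[] = [
  { type: 'turn-start', timestamp: Date.now(), turnNumber: 1 },
  { type: 'tool-call', timestamp: Date.now(), tool: 'file-read', args: { path: 'a.txt' } },
  { type: 'turn-end', timestamp: Date.now(), result: 'done', turnNumber: 1 },
];

console.log(trace.map((e) => e.type).join(','));
// → turn-start,tool-call,turn-end
```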

The trace is available to any middleware that runs after the agent (via the Koa onion pattern). The session middleware captures it by default via the $agent sentinel. Code skills can read it from ctx.locals.agent.trace.

When events is passed (or nonlocals.agent.events is set), the agent emits events to that topic in real time via the event bus. Emitted event types use agent-prefixed names: agent-start, turn-start, turn-end, tool-call, tool-result, tool-error, agent-message, agent-error, and agent-complete. This is how the MCP server streams agent events to connected clients.
