@roostjs/ai Guides

Task-oriented instructions for agents, tools, streaming, and testing. Powered by Cloudflare Workers AI — no API keys required.

How to create an AI agent

Extend Agent and implement instructions() to define the system prompt. The AI binding in wrangler.jsonc is the only credential needed.

// wrangler.jsonc
{
  "ai": { "binding": "AI" }
}
Then define the agent:

import { Agent } from '@roostjs/ai';

export class SupportAgent extends Agent {
  instructions(): string {
    return 'You are a helpful customer support agent for Acme Inc. Be concise and professional.';
  }
}

// Usage in a server function
const agent = new SupportAgent();
const response = await agent.prompt('How do I reset my password?');
console.log(response.text);

Each agent instance maintains its own conversation history. Create a new instance for each independent user session.
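The one-instance-per-session pattern can be sketched as follows. The Map-based registry, the agentForSession helper, and the stand-in SupportAgent class are illustrative only, not part of @roostjs/ai:

```typescript
// Stand-in for the SupportAgent above, so the sketch is self-contained.
class SupportAgent {
  history: string[] = [];
}

// One agent (and therefore one conversation history) per session ID.
const sessions = new Map<string, SupportAgent>();

function agentForSession(sessionId: string): SupportAgent {
  let agent = sessions.get(sessionId);
  if (!agent) {
    agent = new SupportAgent(); // fresh history for a new session
    sessions.set(sessionId, agent);
  }
  return agent;
}
```

In a real Worker you would key the registry on a session cookie or user ID, and evict entries when sessions end.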

How to define and register tools

Implement the Tool interface and add the tool to your agent's tools() method.

import { type Tool, type ToolRequest } from '@roostjs/ai';
import { schema } from '@roostjs/schema';

export class OrderStatusTool implements Tool {
  constructor(private db: Database) {}

  description(): string {
    return 'Look up the status of a customer order by order ID';
  }

  schema(s: typeof schema) {
    return {
      orderId: s.string().description('The order ID to look up'),
    };
  }

  async handle(request: ToolRequest): Promise<string> {
    const orderId = request.get<string>('orderId');
    const order = await this.db.findOrder(orderId);
    if (!order) return 'Order not found.';
    return `Order ${orderId}: status=${order.status}, updated=${order.updatedAt}`;
  }
}

Register the tool in the agent's tools() method:

import { Agent, type HasTools } from '@roostjs/ai';
import { OrderStatusTool } from '../tools/OrderStatusTool';

export class SupportAgent extends Agent implements HasTools {
  constructor(private db: Database) {
    super();
  }

  instructions(): string {
    return 'You are a helpful customer support agent.';
  }

  tools() {
    return [new OrderStatusTool(this.db)];
  }
}

How to configure the model and parameters

Use class decorators to set defaults. Any decorator value can be overridden per-prompt via the options argument to prompt().

import { Agent, Model, MaxSteps, Temperature, MaxTokens } from '@roostjs/ai';

@Model('@cf/meta/llama-3.1-70b-instruct')
@Temperature(0.9)     // More creative responses
@MaxTokens(4096)      // Allow longer outputs
@MaxSteps(3)          // Max tool calls before final answer
export class WritingAgent extends Agent {
  instructions(): string {
    return 'You are a creative writing assistant.';
  }
}

// Override per-prompt
const response = await agent.prompt('Write a haiku', {
  temperature: 0.3,  // Override to more deterministic
  maxTokens: 100,
});

Available Cloudflare models include @cf/meta/llama-3.1-8b-instruct, @cf/meta/llama-3.1-70b-instruct, and @cf/mistral/mistral-7b-instruct-v0.2.

How to stream agent responses

Use agent.stream() to get an async iterable of text chunks. Useful for real-time UI updates.

const agent = new WritingAgent();
const stream = await agent.stream('Write a short story about a robot');

// Collect and forward as a streaming response
const encoder = new TextEncoder();
const body = new ReadableStream({
  async start(controller) {
    for await (const chunk of stream) {
      controller.enqueue(encoder.encode(chunk));
    }
    controller.close();
  },
});

return new Response(body, {
  headers: { 'content-type': 'text/plain; charset=utf-8' },
});

On the client, read the stream with the Fetch API's response.body reader, or use a library such as the ai package (Vercel AI SDK) for React streaming hooks.
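The client-side reader loop looks like this. The readTextStream helper is illustrative; in a real app the stream would come from (await fetch('/story')).body, simulated here with a locally constructed ReadableStream so the sketch is self-contained:

```typescript
// Simulated server stream; a real client would use (await fetch(url)).body.
const encoder = new TextEncoder();
const simulated = new ReadableStream<Uint8Array>({
  start(controller) {
    for (const chunk of ['Once ', 'upon ', 'a time.']) {
      controller.enqueue(encoder.encode(chunk));
    }
    controller.close();
  },
});

// Read chunks as they arrive, decoding incrementally so multi-byte
// characters split across chunk boundaries are handled correctly.
async function readTextStream(
  body: ReadableStream<Uint8Array>,
  onChunk: (text: string) => void,
): Promise<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let full = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    const text = decoder.decode(value, { stream: true });
    full += text;
    onChunk(text); // e.g. append to the UI
  }
  full += decoder.decode(); // flush any buffered bytes
  return full;
}
```

The onChunk callback is where a UI would append text as it streams in; the returned string is the complete response.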

How to manage conversation memory

Agent instances maintain in-memory conversation history. For persistent cross-request memory, serialize and restore the history manually.

const agent = new SupportAgent(db);

// Conversation within a single request lifecycle
const r1 = await agent.prompt('My order #1234 is late.');
const r2 = await agent.prompt('Can you check the status?');
// Agent remembers the order number from the first turn

For persistent sessions, pass prior history as context in the instructions:

export class SupportAgent extends Agent {
  constructor(private history: string[]) {
    super();
  }

  instructions(): string {
    const context = this.history.length
      ? '\n\nPrevious context:\n' + this.history.join('\n')
      : '';
    return 'You are a helpful support agent.' + context;
  }
}
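Serializing and restoring that history across requests can be sketched like this. The kv object stands in for a Workers KV namespace binding (e.g. env.SESSIONS); a Map-backed stub keeps the sketch runnable, and the key scheme is an assumption, not a library convention:

```typescript
// Map-backed stand-in for a Workers KV namespace.
const kv = {
  store: new Map<string, string>(),
  async get(key: string): Promise<string | null> {
    return this.store.get(key) ?? null;
  },
  async put(key: string, value: string): Promise<void> {
    this.store.set(key, value);
  },
};

async function loadHistory(sessionId: string): Promise<string[]> {
  const raw = await kv.get(`history:${sessionId}`);
  return raw ? (JSON.parse(raw) as string[]) : [];
}

async function saveHistory(sessionId: string, history: string[]): Promise<void> {
  await kv.put(`history:${sessionId}`, JSON.stringify(history));
}

// Per request: restore prior turns, construct the agent with them,
// then persist the updated history before responding.
const history = await loadHistory('session-42');
// const agent = new SupportAgent(history);
// const r = await agent.prompt(userMessage);
history.push('User: My order #1234 is late.');
await saveHistory('session-42', history);
```

With real KV you would also set an expirationTtl on put so stale sessions expire on their own.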

How to route inference through AI Gateway

Use GatewayAIProvider when you want Cloudflare AI Gateway observability, request caching, or per-request fallback behaviour. The gateway provider wraps a direct CloudflareAIProvider and falls back to it automatically if the gateway is unavailable.

  1. Create an AI Gateway in the Cloudflare dashboard and note the account ID and gateway ID.
  2. Add both as secrets or config values:
CF_ACCOUNT_ID=abc123
AI_GATEWAY_ID=my-gateway
  3. Wire up the provider at application startup:
import { GatewayAIProvider } from '@roostjs/ai';
import { CloudflareAIProvider, AIClient } from '@roostjs/cloudflare';

const direct = new CloudflareAIProvider(new AIClient(env.AI));
const gateway = new GatewayAIProvider(
  { accountId: env.CF_ACCOUNT_ID, gatewayId: env.AI_GATEWAY_ID },
  direct,
);

// Apply to a specific agent class
MyAgent.setProvider(gateway);

Session affinity is applied automatically — once a conversation has more than one turn, subsequent requests carry x-session-affinity: true so the gateway routes them to the same cached context.

How to run async inference

Use queued: true to enqueue a long-running inference request and return a task ID immediately. Poll for the result separately.

import { SummarizeAgent } from '../agents/SummarizeAgent';

export async function startSummary(text: string): Promise<string> {
  const agent = new SummarizeAgent();
  const result = await agent.prompt(text, { queued: true });

  if (!result.queued) throw new Error('Expected queued result');
  return result.taskId; // Store this in KV or return to the client
}

Poll with AIClient.poll() from a separate endpoint or Durable Object alarm:

import { AIClient } from '@roostjs/cloudflare';

export async function checkSummary(taskId: string, env: Env) {
  const client = new AIClient(env.AI);

  // fetch must carry a CF API token for the REST API
  const authenticatedFetch = (url: string) =>
    fetch(url, { headers: { Authorization: `Bearer ${env.CF_API_TOKEN}` } });

  const result = await client.poll(taskId, authenticatedFetch, env.CF_ACCOUNT_ID);

  if (result.status === 'done') {
    return result.result; // The model output
  }

  return null; // Still running
}

How to build a RAG pipeline

A RAG pipeline chunks documents, embeds the chunks, stores them in Vectorize, then retrieves the most relevant chunks for a query.
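As a rough mental model for the chunking step, here is a naive fixed-size chunker. It is illustrative only: the library's SemanticChunker is smarter about sentence boundaries, but the shape is the same idea, one document in, many chunks out, each with an id, documentId, text, and tokenCount (matching the chunk shape used in the test example later in this guide):

```typescript
interface Chunk {
  id: string;
  documentId: string;
  text: string;
  tokenCount: number;
}

// Naive word-count chunker: split on whitespace, group into fixed-size
// slices, and label each slice "<documentId>:<index>".
function naiveChunk(documentId: string, text: string, chunkSize = 400): Chunk[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: Chunk[] = [];
  for (let i = 0; i < words.length; i += chunkSize) {
    const slice = words.slice(i, i + chunkSize);
    chunks.push({
      id: `${documentId}:${chunks.length}`,
      documentId,
      text: slice.join(' '),
      tokenCount: slice.length,
    });
  }
  return chunks;
}
```

Real tokenizers count sub-word tokens rather than whitespace-separated words, so treat tokenCount here as an approximation.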

1. Configure the Vectorize index and AI binding

{
  "ai": { "binding": "AI" },
  "vectorize": [
    { "binding": "VECTORIZE", "index_name": "my-docs-index" }
  ]
}

The index must match the embedding model's dimensionality. @cf/baai/bge-base-en-v1.5 (the default) produces 768-dimensional vectors. Create the index with:

npx wrangler vectorize create my-docs-index --dimensions=768 --metric=cosine
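Cosine, the metric configured above, scores the angle between two vectors: 1.0 means the same direction (very similar), 0 means orthogonal (unrelated). Vectorize computes this server-side; the sketch below exists only to make the score field in query results concrete:

```typescript
// Cosine similarity: dot product normalized by both vector magnitudes.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Because cosine ignores magnitude, it is a common default for text embeddings, where direction carries the semantic signal.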

2. Ingest documents

import { RAGPipeline, SemanticChunker, EmbeddingPipeline } from '@roostjs/ai/rag';
import { AIClient, VectorStore } from '@roostjs/cloudflare';
import type { Document } from '@roostjs/ai/rag';

const client = new AIClient(env.AI);
const store = new VectorStore(env.VECTORIZE);
const chunker = new SemanticChunker({ chunkSize: 400 });
const embeddings = new EmbeddingPipeline(client);

const pipeline = new RAGPipeline(store, embeddings, chunker);

const docs: Document[] = [
  { id: 'doc-1', text: '...', metadata: { source: 'handbook' } },
  { id: 'doc-2', text: '...', metadata: { source: 'handbook' } },
];

const { inserted } = await pipeline.ingest(docs);
console.log(`Inserted ${inserted} vectors`);

3. Query and inject context into an agent

import { Agent } from '@roostjs/ai';
import { RAGPipeline, SemanticChunker, EmbeddingPipeline } from '@roostjs/ai/rag';
import { AIClient, VectorStore } from '@roostjs/cloudflare';

export class DocsAgent extends Agent {
  private rag: RAGPipeline;

  constructor(env: Env) {
    super();
    const client = new AIClient(env.AI);
    const store = new VectorStore(env.VECTORIZE);
    this.rag = new RAGPipeline(store, new EmbeddingPipeline(client), new SemanticChunker());
  }

  instructions(): string {
    return 'You are a helpful assistant. Answer questions using the context provided.';
  }

  async answerWithContext(question: string): Promise<string> {
    const results = await this.rag.query(question);
    const context = results.map((r) => r.chunk.text).join('\n\n');

    const result = await this.prompt(
      `Context:\n${context}\n\nQuestion: ${question}`,
    );

    if (result.queued) throw new Error('unexpected queued result');
    return result.text;
  }
}

4. Test the RAG pipeline without a real index

import { RAGPipeline } from '@roostjs/ai/rag';
import type { QueryResult } from '@roostjs/ai/rag';

const fakeResults: QueryResult[] = [
  { chunk: { id: 'doc-1:0', documentId: 'doc-1', text: 'Roost uses D1.', tokenCount: 5 }, score: 0.92 },
];

RAGPipeline.fake([fakeResults]);

// ... run your agent test ...

RAGPipeline.assertQueried((text) => text.includes('database'));
RAGPipeline.restore();

How to test agents without calling the AI provider

Use Agent.fake() to inject predetermined responses. Always call Agent.restore() after each test.

import { describe, it, expect } from 'bun:test';
import { SupportAgent } from '../../src/agents/SupportAgent';

describe('SupportAgent', () => {
  it('responds to password reset questions', async () => {
    SupportAgent.fake(['To reset your password, visit /account/reset.']);

    const agent = new SupportAgent(fakeDb);
    const response = await agent.prompt('How do I reset my password?');

    expect(response.text).toContain('reset');
    SupportAgent.restore();
  });

  it('was prompted with the user input', async () => {
    SupportAgent.fake(['Order found.']);

    const agent = new SupportAgent(fakeDb);
    await agent.prompt('Check order 5678');

    SupportAgent.assertPrompted('5678');
    SupportAgent.restore();
  });
});