@roostjs/ai Guides
Task-oriented instructions for agents, tools, streaming, and testing. Powered by Cloudflare Workers AI — no API keys required.
How to create an AI agent
Extend Agent and implement instructions() to define the system prompt. The AI binding in wrangler.jsonc is the only credential needed.
```jsonc
// wrangler.jsonc
{
  "ai": { "binding": "AI" }
}
```

```typescript
import { Agent } from '@roostjs/ai';

export class SupportAgent extends Agent {
  instructions(): string {
    return 'You are a helpful customer support agent for Acme Inc. Be concise and professional.';
  }
}

// Usage in a server function
const agent = new SupportAgent();
const response = await agent.prompt('How do I reset my password?');
console.log(response.text);
```

Each agent instance maintains its own conversation history. Create a new instance for each independent user session.
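When one server instance handles multiple users, you can enforce that isolation by keying agent instances on a session ID. This `SessionPool` helper is a hypothetical sketch, not part of @roostjs/ai:

```typescript
// Hypothetical helper: one agent instance per session ID, so
// conversation histories never mix between users.
class SessionPool<A> {
  private sessions = new Map<string, A>();

  constructor(private create: () => A) {}

  get(sessionId: string): A {
    let agent = this.sessions.get(sessionId);
    if (agent === undefined) {
      agent = this.create();
      this.sessions.set(sessionId, agent);
    }
    return agent;
  }
}
```

Usage would look like `const pool = new SessionPool(() => new SupportAgent())`, then `pool.get(sessionId).prompt(...)` on each request.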
How to define and register tools
Implement the Tool interface and add the tool to your agent's tools() method.
```typescript
import { type Tool, type ToolRequest } from '@roostjs/ai';
import { schema } from '@roostjs/schema';

export class OrderStatusTool implements Tool {
  constructor(private db: Database) {}

  description(): string {
    return 'Look up the status of a customer order by order ID';
  }

  schema(s: typeof schema) {
    return {
      orderId: s.string().description('The order ID to look up'),
    };
  }

  async handle(request: ToolRequest): Promise<string> {
    const orderId = request.get<string>('orderId');
    const order = await this.db.findOrder(orderId);
    if (!order) return 'Order not found.';
    return `Order ${orderId}: status=${order.status}, updated=${order.updatedAt}`;
  }
}
```

```typescript
import { Agent, type HasTools } from '@roostjs/ai';
import { OrderStatusTool } from '../tools/OrderStatusTool';

export class SupportAgent extends Agent implements HasTools {
  constructor(private db: Database) {
    super();
  }

  instructions(): string {
    return 'You are a helpful customer support agent.';
  }

  tools() {
    return [new OrderStatusTool(this.db)];
  }
}
```

How to configure the model and parameters
Use class decorators to set defaults. All decorators can be overridden per-prompt via the options argument.
```typescript
import { Agent, Model, MaxSteps, Temperature, MaxTokens } from '@roostjs/ai';

@Model('@cf/meta/llama-3.1-70b-instruct')
@Temperature(0.9) // More creative responses
@MaxTokens(4096) // Allow longer outputs
@MaxSteps(3) // Max tool calls before final answer
export class WritingAgent extends Agent {
  instructions(): string {
    return 'You are a creative writing assistant.';
  }
}

// Override per-prompt
const response = await agent.prompt('Write a haiku', {
  temperature: 0.3, // Override to more deterministic
  maxTokens: 100,
});
```

Available Cloudflare models include `@cf/meta/llama-3.1-8b-instruct`, `@cf/meta/llama-3.1-70b-instruct`, and `@cf/mistral/mistral-7b-instruct-v0.2`.
How to stream agent responses
Use agent.stream() to get an async iterable of text chunks. Useful for real-time UI updates.
```typescript
const agent = new WritingAgent();
const stream = await agent.stream('Write a short story about a robot');

// Collect and forward as a streaming response
const encoder = new TextEncoder();
const body = new ReadableStream({
  async start(controller) {
    for await (const chunk of stream) {
      controller.enqueue(encoder.encode(chunk));
    }
    controller.close();
  },
});

return new Response(body, {
  headers: { 'content-type': 'text/plain; charset=utf-8' },
});
```

On the client, read the stream with the Fetch API's `response.body` reader or use a library like `ai` for React streaming hooks.
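As a sketch of the client side, a reader can be built from standard web platform APIs alone (the function name here is illustrative):

```typescript
// Read a streamed text response chunk by chunk, invoking the callback
// as each decoded piece arrives.
async function readTextStream(
  response: Response,
  onChunk: (text: string) => void,
): Promise<void> {
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries
    onChunk(decoder.decode(value, { stream: true }));
  }
}
```

A UI would call this with `fetch('/api/story')` and append each chunk to the rendered output.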
How to manage conversation memory
Agent instances maintain in-memory conversation history. For persistent cross-request memory, serialize and restore the history manually.
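One minimal way to serialize and restore that history is a JSON round-trip, e.g. stored in KV keyed by session ID. These helper names are hypothetical, not part of the library:

```typescript
// Hypothetical persistence helpers for conversation history.
function serializeHistory(history: string[]): string {
  return JSON.stringify(history);
}

function restoreHistory(raw: string | null): string[] {
  if (!raw) return [];
  try {
    const parsed = JSON.parse(raw);
    // Guard against corrupted or foreign values in storage
    return Array.isArray(parsed) ? parsed.map(String) : [];
  } catch {
    return [];
  }
}
```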
```typescript
const agent = new SupportAgent(db);

// Conversation within a single request lifecycle
const r1 = await agent.prompt('My order #1234 is late.');
const r2 = await agent.prompt('Can you check the status?');
// Agent remembers the order number from the first turn
```

```typescript
// For persistent sessions, pass prior history as context in instructions
export class SupportAgent extends Agent {
  constructor(private history: string[]) {
    super();
  }

  instructions(): string {
    const context = this.history.length
      ? '\n\nPrevious context:\n' + this.history.join('\n')
      : '';
    return 'You are a helpful support agent.' + context;
  }
}
```

How to route inference through AI Gateway
Use GatewayAIProvider when you want Cloudflare AI Gateway observability, request caching, or per-request fallback behaviour. The gateway provider wraps a direct CloudflareAIProvider and calls it automatically if the gateway is unavailable.
- Create an AI Gateway in the Cloudflare dashboard and note the account ID and gateway ID.
- Add both as secrets or config values:

```shell
CF_ACCOUNT_ID=abc123
AI_GATEWAY_ID=my-gateway
```

- Wire up the provider at application startup:

```typescript
import { GatewayAIProvider } from '@roostjs/ai';
import { CloudflareAIProvider, AIClient } from '@roostjs/cloudflare';

const direct = new CloudflareAIProvider(new AIClient(env.AI));
const gateway = new GatewayAIProvider(
  { accountId: env.CF_ACCOUNT_ID, gatewayId: env.AI_GATEWAY_ID },
  direct,
);

// Apply to a specific agent class
MyAgent.setProvider(gateway);
```

Session affinity is applied automatically: once a conversation has more than one turn, subsequent requests carry `x-session-affinity: true` so the gateway routes them to the same cached context.
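Conceptually, the fallback behaviour described above reduces to a try/catch around the primary call. This standalone sketch illustrates the pattern; it is not the actual GatewayAIProvider internals, and the `Infer` type is hypothetical:

```typescript
// A provider call is modelled here as a plain async function.
type Infer = (prompt: string) => Promise<string>;

// Try the primary (gateway) path; if it throws, fall back to the
// direct provider transparently.
function withFallback(primary: Infer, fallback: Infer): Infer {
  return async (prompt) => {
    try {
      return await primary(prompt);
    } catch {
      // Gateway unreachable or erroring: use the direct provider
      return fallback(prompt);
    }
  };
}
```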
How to run async inference
Use queued: true to enqueue a long-running inference request and return a task ID immediately. Poll for the result separately.
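On the caller's side, waiting for a queued task is a generic retry loop. This `pollUntilDone` helper is a hypothetical sketch, independent of the library, where `check` returns null while the task is still running:

```typescript
// Call check() until it returns a non-null value, waiting delayMs
// between attempts, up to maxAttempts before giving up.
async function pollUntilDone<T>(
  check: () => Promise<T | null>,
  delayMs = 1000,
  maxAttempts = 30,
): Promise<T> {
  for (let i = 0; i < maxAttempts; i++) {
    const result = await check();
    if (result !== null) return result;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error('Polling timed out');
}
```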
```typescript
import { SummarizeAgent } from '../agents/SummarizeAgent';

export async function startSummary(text: string): Promise<string> {
  const agent = new SummarizeAgent();
  const result = await agent.prompt(text, { queued: true });
  if (!result.queued) throw new Error('Expected queued result');
  return result.taskId; // Store this in KV or return to the client
}
```

Poll with `AIClient.poll()` from a separate endpoint or Durable Object alarm:

```typescript
import { AIClient } from '@roostjs/cloudflare';

export async function checkSummary(taskId: string, env: Env) {
  const client = new AIClient(env.AI);
  // fetch must carry a CF API token for the REST API
  const authenticatedFetch = (url: string) =>
    fetch(url, { headers: { Authorization: `Bearer ${env.CF_API_TOKEN}` } });
  const result = await client.poll(taskId, authenticatedFetch, env.CF_ACCOUNT_ID);
  if (result.status === 'done') {
    return result.result; // The model output
  }
  return null; // Still running
}
```

How to build a RAG pipeline
A RAG pipeline chunks documents, embeds the chunks, stores them in Vectorize, then retrieves the most relevant chunks for a query.
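In production, Vectorize performs the retrieval step, but conceptually it reduces to cosine similarity between the query embedding and each stored chunk embedding, keeping the top k. A self-contained sketch of that core idea:

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query embedding and keep the k best.
function topK(
  query: number[],
  chunks: { text: string; vector: number[] }[],
  k: number,
): { text: string; score: number }[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosine(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```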
1. Configure the Vectorize index and AI binding
```jsonc
// wrangler.jsonc
{
  "ai": { "binding": "AI" },
  "vectorize": [
    { "binding": "VECTORIZE", "index_name": "my-docs-index" }
  ]
}
```

The index must match the embedding model's dimensionality. `@cf/baai/bge-base-en-v1.5` (the default) produces 768-dimensional vectors. Create the index with:

```shell
npx wrangler vectorize create my-docs-index --dimensions=768 --metric=cosine
```

2. Ingest documents
```typescript
import { RAGPipeline, SemanticChunker, EmbeddingPipeline } from '@roostjs/ai/rag';
import { AIClient, VectorStore } from '@roostjs/cloudflare';
import type { Document } from '@roostjs/ai/rag';

const client = new AIClient(env.AI);
const store = new VectorStore(env.VECTORIZE);
const chunker = new SemanticChunker({ chunkSize: 400 });
const embeddings = new EmbeddingPipeline(client);
const pipeline = new RAGPipeline(store, embeddings, chunker);

const docs: Document[] = [
  { id: 'doc-1', text: '...', metadata: { source: 'handbook' } },
  { id: 'doc-2', text: '...', metadata: { source: 'handbook' } },
];

const { inserted } = await pipeline.ingest(docs);
console.log(`Inserted ${inserted} vectors`);
```

3. Query and inject context into an agent
```typescript
import { Agent } from '@roostjs/ai';
import { RAGPipeline, SemanticChunker, EmbeddingPipeline } from '@roostjs/ai/rag';
import { AIClient, VectorStore } from '@roostjs/cloudflare';

export class DocsAgent extends Agent {
  private rag: RAGPipeline;

  constructor(env: Env) {
    super();
    const client = new AIClient(env.AI);
    const store = new VectorStore(env.VECTORIZE);
    this.rag = new RAGPipeline(store, new EmbeddingPipeline(client), new SemanticChunker());
  }

  instructions(): string {
    return 'You are a helpful assistant. Answer questions using the context provided.';
  }

  async answerWithContext(question: string): Promise<string> {
    const results = await this.rag.query(question);
    const context = results.map((r) => r.chunk.text).join('\n\n');
    const result = await this.prompt(
      `Context:\n${context}\n\nQuestion: ${question}`,
    );
    if (result.queued) throw new Error('unexpected queued result');
    return result.text;
  }
}
```

4. Test the RAG pipeline without a real index
```typescript
import { RAGPipeline } from '@roostjs/ai/rag';
import type { QueryResult } from '@roostjs/ai/rag';

const fakeResults: QueryResult[] = [
  { chunk: { id: 'doc-1:0', documentId: 'doc-1', text: 'Roost uses D1.', tokenCount: 5 }, score: 0.92 },
];

RAGPipeline.fake([fakeResults]);
// ... run your agent test ...
RAGPipeline.assertQueried((text) => text.includes('database'));
RAGPipeline.restore();
```

How to test agents without calling the AI provider
Use Agent.fake() to inject predetermined responses. Always call Agent.restore() after each test.
```typescript
import { describe, it, expect } from 'bun:test';
import { SupportAgent } from '../../src/agents/SupportAgent';

describe('SupportAgent', () => {
  it('responds to password reset questions', async () => {
    SupportAgent.fake(['To reset your password, visit /account/reset.']);
    const agent = new SupportAgent(fakeDb);
    const response = await agent.prompt('How do I reset my password?');
    expect(response.text).toContain('reset');
    SupportAgent.restore();
  });

  it('was prompted with the user input', async () => {
    SupportAgent.fake(['Order found.']);
    const agent = new SupportAgent(fakeDb);
    await agent.prompt('Check order 5678');
    SupportAgent.assertPrompted('5678');
    SupportAgent.restore();
  });
});
```
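The `fakeDb` referenced in these tests can be a plain object implementing whatever `Database` shape your tools call. This sketch assumes only the `findOrder` method used by the OrderStatusTool example:

```typescript
// Minimal in-memory stand-in for the Database dependency.
interface Order { status: string; updatedAt: string }

const orders = new Map<string, Order>([
  ['5678', { status: 'shipped', updatedAt: '2024-06-01' }],
]);

const fakeDb = {
  async findOrder(orderId: string): Promise<Order | undefined> {
    return orders.get(orderId);
  },
};
```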