@roostjs/cloudflare
Why Roost wraps Cloudflare's native bindings, how typed clients improve the developer experience, and how binding names are resolved.
The Raw Binding Problem
Cloudflare injects bindings into the Worker's env object. Accessing KV looks
like env.SESSION_KV.get(key), D1 like env.DB.prepare(sql).
This works, but it has friction. The binding names are strings that must match
wrangler.toml exactly. The types are broad Cloudflare platform types that do
not carry information about which database or namespace they represent. And when application
code reaches directly into env, it becomes harder to test — tests need to
fabricate an env object with the right shape.
@roostjs/cloudflare solves this by providing thin typed wrappers — KVStore,
D1Database, R2Bucket, Queue, AIClient —
and registering them in the service container under the binding's configured name. Application
code resolves the binding by name from the container rather than reading env
directly. The container is responsible for wrapping the raw binding object. Tests can register
fake implementations under the same name.
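The pattern can be sketched with a minimal container. The Container class and the SessionStore interface below are illustrative stand-ins, not Roost's actual API:

```typescript
// Minimal sketch: a name-keyed service container (illustrative, not Roost's API).
class Container {
  private services = new Map<string, unknown>();

  register(name: string, service: unknown): void {
    this.services.set(name, service);
  }

  resolve<T>(name: string): T {
    const svc = this.services.get(name);
    if (svc === undefined) throw new Error(`No binding registered as "${name}"`);
    return svc as T;
  }
}

// Application code depends on this narrow interface, not on env.
interface SessionStore {
  get(key: string): Promise<string | null>;
}

const container = new Container();

// In production the service provider registers a wrapper around
// env.SESSION_KV; in tests we register a fake under the same name.
const fakeSessions: SessionStore = { get: async () => "fake-session" };
container.register("SESSION_KV", fakeSessions);

const sessions = container.resolve<SessionStore>("SESSION_KV");
```

Because the application only ever asks the container for "SESSION_KV", swapping the real binding for a fake requires no changes to application code.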
Binding Name Resolution
The CloudflareServiceProvider reads binding configuration from the application's
config and registers each binding in the container. When auth session storage wants a KV store,
it resolves it from the container by the configured name — not by reaching into env.
This indirection lets binding names be configured in one place (the config or environment
variables), and lets the container act as the source of truth for what bindings the
application uses.
The naming convention follows Cloudflare's own: KV namespaces in SCREAMING_SNAKE_CASE, because that is how wrangler.toml defines them. Roost does not rename them; it reads the name from config and registers the wrapper under that exact name. This keeps the container names in sync with the Cloudflare configuration without transformation.
Why a Single AIClient.run()
Cloudflare Workers AI exposes a single method: ai.run(model, inputs). Every
Workers AI task — text generation, image classification, embeddings, speech-to-text — uses
this one method with different model strings and input shapes. Roost's AIClient
wraps this directly: client.run(model, inputs). There are no per-task methods,
no client.generateText() or client.embed(). This is intentional.
The alternative — a method per task — would require updating the client every time Workers AI
adds a new task type. The pass-through design stays current without changes, and the
@roostjs/ai package builds the higher-level agent abstraction on top of
AIClient.run(). The cloudflare package stays thin; the AI package adds
structure.
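The pass-through shape can be sketched as follows. The AiBinding type approximates the Workers AI binding surface and the class body is illustrative; only the single-method design follows the text:

```typescript
// Sketch of the pass-through design: one method for every task.
type AiBinding = {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
};

class AIClient {
  constructor(private readonly ai: AiBinding) {}

  // Text generation, embeddings, classification, and so on differ only
  // in the model string and the input shape — not in the method called.
  run(model: string, inputs: Record<string, unknown>): Promise<unknown> {
    return this.ai.run(model, inputs);
  }
}

// Usage (same method, different tasks):
//   await client.run("@cf/meta/llama-3.1-8b-instruct", { prompt: "Hello" });
//   await client.run("@cf/baai/bge-base-en-v1.5", { text: ["embed this"] });
```

A new Workers AI task type works immediately: nothing in the client needs to change.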
Binding Auto-Detection
CloudflareServiceProvider iterates over every key in the Worker env object and
attempts to identify the binding type by duck-typing — checking for the presence of
specific methods. This means you do not need to declare which bindings exist in your
Roost config; the provider discovers them at boot time and wraps each one in the
appropriate typed client.
The detection order matters because some binding shapes overlap. R2 buckets have all the
methods that KV namespaces have plus head, so R2 is checked before KV. Durable Object
namespaces have a get method, so they are distinguished from KV by also checking for
idFromName. Dispatch namespaces have get but not fetch on the namespace itself,
so they are checked before the catch-all Fetcher guard. The comments in provider.ts
document the reasoning for each guard's position in the chain.
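The guard chain can be sketched as a pure function. The method names match the real Cloudflare binding surfaces, but this function is an illustration of the ordering argument, not provider.ts itself:

```typescript
// Illustrative duck-typed detection; guard order mirrors the reasoning above.
type BindingKind = "r2" | "durable-object" | "kv" | "dispatch" | "service" | "unknown";

function detectBinding(b: any): BindingKind {
  const has = (m: string) => typeof b?.[m] === "function";
  // R2 has everything KV has plus head — so check R2 first.
  if (has("get") && has("put") && has("head")) return "r2";
  // Durable Object namespaces expose get(id) plus idFromName.
  if (has("get") && has("idFromName")) return "durable-object";
  // Plain KV: get/put/list, no head, no idFromName.
  if (has("get") && has("put") && has("list")) return "kv";
  // Dispatch namespaces have get(scriptName) but no fetch of their own.
  if (has("get") && !has("fetch")) return "dispatch";
  // Catch-all Fetcher guard: anything left with fetch is a service binding.
  if (has("fetch")) return "service";
  return "unknown";
}
```

Reordering any two guards misclassifies the overlapping shape, which is why the order is documented rather than incidental.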
The practical consequence is that renaming a binding in wrangler.jsonc automatically
renames it in the container — no Roost config change needed. The wrapper is registered
under the binding's key, which is whatever SCREAMING_SNAKE_CASE name the wrangler configuration assigns.
Worker-to-Worker Service Architecture
Cloudflare service bindings let one Worker call another over an internal network without
going through the public internet. The callee Worker runs in the same Cloudflare data
center as the caller, so latency is negligible. The ServiceClient wrapper provides
HTTP convenience methods (get, post, put, patch, delete) that handle
JSON serialization and the internal URL scheme (http://service{path}).
The call() method layers a lightweight RPC convention on top: it POSTs to
/rpc/{method} with { args } as the body and returns the parsed JSON response. This
is useful when the remote Worker is designed as an internal service rather than a
user-facing API. Whether to use the HTTP methods or call() is a choice about how the
called Worker is designed — there is no framework-level enforcement.
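The RPC convention can be sketched on top of the binding's fetch. The ServiceClient name matches the text, but the constructor and internals here are assumptions for illustration:

```typescript
// Sketch of the call() RPC convention over a service binding's fetch.
type Fetcher = { fetch(input: string, init?: RequestInit): Promise<Response> };

class ServiceClient {
  constructor(private readonly binding: Fetcher) {}

  // POST to /rpc/{method} with { args } as the JSON body, per the
  // convention described above, and return the parsed JSON response.
  async call<T>(method: string, ...args: unknown[]): Promise<T> {
    const res = await this.binding.fetch(`http://service/rpc/${method}`, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ args }),
    });
    return res.json() as Promise<T>;
  }
}
```

The http://service hostname is a placeholder: service bindings ignore the host and route over Cloudflare's internal network, so only the path matters.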
Multi-Tenant Compute with Workers for Platforms
Workers for Platforms is Cloudflare's model for running tenant-owned code at scale. A platform operator deploys a dispatch namespace; tenants upload their own Worker scripts into that namespace; the operator's Worker dispatches inbound requests to the appropriate tenant script by name.
DispatchNamespaceClient.dispatchClient() returns a ServiceClient for a named script,
which means all the same HTTP and RPC patterns apply to tenant scripts as to first-party
service bindings. The trust option controls whether the dispatched script receives the
raw request headers or a sanitized version — use 'trusted' only for scripts you own.
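The dispatch flow can be sketched as follows. The DispatchNamespace type mirrors the binding's get(scriptName) surface; the client class and the trust handling are simplified illustrations of the behavior described, not the real implementation:

```typescript
// Sketch: resolving a tenant script and applying the trust option.
type Fetcher = { fetch(input: string, init?: RequestInit): Promise<Response> };
type DispatchNamespace = { get(scriptName: string): Fetcher };

class DispatchNamespaceClient {
  constructor(private readonly ns: DispatchNamespace) {}

  // Resolve the named tenant script. Roost wraps the result in the same
  // ServiceClient used for first-party bindings; here we return a Fetcher.
  dispatchClient(
    scriptName: string,
    opts: { trust?: "trusted" | "untrusted" } = {},
  ): Fetcher {
    const target = this.ns.get(scriptName);
    if (opts.trust === "trusted") return target; // raw headers pass through
    // Untrusted default: strip inbound headers before forwarding (simplified).
    return {
      fetch: (input, init) => target.fetch(input, { ...init, headers: {} }),
    };
  }
}
```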
Edge HTML Transformation
HTMLRewriter is a Cloudflare-specific streaming HTML parser. It processes the response
body as a byte stream and applies registered element handlers without buffering the full
document. This is important for large pages: the full document is never held in memory, and bytes stream through as handlers finish with them, so memory overhead stays flat and time-to-first-byte does not grow with document size.
HtmlTransformer wraps HTMLRewriter with a chainable interface. Each method
(injectScript, setMetaTag, replaceElement, removeElement, abTest) registers
one or more element handlers. Calling transform() passes the accumulated rewriter over
the response stream.
The A/B test support is a deliberate design choice: the assignment function runs inside
the element handler, which executes during streaming. The request object must be
threaded into the handler state before streaming begins — that is why transform()
accepts the request as a second argument rather than reading it from a closure at
construction time.
Rate Limiting at the Edge
Rate limiting in a distributed system is a coordination problem: each Worker isolate is independent, so a per-isolate counter would be meaningless. Roost provides two backends that solve this with different consistency trade-offs.
KVRateLimiter uses a fixed-window counter stored in KV. KV is eventually consistent —
reads may not immediately reflect the latest write. Under heavy concurrent traffic from
the same key, multiple isolates may read the same count and all increment it, allowing
brief bursts above the limit. For most applications this is acceptable; the over-run is
bounded to the replication lag of the KV write.
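A fixed-window counter over KV can be sketched with a minimal KV-like interface, which makes the caveat visible: get and put are separate operations with no compare-and-swap, so the read-then-write sequence is where concurrent isolates can over-count. The class name matches the text, but the implementation is illustrative:

```typescript
// Sketch of a fixed-window rate limiter over an eventually consistent KV.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

class KVRateLimiter {
  constructor(
    private readonly kv: KVLike,
    private readonly limit: number,
    private readonly windowSecs: number,
  ) {}

  async check(key: string, now = Date.now()): Promise<boolean> {
    // Bucket key changes each window; expired buckets reset the count.
    const window = Math.floor(now / 1000 / this.windowSecs);
    const bucket = `rl:${key}:${window}`;
    const count = Number((await this.kv.get(bucket)) ?? "0");
    if (count >= this.limit) return false; // over the limit this window
    // Read-then-write with no CAS: concurrent isolates may both see the
    // same count — this is exactly the bounded over-run described above.
    await this.kv.put(bucket, String(count + 1), { expirationTtl: this.windowSecs * 2 });
    return true;
  }
}
```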
DORateLimiter routes each check to a named Durable Object instance. Durable Objects
serialize requests to a single instance, so the counter is always accurate. The trade-off
is latency: every rate limit check requires a round-trip to the DO, which may be in a
different data center if the request originates far from the DO's home location.
RateLimiterFake exists for the same reason as FakeLogger: rate limiters are a
side-effecting external dependency. Tests that use fakeRateLimiter() get deterministic
behavior without any KV or DO infrastructure. The fake integrates at the point where
KVRateLimiter and DORateLimiter check for an active fake, so no test-specific code
path is needed in the middleware itself.
Content-Addressed Caching with VersionedKVStore
KV's consistency model — eventual consistency with no atomic compare-and-swap — makes
cache invalidation tricky. The conventional approach, writing the new value and then
deleting the old entry, has a race window: a reader that lands between the write and the
delete may see the old value or nothing at all. VersionedKVStore sidesteps this by separating content
storage from pointer storage.
Content is written under a SHA-256 hash of its serialized form: the same content always produces the same key, and the key is immutable once written. A separate pointer key holds the current hash. Readers follow the pointer; writers update the pointer atomically after writing the content. A reader that follows a stale pointer still gets valid data — it just gets the previous version rather than the latest.
The content TTL controls how long orphaned content entries survive after the pointer moves away. Setting it too low means a pointer update and the subsequent read could race with content expiry on the old value. The default of 24 hours is conservative for data that changes at most once per day.