Architecture
01 — Architecture
FlareDispatch deploys as a single Worker — the Dispatcher — into the user’s own Cloudflare account. Around it sit three tiers: a control plane that routes triggers and orchestrates durable execution, a data plane where test code actually runs, and a storage tier for cache, artifacts, logs, and metadata.
This spec describes those pieces and the design model that ties them together — components, responsibilities, data flow, and the platform constraints that shape the design. It stays at the conceptual level: concrete binding names, wrangler config, and the D1 DDL are deployment artifacts and live in 05-byoc; the roadmap and V0 build order are in pm/plan.
Components
flowchart TB
subgraph GH[GitHub]
GHA[Actions workflow]
APP[GitHub App<br/>installation]
CHK[Check Runs API]
end
subgraph CF[Cloudflare account — self-hosted]
subgraph CP[Control plane]
DSP[Dispatcher Worker<br/>authenticate, route, dedup]
WF[Workflow Engine<br/>one durable instance per execution]
CO[Coordinator DO<br/>fan-out result aggregation]
end
subgraph DP[Data plane]
SB[Sandbox / Container<br/>git, install, test]
BR[Browser Rendering<br/>Playwright sessions]
end
subgraph ST[Storage]
R2[(R2 — cache, artifacts, logs)]
D1[(D1 — executions, steps)]
KV[(KV — config, idempotency,<br/>install-token cache)]
Q[[Queue — fan-out backpressure]]
end
end
GHA -->|HMAC POST| DSP
APP -->|App-signed webhook| DSP
Admin[Operator<br/>via CF Access] -->|admin POST| DSP
DSP --> WF
WF -->|aggregate shard results| CO
WF --> SB
WF --> BR
WF --> R2
WF --> D1
WF -.->|only at very high shard counts| Q
Q -.-> WF
WF -->|installation token| CHK
APP -.->|provides token| WF
Control plane
Dispatcher Worker
The single public entry point. Its responsibilities are deliberately narrow — authenticate, route, deduplicate, instantiate a Workflow — and nothing else. No business logic and no long-running calls run on this path, so each invocation stays well within the Worker CPU budget. LLM calls, Octokit fetches, and container starts all happen later, inside the Workflow.
It exposes four kinds of endpoint:
| Surface | Responsibility |
|---|---|
| Dispatch | Start an execution from an HMAC-signed POST (Action mode). |
| Webhook | Start an execution from a FlareDispatch GitHub App webhook (Webhook mode). |
| Inspection | Return execution metadata; redirect to signed artifact / log URLs. |
| Admin | Operator surface — execution list, force-cancel, replay, signalling a paused Workflow. Gated by Cloudflare Access. |
Trigger modes, the request/response contracts, and the literal route paths are in 04-gha-integration.
Workflow Engine
Each execution is one durable Cloudflare Workflow instance. The Workflow body is an Effect program (see 03-dsl) composed of steps; each step is a checkpoint: its result is persisted once it completes, it survives Worker restarts, and the platform retries it on failure. An evicted Worker resumes from the last completed step rather than restarting the execution.
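The resume-from-last-completed-step behaviour can be illustrated with a small in-memory model. This is only a sketch under assumptions: `CheckpointStore`, `runStep`, and `execute` are hypothetical names, not the Cloudflare Workflows API or the FlareDispatch DSL, and a real Workflow persists step results durably rather than in a `Map`.

```typescript
// Hypothetical in-memory model of step checkpointing; not the Workflows API.
class CheckpointStore {
  private results = new Map<string, unknown>();

  // Run `body` at most once per step name; later calls return the stored result.
  runStep<T>(name: string, body: () => T): T {
    if (this.results.has(name)) {
      return this.results.get(name) as T;
    }
    const value = body(); // a throw here leaves no checkpoint behind
    this.results.set(name, value);
    return value;
  }
}

// A toy three-step execution. `calls` records which step bodies actually ran.
function execute(store: CheckpointStore, calls: string[], failAt?: string): string {
  const sha = store.runStep("clone", () => {
    calls.push("clone");
    return "abc123";
  });
  const deps = store.runStep("install", () => {
    calls.push("install");
    if (failAt === "install") throw new Error("worker evicted");
    return 42;
  });
  return store.runStep("test", () => {
    calls.push("test");
    return `${sha}:${deps}:passed`;
  });
}
```

Running `execute` once with `failAt` set to `"install"` and then again without it re-runs only the install and test bodies; the clone result comes from the stored checkpoint.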
Coordinator
A Durable Object used only by fan-out runs to aggregate child-shard results. A matrix run spawns N child Workflows; each reports its result into a Coordinator keyed by the parent execution. Single-writer semantics let shard-completion handlers race without conflict. Once every shard has reported, the Coordinator triggers check-run finalization. Spawning the children does not itself need the Coordinator — see § Fan-out model.
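The aggregation logic described above can be sketched as a plain class. This is an illustration only: `ShardAggregator` and its method names are hypothetical, the state lives in memory instead of Durable Object storage, and the single-writer guarantee is assumed to come from the platform rather than from this code.

```typescript
type ShardConclusion = "success" | "failure";

// Hypothetical aggregation state for one parent execution.
class ShardAggregator {
  private expectedShards: number;
  private reported = new Map<string, ShardConclusion>();

  constructor(expectedShards: number) {
    this.expectedShards = expectedShards;
  }

  // Record one shard result. Returns true exactly once: on the report that
  // completes the set, which is the point where finalization would be triggered.
  report(shardId: string, conclusion: ShardConclusion): boolean {
    if (this.reported.has(shardId)) return false; // duplicate report: ignore
    this.reported.set(shardId, conclusion);
    return this.reported.size === this.expectedShards;
  }

  // Ids of the shards that reported a failure so far.
  failedShards(): string[] {
    return Array.from(this.reported.entries())
      .filter(([, conclusion]) => conclusion === "failure")
      .map(([id]) => id);
  }
}
```

Ignoring duplicate reports keeps the completion signal from firing twice if a child Workflow retries its reporting step.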
Data plane
Sandbox / Container
Every step that runs arbitrary code — git clone, pnpm install, pytest, cargo test, bash scripts — acquires a container from a pool. Containers come from versioned, per-language-stack images (Node, Playwright, Rust, Python); a run may override the image via its image: input. Base images are kept thin: run-level installs happen at runtime and are cached to R2. The concrete image registry and tags are a deployment concern — see 05-byoc.
Browser Rendering
Browser-centric runs (playwright-e2e, cdp-acceptance) use Cloudflare Browser Rendering — a managed Chromium with no container overhead. Two access modes:
- REST mode — Puppeteer against the managed pool; fast for short, stateless page interactions.
- CDP mode — direct CDP WebSocket attach for fine-grained instrumentation (request interception, heap snapshots, network events).
A run picks whichever mode its assertions need.
Storage
Four stores, each with a distinct role:
| Store | Holds | Notes |
|---|---|---|
| R2 | Package cache, artifacts, per-step logs | No egress fees. Cache keys are derived from the lockfile hash + image digest, so a cache built under one lockfile or image is never restored into another. |
| D1 | Execution and step metadata | Metadata and pointers only — logs and artifacts live in R2. |
| KV | Config, receiver-level idempotency keys, App install-token cache | Three namespaces, kept separate so an audit shows config never co-mingles with idempotency state. |
| Queue | Fan-out backpressure | Engaged only when shard creation would exceed the platform’s instance-creation rate. |
R2 layout
cache/<repo>/<key>          immutable; key derived from lockfile hash + image digest
artifacts/<execution-id>/   per-execution; retention via R2 lifecycle policy
logs/<execution-id>/        per-step structured logs
Cache entries carry no TTL of their own; an R2 lifecycle policy on the deployment controls eviction.
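A cache object key following this layout might be built as shown here. The function name, the `-` separator, and the stripping of the `sha256:` prefix are assumptions made for illustration; the document only states that the key is derived from the lockfile hash and the image digest.

```typescript
// Hypothetical key builder for the cache/<repo>/<key> layout.
// Both inputs are assumed to be hex digests computed elsewhere.
function cacheObjectKey(repo: string, lockfileHash: string, imageDigest: string): string {
  const digest = imageDigest.replace(/^sha256:/, "");
  // Either a lockfile change or an image change produces a different key,
  // so an existing entry is never overwritten or reused across environments.
  return `cache/${repo}/${lockfileHash}-${digest}`;
}
```

For example, `cacheObjectKey("acme/web", "9f2c", "sha256:77aa")` yields `cache/acme/web/9f2c-77aa`.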
Data model
D1 holds two entities:
- Execution — one row per run invocation: which run, the repo / ref / sha, status, start and end timestamps, the input payload, a result summary, the GitHub check-run id, and an optional parent execution (for matrix children).
- Step — one row per step within an execution: name, status, timing, exit code, a pointer to the step’s R2 log, and an attempt counter. A Step belongs to exactly one Execution.
The literal CREATE TABLE schema is a deployment artifact — see 05-byoc § D1 schema.
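As a reading aid, the two entities could be modelled with types like these. All field names and the status values are illustrative assumptions; the authoritative shape is the D1 schema in 05-byoc.

```typescript
// Hypothetical status values; the real set is defined by the schema.
type RunStatus = "queued" | "running" | "succeeded" | "failed" | "cancelled";

interface Execution {
  id: string;
  run: string;                      // which run, e.g. "playwright-e2e"
  repo: string;
  ref: string;
  sha: string;
  status: RunStatus;
  startedAt: string;                // ISO 8601 timestamp
  endedAt: string | null;
  input: unknown;                   // the input payload
  resultSummary: string | null;
  checkRunId: number | null;        // GitHub check-run id
  parentExecutionId: string | null; // set only for matrix children
}

interface Step {
  executionId: string;              // a Step belongs to exactly one Execution
  name: string;
  status: RunStatus;
  startedAt: string;
  endedAt: string | null;
  exitCode: number | null;
  logKey: string;                   // pointer to the step's R2 log object
  attempt: number;
}

// Matrix children are the executions that name a parent.
function isMatrixChild(e: Execution): boolean {
  return e.parentExecutionId !== null;
}
```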
Per-execution lifecycle
Once the Dispatcher accepts a request (Action or Webhook mode — both in 04-gha-integration), the Workflow runs to completion in the background:
sequenceDiagram
autonumber
participant DSP as Dispatcher
participant WF as Workflow
participant SB as Sandbox
participant R2 as R2
participant GH as GitHub Checks
DSP->>WF: Instantiate Workflow (execution id + params)
WF->>GH: Open check-run (status=in_progress)
WF->>SB: Acquire container, git clone
SB-->>WF: ok
WF->>R2: Restore package cache
R2-->>SB: cache archive
WF->>SB: install + test command
SB-->>WF: exit code, results
WF->>R2: Save cache, upload artifacts
WF->>GH: Finalize check-run (conclusion + summary)
Because every step is a checkpoint, an evicted Worker resumes mid-execution rather than restarting. The check run is the source of truth for whether the work passed — required status checks on the PR reference the check-run name, not whatever trigger fired the execution.
Fan-out model
Matrix runs use a parent-child Workflow tree. The platform’s batch-create primitive spawns up to 100 child Workflow instances in one bound call — idempotent on a user-supplied id — invoked directly from a parent step. No intermediate Queue or spawner is needed for the common case.
flowchart LR
P[Parent Workflow<br/>run = playwright-e2e<br/>step = spawn-shards] -->|spawn N children| C1[Child Workflow shard 1]
P -->|spawn N children| C2[Child Workflow shard 2]
P -->|spawn N children| Cn[Child Workflow shard N]
C1 -->|report| CO[Coordinator DO]
C2 -->|report| CO
Cn -->|report| CO
CO -->|all done| P
P --> CHK[Check Run finalize]
The parent decides shard count from inputs or by auto-detecting test files. Each shard is a fresh child Workflow with its own check-run sub-check, annotated under the parent’s summary. The parent does not block on children: batch-create returns immediately with instance handles, and the parent either subscribes via the Coordinator or polls handle status in a follow-up step.
Only when shard counts exceed the platform’s per-workflow instance-creation rate does the parent pace creation through the Queue instead of batching directly.
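The spawn step's planning can be sketched as a pure function that splits N shards into batches of at most 100 and gives each child a deterministic id, so that a retried step re-submits the same ids. The id format and the `ChildSpec` shape are assumptions for illustration; each batch would then be handed to the platform's batch-create call.

```typescript
const MAX_BATCH = 100; // per-call limit stated in the text above

// Hypothetical description of one child Workflow to create.
interface ChildSpec {
  id: string;
  params: { shard: number; total: number };
}

// Split `total` shards into batches of at most MAX_BATCH child specs.
// Ids depend only on the parent id and shard number, so planning is repeatable.
function planShardBatches(parentId: string, total: number): ChildSpec[][] {
  const batches: ChildSpec[][] = [];
  for (let start = 0; start < total; start += MAX_BATCH) {
    const end = Math.min(start + MAX_BATCH, total);
    const batch: ChildSpec[] = [];
    for (let i = start; i < end; i++) {
      const shard = i + 1; // shards are numbered from 1
      batch.push({ id: `${parentId}-shard-${shard}`, params: { shard, total } });
    }
    batches.push(batch);
  }
  return batches;
}
```

A 250-shard matrix plans as three batches of 100, 100, and 50 children.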
If a child shard fails:
- The shard’s check-run conclusion is failure.
- The parent’s overall conclusion becomes failure once any shard fails — or once all complete, depending on failureBehavior.
- Each shard’s logs and reports are independent R2 paths, linked from the summary.
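The conclusion rule in the list above can be written as a small function. The two `failureBehavior` values used here, `"fail-fast"` and `"wait-for-all"`, are hypothetical names: the text says only that the parent fails either as soon as one shard fails or once all shards have completed.

```typescript
type ShardResult = "success" | "failure";
type FailureBehavior = "fail-fast" | "wait-for-all"; // hypothetical values

// Decide the parent's conclusion from the shard results reported so far.
function parentConclusion(
  reported: ShardResult[],
  totalShards: number,
  behavior: FailureBehavior,
): ShardResult | "pending" {
  const anyFailed = reported.some((r) => r === "failure");
  const allDone = reported.length === totalShards;
  if (anyFailed && (behavior === "fail-fast" || allDone)) return "failure";
  if (allDone) return "success";
  return "pending"; // keep waiting for the remaining shards
}
```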
Durability and dedup
Two design disciplines keep executions correct under retries and redeliveries:
- Durability — every step is a Workflow checkpoint. Non-determinism (time, UUIDs, env reads) flows through the io.* DSL primitives (see 03-dsl) so checkpoint replay stays consistent.
- Two-layer dedup — a receiver-level idempotency key collapses redelivery storms before any Workflow is touched; a Workflow-level semantic instance id collapses two distinct deliveries naming the same logical work (same repo + head SHA) onto one execution. Full discipline in 04-gha-integration § Receiver dedup.
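The two dedup layers can be modelled in memory as follows. This is a sketch under assumptions: `DedupReceiver`, the outcome labels, and the `repo@sha` id format are hypothetical, and the two `Set`s stand in for the KV idempotency namespace and the set of existing Workflow instance ids.

```typescript
type Outcome = "started" | "joined-existing" | "duplicate-delivery";

// Layer 2 key: the text names repo + head SHA as the identity of the logical work.
function semanticInstanceId(repo: string, headSha: string): string {
  return `${repo}@${headSha}`;
}

class DedupReceiver {
  private seenDeliveries = new Set<string>(); // stand-in for the idempotency store
  private instances = new Set<string>();      // stand-in for existing instance ids

  accept(deliveryId: string, repo: string, headSha: string): Outcome {
    // Layer 1: the same delivery arriving again stops before any Workflow is touched.
    if (this.seenDeliveries.has(deliveryId)) return "duplicate-delivery";
    this.seenDeliveries.add(deliveryId);

    // Layer 2: distinct deliveries naming the same logical work share one instance.
    const instanceId = semanticInstanceId(repo, headSha);
    if (this.instances.has(instanceId)) return "joined-existing";
    this.instances.add(instanceId);
    return "started";
  }
}
```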
Long-running test handling
Workflow steps have unlimited wall-clock duration — a step body can await I/O for as long as needed — but each step is bounded by Worker CPU time (30 s by default, configurable up to 5 min; see § Platform limits). Container exec, where the test command actually runs, counts as I/O against the parent Worker, so a 25-minute test is fine as long as the step body is mostly awaiting the container. Runs still structure long work with one of two patterns:
- Chunked execution — the run splits work into multiple steps (e.g. per test file or per Playwright project). Each step is independently checkpointed; a failure mid-suite restarts only the failed step. This is about retry granularity, not duration caps.
- Detached container — for a genuinely indivisible long execution (e.g. one integration test that takes 25 minutes), the run starts a container in detached mode, returns immediately from the Worker step, and polls the container’s exit status from later steps. The DSL exposes this as sandbox.runDetached / sandbox.waitForExit (see 03-dsl).
Both patterns are checkpointed by Workflows, so the Worker process can be evicted mid-execution and resume cleanly.
Platform limits — design constraints
The architecture is shaped by Cloudflare platform limits. The ones that matter, and how the design accommodates each:
| Limit | Documented value (Workers Paid, 2026-05) | How the design accommodates it |
|---|---|---|
| Worker CPU per request | 30 s default, configurable to 5 min | Workflow steps are I/O-bound: spawn container, await exit, store result. Heavy CPU lives in Sandbox containers. |
| Workflow step CPU time | Same as Worker CPU; wall-clock per step is unlimited | Chunked execution for retry granularity; detached containers for long indivisible executions. |
| Workflow steps per instance | 10,000 default, configurable to 25,000 | Parent workflows for >25k-shard matrices use child-of-child nesting; sleeps don’t count against the quota. |
| Workflow concurrent instances | 50,000 per account; creation rate 100/s per workflow, 300/s per account | The fan-out Queue paces creation only when shard count × dispatch rate exceeds the per-workflow rate. |
| Workflow step result size | 1 MiB per non-stream step result; larger payloads stream | Logs / artifacts go to R2; steps return pointers, not blobs. |
| Browser Rendering: session duration | No fixed max while active; 60 s idle timeout (extendable) | Runs rotate sessions per test file rather than holding one open for a whole suite. |
| Browser Rendering: concurrent sessions | 120 per account (higher on request) | Shard cap derived from this number minus headroom for other runs. |
| Browser Rendering: free included | 10 browser-hours/month; 10 concurrent browsers | Runs prefer managed Browser Rendering for short tests; in-container Playwright for sessions that would blow the free tier. |
| Container concurrency per account | 1,500 vCPU, 6 TiB memory, 30 TB disk aggregate | Run metadata declares maxConcurrency; the Dispatcher rejects with 429 + Retry-After when account headroom is gone. |
| Container registries | Cloudflare’s registry, Docker Hub, Amazon ECR — GHCR is not a supported pull source | Base images are mirrored to Cloudflare’s registry at release time; see 05-byoc. |
| D1 write rate | Sequential per-database; bounded queries per Worker invocation | Hot-path writes batched per step; checkpoints write once per step transition, not per log line. |
| D1 database size | 10 GB per database; 1 TB account storage | Logs and artifacts live in R2; D1 stores only execution metadata and pointers. |
| R2 lifecycle | Per-prefix expiration rules | Cache / artifact / log retention set by lifecycle policy; see 05-byoc § Retention. |
| Queues | 5,000 msg/s per queue; batched sends | Fan-out shards published in batched sends when the Queue path is taken at all. |
| GitHub API rate limit | 5,000 req/h per installation token | Check-run updates throttled to ~1/sec per execution via the Coordinator. |
Source: Workflows limits, Browser Rendering limits, Containers limits, D1 limits, Queues limits. Values current as of 2026-05.
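The admission rule from the container-concurrency row can be sketched as a pure function. This simplifies the real check to a single counter against a declared `maxConcurrency`; the function name, the `202` accept status, and the way the retry delay is chosen are assumptions, while the `429` status and the `Retry-After` header come from the table.

```typescript
interface Admission {
  status: 202 | 429;
  headers: Record<string, string>;
}

// Accept while there is headroom; otherwise tell the caller when to retry.
function admit(
  activeContainers: number,
  maxConcurrency: number,
  retryAfterSeconds: number,
): Admission {
  if (activeContainers < maxConcurrency) {
    return { status: 202, headers: {} };
  }
  return { status: 429, headers: { "Retry-After": String(retryAfterSeconds) } };
}
```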
Observability
- Logs — every step writes structured NDJSON to R2, one object per step. Streamable via Logpush if the user configures it.
- Metrics — Workflow built-in metrics (step duration, retry count) export to Workers Analytics Engine.
- Traces — each step is an OpenTelemetry span; the execution is the root span. Traces export to whatever OTel collector the user configures.
- Execution inspection — the PR’s Checks tab shows status; the check-run detail page links to logs, artifacts, and the trace.
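The per-step log object mentioned in the list above could be produced roughly like this. The record fields, the `.ndjson` suffix, and the use of the step name as the object name are assumptions; the document specifies only structured NDJSON, one object per step, under `logs/<execution-id>/`.

```typescript
// Hypothetical shape of one structured log record.
interface LogRecord {
  ts: string; // ISO 8601 timestamp
  level: "info" | "warn" | "error";
  msg: string;
}

// NDJSON: one JSON document per line, each line newline-terminated.
function toNdjson(records: LogRecord[]): string {
  return records.map((r) => JSON.stringify(r) + "\n").join("");
}

// Object key for a step's log under the logs/<execution-id>/ prefix.
function stepLogKey(executionId: string, stepName: string): string {
  return `logs/${executionId}/${stepName}.ndjson`;
}
```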
There is intentionally no custom web UI in v0–v2 — the GitHub check-run page is the operator surface.