
ADR-0007 — RDS Postgres as the AWS catalog; EFS-mounted SQLite rejected

The AWS topology needs a catalog backend with the same single-writer-per-pipeline invariant the Cloudflare topology gets from Durable Object SQLite (ADR-0006). Three candidates:

  • EFS-mounted SQLite — share a SQLite file across Fargate tasks via EFS. Cheapest, mirrors the local-development shape.
  • RDS Postgres — managed relational DB with proper concurrency primitives.
  • DynamoDB — pay-per-request KV with conditional writes.

Symmetry with DO SQLite would suggest the EFS-SQLite path. Closer inspection shows it is unsafe.

RDS Postgres is the default AWS catalog. EFS-mounted SQLite is not a shipped option. DynamoDB lands in M2.5 as an additional adapter.

Each backend implements the Catalog trait, the catalog port defined in hakiri-core (introduced in M0), so swapping backends is a configuration choice, not a code change.
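To make the "configuration choice, not a code change" point concrete, here is a minimal sketch of what a port-plus-adapters arrangement could look like. Everything in it is an assumption for illustration: the method names on Catalog, the CatalogError type, the InMemoryCatalog stand-in, the catalog_from_config function, and the backend strings are hypothetical and are not taken from hakiri-core.

```rust
use std::collections::HashMap;

/// Hypothetical error type; the real port may report failures differently.
#[derive(Debug, PartialEq)]
pub enum CatalogError {
    /// The cursor may only move forward.
    NonMonotonic { current: u64, requested: u64 },
}

/// Hypothetical shape of the catalog port. The actual trait in hakiri-core
/// is not shown in this ADR, so these methods are assumptions.
pub trait Catalog {
    fn backend_name(&self) -> &'static str;
    fn read_cursor(&self, pipeline: &str) -> Option<u64>;
    fn advance_cursor(&mut self, pipeline: &str, to: u64) -> Result<(), CatalogError>;
}

/// In-memory stand-in used here in place of real Postgres and DynamoDB adapters.
pub struct InMemoryCatalog {
    name: &'static str,
    cursors: HashMap<String, u64>,
}

impl InMemoryCatalog {
    fn new(name: &'static str) -> Self {
        Self { name, cursors: HashMap::new() }
    }
}

impl Catalog for InMemoryCatalog {
    fn backend_name(&self) -> &'static str {
        self.name
    }

    fn read_cursor(&self, pipeline: &str) -> Option<u64> {
        self.cursors.get(pipeline).copied()
    }

    fn advance_cursor(&mut self, pipeline: &str, to: u64) -> Result<(), CatalogError> {
        // A pipeline with no stored cursor is treated as starting at 0.
        let current = self.cursors.get(pipeline).copied().unwrap_or(0);
        if to < current {
            return Err(CatalogError::NonMonotonic { current, requested: to });
        }
        self.cursors.insert(pipeline.to_string(), to);
        Ok(())
    }
}

/// Selects a backend from a configuration string (hypothetical keys).
/// Callers only ever see `Box<dyn Catalog>`, so their code does not change.
pub fn catalog_from_config(backend: &str) -> Result<Box<dyn Catalog>, String> {
    match backend {
        "postgres" => Ok(Box::new(InMemoryCatalog::new("postgres"))),
        "dynamodb" => Ok(Box::new(InMemoryCatalog::new("dynamodb"))),
        // Mirrors the decision above: EFS-mounted SQLite is not offered at all.
        "efs-sqlite" => Err("efs-sqlite is not a shipped catalog backend".to_string()),
        other => Err(format!("unknown catalog backend: {}", other)),
    }
}
```

Pipeline code written against Box<dyn Catalog> stays the same when the configured backend string changes; only the adapter behind the trait differs.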

Positive

  • RDS Postgres has battle-tested concurrency control: row-level locks, SELECT FOR UPDATE, advisory locks. The lease semantics from ADR-0005 map cleanly onto Postgres primitives.
  • Single shared catalog across all Fargate tasks in a deployment — no per-task local state to keep in sync.
  • Familiar operationally: snapshots, point-in-time recovery, IAM auth, parameter groups, all standard.
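As an illustration of how a lease might map onto Postgres primitives, the sketch below pairs one possible SQL statement with an in-memory model of the same rule. ADR-0005 is not reproduced here, so the rule itself (a holder may take the lease if it is free, expired, or already theirs) is an assumption, as are the pipeline_lease table, its columns, and every Rust name in the block.

```rust
use std::collections::HashMap;

/// Hypothetical statement: a single atomic upsert that succeeds only when the
/// lease is absent, expired, or already held by the caller. It returns a row
/// only when the lease was taken or renewed. Table and column names are assumptions.
pub const ACQUIRE_LEASE_SQL: &str = "\
INSERT INTO pipeline_lease (pipeline, holder, expires_at)
VALUES ($1, $2, now() + $3 * interval '1 second')
ON CONFLICT (pipeline) DO UPDATE
   SET holder = EXCLUDED.holder, expires_at = EXCLUDED.expires_at
 WHERE pipeline_lease.holder = EXCLUDED.holder
    OR pipeline_lease.expires_at < now()
RETURNING holder";

#[derive(Debug, Clone, PartialEq)]
pub struct Lease {
    pub holder: String,
    /// Expiry as seconds on an arbitrary clock supplied by the caller.
    pub expires_at: u64,
}

/// In-memory model of the lease table, one row per pipeline.
#[derive(Default)]
pub struct LeaseTable {
    rows: HashMap<String, Lease>,
}

impl LeaseTable {
    /// Applies the same predicate as the SQL above and reports whether the
    /// caller now holds the lease.
    pub fn try_acquire(&mut self, pipeline: &str, holder: &str, now: u64, ttl: u64) -> bool {
        let free_or_ours = match self.rows.get(pipeline) {
            None => true,
            Some(lease) => lease.holder == holder || lease.expires_at < now,
        };
        if free_or_ours {
            self.rows.insert(
                pipeline.to_string(),
                Lease { holder: holder.to_string(), expires_at: now + ttl },
            );
        }
        free_or_ours
    }

    pub fn current(&self, pipeline: &str) -> Option<&Lease> {
        self.rows.get(pipeline)
    }
}
```

In the SQL form the database evaluates the predicate and the write as one statement, so two tasks racing for the same pipeline cannot both get a row back. That is the property the single-writer-per-pipeline invariant depends on.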

Negative

  • RDS adds a fixed monthly cost even for low-volume deployments. The Lambda-only sub-shape (sub-shape a in 06-deployment.md) can substitute DynamoDB to keep costs proportional to use.
  • Requires a SQL port of the catalog DDL from the local SQLite shape. The port is mechanical (small dialect differences) and lives in the hakiri-context crate as the Postgres adapter.
  • Adds a VPC consideration — Fargate tasks need a route to RDS, either same-VPC or via PrivateLink.

Neutral

  • The catalog backend choice is per-deployment. A team starting on DynamoDB can migrate to RDS later via hakiri catalog migrate.

EFS-mounted SQLite (rejected). Three specific problems make this unsafe, in increasing order of severity:

  1. NFSv4 byte-range locks are advisory. SQLite relies on every process honoring them, and on NFS that cooperation is best-effort with documented race windows. Two ECS tasks can both believe they hold the write lock.
  2. SQLite WAL mode does not work on networked filesystems. WAL coordinates readers and writers through a shared-memory index, so SQLite’s documentation requires every process using the database to run on the same host. Falling back to rollback-journal mode gives up WAL’s concurrent reads during a write and still depends on the NFS locks from item 1.
  3. ECS task replacement during a deploy creates a window where the old task is still flushing and the new task starts taking writes. SQLite’s locking model has no answer for this.

The combination is fatal for catalog semantics — a single missed cursor advance can cause silent data duplication or loss. RDS Postgres avoids all three failure modes by construction.

DynamoDB as default. Excellent pay-per-request economics and AWS-native, but its consistency model is different (no SELECT FOR UPDATE; coordination goes through conditional writes), so the adapter needs more catalog code than the Postgres one. Ships in M2.5 as an adapter for teams already on DynamoDB or wanting strictly pay-per-request costs; not the default because the Postgres path is the simpler mental model for new deployments.
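The extra catalog code comes from doing optimistic concurrency by hand. The sketch below models a conditional write in memory: each item carries a version, and a write succeeds only if the version the writer last read is still current. In DynamoDB itself this is expressed with a ConditionExpression on the write, and a failed condition surfaces as a ConditionalCheckFailedException. The KvTable type, its methods, and the version-counter scheme are hypothetical and only illustrate the pattern.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum PutError {
    /// The stored version no longer matches what the writer read.
    ConditionFailed,
}

/// In-memory model of a key-value table: key -> (version, cursor).
#[derive(Default)]
pub struct KvTable {
    items: HashMap<String, (u64, u64)>,
}

impl KvTable {
    /// Returns (version, cursor) for the key, if present.
    pub fn get(&self, key: &str) -> Option<(u64, u64)> {
        self.items.get(key).copied()
    }

    /// Writes `cursor` only if the stored version equals `expected`
    /// (`None` means "the item must not exist yet"). Returns the new version.
    pub fn put_if_version(
        &mut self,
        key: &str,
        expected: Option<u64>,
        cursor: u64,
    ) -> Result<u64, PutError> {
        let current = self.items.get(key).map(|(version, _)| *version);
        if current != expected {
            return Err(PutError::ConditionFailed);
        }
        let next = expected.map_or(1, |version| version + 1);
        self.items.insert(key.to_string(), (next, cursor));
        Ok(next)
    }
}
```

If two writers both read version 1 and then both try to advance the cursor, the first write moves the item to version 2 and the second fails its condition. The loser has to re-read and decide whether to retry, which is logic the Postgres adapter gets from row locks instead.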

Aurora Serverless v2. Same query semantics as RDS, lower fixed cost at idle. Worth considering as a default once the cold-start story (~15s warm-up from zero ACU) is acceptable for hourly-or-more-frequent reconciliation.