Collocation: indexed context wherever the agent runs
This spec covers the compute-and-data-collocation property from PRD Pillar 3’s “MCP-native context store”: agents run in many places (laptop, edge Worker, Fargate task, on-prem VM, air-gapped enclave), and the context layer brings indexed slices next to wherever they land. Round-tripping every prompt to us-east-1 breaks offline use, breaks sovereign deploys, and costs 50–200 ms even when it works.
Related:
- Storage layout + sidecar indexes: 04-context-store.md.
- Access control across replicas: 09-access-control.md.
- The compliance dimension: 11-compliance.md.
Three collocation shapes
| Shape | Replica location | p99 read latency | Refresh path |
|---|---|---|---|
| Agent-local | Same machine as the agent (laptop SSD, on-prem box) | Sub-ms | Periodic hakiri sync pull over WAN |
| Edge-local | Replica in object storage in the same region as the edge runtime (CF Worker + R2 same region, Lambda + S3 same region) | Single-digit ms | Push-driven; replica updates as snapshots commit |
| Region-pinned | Replica in the same VPC as the agent (Fargate task + EFS, EC2 + S3 same region) | Single-digit ms; no egress | Same as edge-local |
The same hakiri.toml produces all three shapes — the difference is where hakiri sync pull --mode replica materializes the snapshot. Pillar 1’s single binary makes this uniform; Pillar 3’s cat-able store makes the replica itself a tar you can move around.
Replica modes
```toml
[sync.replica]
mode = "full"                                         # full | partial | proxy
tables = ["github_issues", "linear_issues"]           # which tables to materialize locally
indexes = ["fts-body", "vec-body-bge-large-en-v1.5"]  # which sidecars to pull
partitions = ["recent_90d"]                           # only these partitions (see § Sharding)
```

mode = "full"
Pulls every snapshot and every declared index for the listed tables. Best for laptops doing serious agent work; worst for resource-constrained edges.
mode = "partial"
Pulls only the partitions and indexes the replica declares. A laptop can pull the FTS index but skip the 4 GB HNSW; a Worker can pull only the recent_90d partition. The runtime refuses to start an agent query that requires an index or partition the replica doesn’t have; it suggests hakiri sync pull --add-index <id> instead of silently degrading.
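The refuse-rather-than-degrade rule can be sketched as a pre-flight check. This is a minimal illustration, not the runtime's actual code: `ReplicaHoldings`, `MissingReplicaData`, and `check_query` are hypothetical names, and only the `--add-index` flag comes from the text above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReplicaHoldings:
    """What a partial replica has materialized locally (hypothetical shape)."""
    indexes: frozenset = field(default_factory=frozenset)
    partitions: frozenset = field(default_factory=frozenset)


class MissingReplicaData(Exception):
    """Raised instead of silently degrading the query."""


def check_query(holdings, needed_indexes, needed_partitions):
    """Refuse a query that needs an index or partition the replica lacks."""
    missing_idx = sorted(set(needed_indexes) - holdings.indexes)
    missing_part = sorted(set(needed_partitions) - holdings.partitions)
    if not missing_idx and not missing_part:
        return  # everything the query needs is local
    details = [f"missing index {i}" for i in missing_idx]
    details += [f"missing partition {p}" for p in missing_part]
    message = "; ".join(details)
    # Only --add-index is named in the spec; no command is guessed for
    # missing partitions.
    hints = [f"hakiri sync pull --add-index {i}" for i in missing_idx]
    if hints:
        message += "; try: " + " && ".join(hints)
    raise MissingReplicaData(message)
```

A replica holding only `fts-body` would pass a full-text query and fail a vector query with a message that names the missing sidecar and the pull command to fix it.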
mode = "proxy"
The replica is empty locally; the agent’s context.query calls round-trip to the canonical store (via the hakiri sync serve query proxy). Used when:
- The table is too large to replicate (multi-TB CDC).
- The table is too sensitive to replicate (replicate = false on the source table).
- The replica’s footprint budget can’t hold the data.
The proxy enforces capability-token policy (09-access-control.md) on every query — the agent never sees raw rows it isn’t authorized for. Latency is the cost.
Sharding by access pattern
A table can declare an access_pattern so the writer partitions Parquet in a shape that lets replicas pull just the slice they need:
```toml
[[pipeline.tables]]
name = "github_issues"
access_pattern = "by_repo"        # one partition per repo
# or
# access_pattern = "recent_90d"   # rolling window; older data goes to "archive" partition
# or
# access_pattern = "by_account"   # one partition per tenant
```

Replicas declare the partitions they want:
```toml
[sync.replica]
tables = ["github_issues"]
partitions = ["repo=torvalds/linux", "repo=openhackersclub/gctrl"]
```

A research agent that only ever queries one repo doesn’t drag the rest across the wire. The catalog’s per-partition cursors mean incremental refresh is per-partition, not per-table: a partition the replica doesn’t subscribe to never enters its refresh budget.
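A rough sketch of how per-partition cursors keep unsubscribed partitions out of the refresh budget. The cursor representation (a monotonically increasing integer per partition) and the function name are assumptions made for illustration; the catalog's real cursor format is not specified here.

```python
def partitions_to_refresh(remote_cursors, local_cursors, subscribed):
    """Return subscribed partitions whose remote cursor is ahead of the local one.

    Cursors are modeled as integers (an assumption). Partitions the replica
    does not subscribe to are never inspected at all.
    """
    stale = []
    for partition in subscribed:
        remote = remote_cursors.get(partition)
        if remote is None:
            continue  # not published upstream; nothing to pull
        if local_cursors.get(partition, -1) < remote:
            stale.append(partition)
    return stale
```

With the two-repo subscription above, a busy third repo advancing its cursor upstream costs the replica nothing: it is never in `subscribed`, so it never appears in the result.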
Incremental refresh
Refresh is a diff, not a re-sync. The replica’s local catalog tracks per-snapshot content hashes. A pull:
- Fetches the remote top-level manifest (a few KB).
- Diffs against local content hashes.
- Downloads only changed snapshot directories (Parquet + sidecars together; see § Atomic snapshot + sidecar commit in 04-context-store.md).
- Atomically swaps the current snapshot pointer in the local catalog.
- Old snapshots stay around for the retention window.
A laptop offline for a week comes back, pulls one new snapshot per affected table (plus its sidecars), and is current in seconds. The replica never re-downloads unchanged Parquet.
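The diff and pointer-swap steps above can be sketched as follows. This is an illustration under stated assumptions, not the actual sync implementation: the manifest is modeled as a mapping from snapshot directory name to content hash, and the `current` pointer as a small JSON file named `<table>.current.json`; both shapes and both function names are hypothetical.

```python
import json
import os
import tempfile


def plan_pull(remote_manifest, local_hashes):
    """Return snapshot ids whose hash is new or differs from the local catalog."""
    return sorted(
        snap_id
        for snap_id, digest in remote_manifest.items()
        if local_hashes.get(snap_id) != digest
    )


def swap_current(catalog_dir, table, snapshot_id):
    """Repoint `current` by writing a temp file and renaming it into place."""
    pointer = os.path.join(catalog_dir, f"{table}.current.json")
    fd, tmp_path = tempfile.mkstemp(dir=catalog_dir)
    with os.fdopen(fd, "w") as tmp:
        json.dump({"snapshot": snapshot_id}, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())  # make the new pointer durable before the swap
    # os.replace is atomic when source and destination share a filesystem,
    # so readers see either the old pointer or the new one, never a partial file.
    os.replace(tmp_path, pointer)
    return pointer
```

Unchanged snapshots never appear in the plan, which is what keeps a week-long offline gap cheap: the download set is bounded by what actually changed, not by table size.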
Replicate vs proxy — when to choose which
Per-table choice driven by data sensitivity, size, and policy:

| Choose replicate when… | Choose proxy when… |
|---|---|
| Table is < ~10 GB | Table is > 100 GB |
| Data is not subject to per-replica policy variance | Data has tenant-scoped or region-scoped access policies |
| Replica is in a trusted environment (operator-controlled) | Replica is on a laptop or third-party-controlled host with weaker trust |
| Compaction can keep snapshots small enough to refresh within the available bandwidth | Refresh bandwidth or storage on the replica is constrained |
| Latency budget requires local reads | Latency budget tolerates a hop |
For tables with pii_type columns (declared in the manifest), replicate-by-default is off — write-time redaction strips the PII columns before Parquet hits the bucket; the replica gets the redacted Parquet. If a replica needs PII access, it uses proxy mode and the proxy enforces the token policy.
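The decision table and the PII rule can be read as a small per-table function. This is only one way to encode it: `TableProfile` and `choose_mode` are hypothetical names, the order in which criteria are checked is an assumption, and the tie-break for tables between 10 GB and 100 GB (which the table above leaves open) falls back on the latency budget purely for illustration.

```python
from dataclasses import dataclass

GB = 1024 ** 3


@dataclass
class TableProfile:
    size_bytes: int
    has_scoped_policy: bool    # tenant- or region-scoped access policies
    trusted_host: bool         # replica runs in an operator-controlled environment
    needs_local_latency: bool  # latency budget requires local reads
    needs_pii: bool = False    # replica needs unredacted PII columns


def choose_mode(t):
    """Return "replicate" or "proxy" for one table on one replica."""
    if t.needs_pii:
        return "proxy"  # PII access always goes through the token-enforcing proxy
    if t.size_bytes > 100 * GB or t.has_scoped_policy or not t.trusted_host:
        return "proxy"
    if t.size_bytes < 10 * GB:
        return "replicate"
    # Between ~10 GB and 100 GB the spec gives no rule; assumed tie-break.
    return "replicate" if t.needs_local_latency else "proxy"
```

For example, a 2 GB table on an operator-controlled host replicates, while the same table proxies as soon as the replica needs the PII columns.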
Provenance through replication
Every replica carries the lineage edges (which run, which connector, which agent authored the connector). An agent querying a replica sees the same provenance an operator would see at the source. The lineage table is part of the catalog and replicates with it; it is not a separate concern.
Query-proxy security model
When a replica runs in proxy mode, the proxy (hakiri sync serve running near the canonical store):
- Verifies the requesting client’s capability token (biscuit), including cnf.jkt proof-of-possession.
- Plans the agent’s context.query against the canonical Parquet + indexes.
- Applies RLS / CLS / k-anonymity per the token’s grants (see 09-access-control.md).
- Returns only the projected rows, never raw Parquet handles.
- Emits an OTel audit span + appends to the local hash-chained audit log.
The agent never sees raw bytes it isn’t authorized for. The proxy is the trust boundary; everything past it is the agent’s runtime.
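A simplified sketch of the filter, project, and audit steps. It deliberately omits token verification (no biscuit API is shown or implied), k-anonymity, and OTel spans; grants are modeled as a plain dict with an allowed-column set and a row predicate, which is an assumption, as are the names `AuditLog` and `serve_query` and the SHA-256 chaining scheme.

```python
import hashlib
import json


class AuditLog:
    """Append-only log in which each entry commits to the previous entry's hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, event):
        prev = self.entries[-1]["hash"] if self.entries else self.GENESIS
        body = json.dumps(event, sort_keys=True)
        digest = hashlib.sha256((prev + body).encode()).hexdigest()
        self.entries.append({"prev": prev, "event": event, "hash": digest})

    def verify(self):
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = self.GENESIS
        for entry in self.entries:
            body = json.dumps(entry["event"], sort_keys=True)
            expected = hashlib.sha256((prev + body).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


def serve_query(rows, grants, columns, log):
    """Apply row- and column-level grants, return projected rows, log the outcome."""
    denied = [c for c in columns if c not in grants["columns"]]
    if denied:
        log.append({"outcome": "denied", "columns": denied})
        raise PermissionError(f"columns not granted: {denied}")
    visible = [r for r in rows if grants["row_filter"](r)]
    projected = [{c: r[c] for c in columns} for r in visible]
    log.append({"outcome": "ok", "columns": list(columns), "rows": len(projected)})
    return projected
```

Both allowed and denied requests land in the log, and the caller only ever receives freshly built dicts containing the granted columns, which mirrors the "projected rows, never raw handles" rule.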
What this spec deliberately leaves out
- Active-active multi-writer replicas. v0 replicas are read-only (pull-side); writes go through the canonical store. Active-active is a v2 concern that needs CRDT-shaped conflict resolution, which is out of scope per ADR-0005.
- Replica-side recompaction. Replicas pull pre-compacted snapshots; they don’t compact locally. If a replica’s local needs differ (smaller HNSW, different partitioning), they declare it as a partial-mode replica with explicit choices, not silent local recompaction.
- CDN-fronted replicas. Object-store + CDN (R2 free egress, CloudFront for S3) is the operator’s choice; Hakiri doesn’t manage it.
Open questions
- Manifest-format for partial replicas. A partial replica advertises “I hold partitions X, Y, indexes A, B”. Should this be in the catalog or in a separate replica.toml per replica? Leaning catalog so it’s queryable centrally.
- Refresh prioritization. When a replica is bandwidth-constrained, which snapshot should refresh first? “Most recently queried table” is one heuristic; “freshly committed snapshot” is another. M2 ships LRU + freshness combo; revisit with usage data.
- Replica trust attestation. A laptop replica claims to be in a particular policy zone; what verifies it? Tied to the Subject attestation story — host attestation v0 is self-asserted, M3+ is SPIFFE-flavored.