Architecture Decision Records

An architectural decision record (ADR) documents an important architectural choice along with its context and consequences.

Immutable history

ADRs are append-only. Once accepted, content is never changed. Superseded decisions are marked Deprecated with a cross-reference. New decisions always get the next sequential number.

Full decision text → adr.md
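
The append-only rule can be pictured as a small data structure. The sketch below is illustrative only: `AdrLog`, `Entry`, and `Status` are hypothetical names, not part of the library or its tooling. It assumes the three behaviours stated above: numbers are assigned sequentially, accepted text is never edited, and a superseded decision is only marked Deprecated with a cross-reference to its replacement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative append-only ADR log; every name here is hypothetical. */
final class AdrLog {

    enum Status { ACCEPTED, DEPRECATED }

    /** One index row. The title is final; only status and cross-references change. */
    static final class Entry {
        final int number;
        final String title;
        Status status = Status.ACCEPTED;
        final List<Integer> supersededBy = new ArrayList<>();

        Entry(int number, String title) {
            this.number = number;
            this.title = title;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Appends a decision under the next sequential number (1-based). */
    Entry accept(String title) {
        Entry entry = new Entry(entries.size() + 1, title);
        entries.add(entry);
        return entry;
    }

    /** Records a new decision and marks the old one Deprecated with a cross-reference. */
    Entry supersede(int oldNumber, String newTitle) {
        Entry old = get(oldNumber);
        Entry replacement = accept(newTitle);
        old.status = Status.DEPRECATED;
        old.supersededBy.add(replacement.number);
        return replacement;
    }

    Entry get(int number) {
        if (number < 1 || number > entries.size()) {
            throw new IllegalArgumentException("Unknown ADR " + number);
        }
        return entries.get(number - 1);
    }

    /** Read-only view: there is deliberately no remove or edit operation. */
    List<Entry> entries() {
        return Collections.unmodifiableList(entries);
    }

    public static void main(String[] args) {
        AdrLog log = new AdrLog();
        log.accept("Uniform store wrapping");
        Entry next = log.supersede(1, "Explicit store assembly");
        // Prints: 2 supersedes 1; ADR 1 is DEPRECATED
        System.out.println(next.number + " supersedes 1; ADR 1 is " + log.get(1).status);
    }
}
```

The cross-reference is a list because one decision can be superseded by several later ones, as the index shows for ADR 11.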


Decision Index

| # | Decision | Date | Status |
|---|---|---|---|
| ADR 1 — Build a failover lib | Build a reusable annotation-driven failover library | 10-NOV-2021 | Accepted |
| ADR 2 — @Failover Annotations | Dedicated @Failover annotation instead of reusing @FeignClient | 10-NOV-2021 | Accepted |
| ADR 3 — Metadata for referential: As Of, Up To Date? | Referential / ReferentialAware carry upToDate and asOf | 15-NOV-2021 | Accepted |
| ADR 4 — Recovered Payload Handler | RecoveredPayloadHandler SPI for null/default handling | 15-NOV-2021 | Accepted |
| ADR 5 — Failover Store | FailoverStore abstraction with InMemory, Caffeine, JDBC impls | 16-NOV-2021 | Accepted |
| ADR 6 — Failover Execution | FailoverExecution SPI; BASIC (try/catch) and RESILIENCE variants | 17-NOV-2021 | Accepted |
| ADR 7 — Auto Cleanup | Scheduled expiry cleanup via ExpiryCleanupScheduler | 17-NOV-2021 | Accepted |
| ADR 8 — Monitoring | FailoverReporter with logger and Micrometer publishers | 17-NOV-2021 | Accepted |
| ADR 9 — Key Generator | KeyGenerator SPI; default derives key from method args | 30-DEC-2021 | Accepted |
| ADR 10 — DefaultFailoverStore — Defensive Copy for Immutability | Store clones ReferentialPayload to prevent caller mutation | 25-MAY-2026 | Accepted |
| ADR 11 — FailoverStoreBeanPostProcessor — Uniform Store Wrapping via BeanPostProcessor | BeanPostProcessor wraps stores uniformly at startup | 25-MAY-2026 | Deprecated — superseded by ADR 16, ADR 18, ADR 19 |
| ADR 12 — MethodExceptionPolicy — Pluggable Exception Handling Strategy | ExceptionPolicy enum: RETHROW, NEVER_THROW, CUSTOM | 26-MAY-2026 | Accepted |
| ADR 13 — JDBC Native Merge/Upsert — Dialect Detection and Runtime Fallback | Dialect-specific upsert with ANSI fallback | 26-MAY-2026 | Accepted |
| ADR 14 — DatabaseResolver — Strategy Interface for Database Product Detection | DatabaseResolver SPI detects DB product at runtime | 26-MAY-2026 | Accepted |
| ADR 15 — FailoverStoreQueryResolver — Single-Responsibility Co-location of All JDBC Query Concerns | All JDBC query building delegated to FailoverStoreQueryResolver | 26-MAY-2026 | Accepted |
| ADR 16 — Removal of BeanPostProcessor-based Store Wrapping (Supersedes ADR 11) | BeanPostProcessor removed; auto-config assembles store chain explicitly | 02-JUN-2026 | Accepted — supersedes ADR 11 |
| ADR 17 — TenantStoreFactory SPI — Abstracting Store Creation from Store Assembly | TenantStoreFactory decouples per-tenant store creation | 02-JUN-2026 | Accepted |
| ADR 18 — FailoverStoreAutoConfiguration — Central Assembler | Single auto-config class assembles the complete store chain | 02-JUN-2026 | Accepted |
| ADR 19 — FailoverStoreAsync — Explicit TaskExecutor Replacing @Async | AsyncFailoverStore wraps delegate with explicit executor; drops @Async | 02-JUN-2026 | Accepted |
| ADR 20 — MultiTenantFailoverStore — Outermost Per-Tenant Routing Decorator | Multi-tenant routing sits outside async decorator | 02-JUN-2026 | Accepted |
| ADR 21 — FailoverStoreMultiTenantAutoConfiguration — Multi-Tenant Auto-Configuration and TenantResolver SPI | Separate auto-config for multi-tenant; TenantResolver SPI | 02-JUN-2026 | Accepted |
| ADR 22 — FailoverKeyGenerator — UUID-Based Key Normalisation for Fixed-Width Store Keys | MD5/UUID key hash prevents VARCHAR(256) overflow | 03-JUN-2026 | Accepted |
| ADR 23 — PayloadSplitter — Scatter/Gather Storage for Composite-Key Failover | PayloadSplitter<T,R> splits collection results into per-entity store entries | 04-JUN-2026 | Accepted |
| ADR 24 — Parallel Scatter/Gather — CompletableFuture with Injected Executor | Scatter slices dispatched concurrently via injected Executor | 04-JUN-2026 | Accepted |
| ADR 25 — ContextPropagator SPI — Thread-Local Context Propagation for Parallel Scatter | ContextPropagator captures and restores thread-local context on executor threads | 04-JUN-2026 | Accepted |
| ADR 26 — Replace LocalDateTime with Instant for Timezone-Aware Expiry Timestamps | Instant eliminates timezone ambiguity in expiry across multi-node/multi-timezone deployments | 06-JUN-2026 | Accepted |
| ADR 27 — Migrate Deprecated JdbcTemplate Overloads in FailoverStoreJdbc | Varargs overloads replace deprecated Object[] + int[] forms; removes java.sql.Types usage | 06-JUN-2026 | Accepted |
| ADR 28 — domain Attribute — Shared Store Partitioning Across @Failover Annotations | domain enables scatter/gather slices and single-entity endpoints to share a store partition | 07-JUN-2026 | Accepted |
| ADR 29 — Observability Layer — Observer, Publisher SPI and MDC Logger Refactor | Rename reporter stack to observer; MDC-safe publish via ObservablePublisher SPI; composite publisher | 07-JUN-2026 | Accepted |
| ADR 30 — SpringContextFailoverScanner — Replacing Reflections-Based Classpath Scanning | Spring bean enumeration replaces Reflections; removes package-to-scan config and Guava dep | 07-JUN-2026 | Accepted |
| ADR 31 — failover-observable-micrometer — Micrometer Extension as an Optional Module | Micrometer meters and Actuator health indicator extracted to an optional opt-in module | 07-JUN-2026 | Accepted |
| ADR 32 — PayloadSplitterExecutionException — Wrapping User-Splitter Failures with Diagnostic Context | All PayloadSplitter call failures wrapped in PayloadSplitterExecutionException with splitter name and operation context | 10-JUN-2026 | Accepted |
| ADR 33 — doRecoverAll All-Slices Iteration — User-Controlled Slice Count | doRecoverAll iterates over all slices returned by splitOnRecover; slice count is user-controlled via PayloadSplitter | 10-JUN-2026 | Accepted |
| ADR 34 — ScatterGatherFailoverHandler.recoverAll() Override — Clear Error for Scatter Case | ScatterGatherFailoverHandler.recoverAll() overrides default with UnsupportedOperationException to prevent silent wrong-path execution | 10-JUN-2026 | Accepted |
| ADR 35 — Empty splitOnRecover Guard — Null Return Instead of merge([]) | Guard against empty splitOnRecover result returns null rather than merging an empty list | 10-JUN-2026 | Accepted |
| ADR 36 — splitOnRecover RecoverAll Contract — Single Placeholder for DefaultFailoverHandler | splitOnRecover must return exactly one placeholder context when delegating to DefaultFailoverHandler.recoverAll | 10-JUN-2026 | Accepted |
| ADR 37 — Payload Deserialization Allowlist — Secure-by-Default Class Loading | JsonSerializer.toClass restricted to an allowlist auto-derived from @Failover payload packages plus failover.store.jdbc.allowed-payload-classes | 14-JUN-2026 | Accepted |
| ADR 38 — Scatter/Gather Per-Slice Timeout — Bounded Parallel Join | failover.scatter.timeout bounds parallel slice joins; timed-out recover slice = not recovered, store slice surfaces | 14-JUN-2026 | Accepted |
| ADR 39 — Error Propagation — Never Recover on a Failing JVM | Error rethrown unwrapped by the aspect; recovery never runs on a dying JVM | 14-JUN-2026 | Accepted |
| ADR 40 — Multi-Tenant Strict Mode — Reject Unconfigured Tenants | failover.store.multitenant.strict rejects (or WARNs on) tenants absent from the configured map | 14-JUN-2026 | Accepted |
| ADR 41 — Async Store Failure Metric — Visibility for a Silently-Degraded Layer | FailoverStoreAsync publishes failover.store.async.failed on executor-side failures | 14-JUN-2026 | Accepted |
| ADR 42 — FailoverScanner Relocation to a Neutral Core Package | FailoverScanner SPI moved core.observable.scanner → core.scanner; shared by observability and store security | 14-JUN-2026 | Accepted |
| ADR 43 — Dialect Integration Tests via Testcontainers | Real PostgreSQL/MySQL/MariaDB merge ITs, profile-gated (dialect-its) and excluded from the default build; Oracle stays string-asserted | 15-JUN-2026 | Accepted |
| ADR 44 — Concurrency Test Coverage for Multi-Tenant Routing and the Async Store | Contention tests for computeIfAbsent one-store-per-tenant and the FailoverStoreAsync executor path | 15-JUN-2026 | Accepted |
| ADR 45 — ArchUnit Architecture Tests | Enforce no-ThreadLocal-in-async, *Store naming, and acyclic slices; split-package rule deferred to Phase 4 | 15-JUN-2026 | Accepted |
| ADR 46 — PIT Mutation Testing on Expiry and Key Logic | Profile-gated (mutation) PIT over all of failover-core; mandated 95% gate (blocking), currently 96% / 99% test strength | 15-JUN-2026 | Accepted |
| ADR 47 — JDBC insert→update Race — Bounded Retry over Silent Drop | INSERT/UPDATE fallback re-INSERTs once when a concurrent expiry delete drops the UPDATE; bounded to 2 attempts, abandons at warn | 15-JUN-2026 | Accepted |
| ADR 48 — Failover Lifecycle Logging — INFO Event, DEBUG Payload Body | Store/recover lifecycle stays at INFO (name only); full ReferentialPayload body moved to DEBUG | 15-JUN-2026 | Accepted |
| ADR 49 — ScatterGatherFailoverHandler — Extract Scatter/Gather Collaborators | Thin facade + PayloadScatter / PayloadGather / SliceDispatcher / SplitterInvoker; public API and behaviour unchanged | 15-JUN-2026 | Accepted |
| ADR 50 — Metrics Builder Helper — Cheaper Metric Construction on the Recover Path | Metrics concatenates keys (no String.format) + typed collect overloads; ~3.6× faster recover-bag build (JMH 744→204 ns/op) | 15-JUN-2026 | Accepted |
| ADR 51 — Per-Method Failover Outcome Metric + Method-Identity Threading | failover.recovery.outcome.total{name,domain,method,outcome} for failover/recovery/non-recovery rates | 15-JUN-2026 | Accepted — threading mechanism refined by ADR 52 |
| ADR 52 — FailoverHandler Method-Aware Contract + AbstractFailoverHandler | Single method-aware FailoverHandler contract (@NonNull Method); AbstractFailoverHandler bridges method-agnostic handlers; method threaded through scatter to slices. Breaking SPI | 15-JUN-2026 | Accepted |
| ADR 53 — Overall JaCoCo Coverage Gate | Cross-module jacoco:check in the failover-test-report module (unpack classes + merge all exec) fails verify below 95% line / 95% branch | 15-JUN-2026 | Accepted |
| ADR 54 — FailoverStore Assembly — Collapse Four Property-Gated Beans into One | Single failoverStore bean replaces the 4 async × multitenant variants; per-tenant-async made explicit; refines ADR 18 | 16-JUN-2026 | Accepted |
| ADR 55 — Embedded Failover Dashboard (failover-dashboard) | Opt-in, secure-by-default embedded UI + read-only JSON API over the existing scanner config and failover.* meters; separate starter, no new instrumentation, fail-closed access gate | 17-JUN-2026 | Accepted |
| ADR 56 — Payload-at-rest Encryption for the JDBC Store | PayloadCipher SPI + EncryptingSerializer over the JDBC store; ENC(<cipherId>:…) envelope, write-switch only, failover.store.jdbc.encryption.*; opt-in, b64 default marked non-secure | 17-JUN-2026 | Accepted |
| ADR 57 — Async Executor Back-pressure | BoundedTaskExecutor (semaphore admission guard, keeps virtual threads) makes the async-store and scatter executors optionally bounded; concurrency-limit + rejection-policy (DISCARD default); opt-in, unbounded by default (audit R-2) | 17-JUN-2026 | Accepted |
| ADR 58 — Non-Durable Store in Production — Advisory Warning over Fail-Fast | Non-durable store WARN now names jdbc as the recommended production store + store decision/topology docs; rejected profile-aware fail-fast as brittle (audit A1) | 22-JUN-2026 | Accepted |
| ADR 59 — Async Store Submit-Time Rejection — Count Saturation, Don't Drop It Silently | Submit-time executor rejection (ABORT/shutdown) now emits the existing failover.store.async.failed meter; closes the saturation blind spot, no new meter/tag (audit A2) | 22-JUN-2026 | Accepted |
| ADR 60 — Deserialization Allowlist Strict Mode — Fail-Closed on an Empty Allowlist | Opt-in failover.store.jdbc.strict-allowlist denies all deserialization on an empty allowlist (was fail-open allow-all); secure default unchanged (audit A3) | 22-JUN-2026 | Accepted |
| ADR 61 — Built-in AES-GCM Payload Cipher — Usable Encryption-at-Rest Out of the Box | Built-in AesGcmPayloadCipher (id aesgcm) auto-registered from failover.store.jdbc.encryption.key; real encryption-at-rest with no consumer crypto code; extends ADR 56 (audit A4) | 22-JUN-2026 | Accepted |
| ADR 62 — Opt-in JDBC Live-Entries Gauge — Capacity Visibility Without a COUNT(*) Tax | FailoverStoreJdbc is FailoverStoreSizeAware; failover.store.jdbc.live-entries-gauge-enabled (default off) exposes failover.live.entries via opt-in COUNT(*) per scrape for capacity monitoring (audit A7) | 25-JUN-2026 | Accepted |
| ADR 63 — Startup Warning for Un-advisable @Failover Placement | Scanner WARNs at startup when a discovered @Failover can't be advised (interface-only, non-public/static/final method, final class); self-invocation documented (audit A8) | 25-JUN-2026 | Accepted |
| ADR 64 — Startup Warning for recoverAll Without a payloadSplitter | Scanner WARNs when @Failover(recoverAll=true) has no payloadSplitter (silently falls back to single-key recover); adoption guidance documented (audit A10) | 25-JUN-2026 | Accepted |
| ADR 65 — Event-Driven Snapshot Publishing with Throttle and Backoff | Cluster snapshot pushes fire on metric events (throttled to one per interval-seconds, WARN-once backoff on failure) via SnapshotPublisher/SnapshotPushClient; no polling scheduler | 28-JUN-2026 | Accepted |
| ADR 66 — Decoupled Heartbeat Liveness Tracking | Lightweight opt-in heartbeat ({"instanceId"} POST) drives LiveStatus LIVE/DOWN/UNKNOWN, decoupled from snapshot freshness; DOWN instances keep contributing last-known metrics | 28-JUN-2026 | Accepted |
| ADR 67 — Reset-Aware Shared-Store Aggregate and Bounded Instance Retirement | SnapshotBaseline carry-forward makes the instant cluster aggregate monotonic across peer restarts (counter resets); unseen instances retire after instance-retention into a bounded tombstone aggregate — counts never drop, heap stays bounded under pod churn | 03-JUL-2026 | Accepted |
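
As one illustration of an index entry, the ADR 22 summary (an MD5/UUID key hash that keeps store keys within a fixed-width column) can be sketched with the JDK's name-based UUID factory, which derives a version-3 UUID from an MD5 digest. This is a minimal sketch, not the library's code: `KeyNormaliser` and `normalise` are hypothetical names, and whether `FailoverKeyGenerator` hashes keys in exactly this way is an assumption. The full decision text is the authority.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Hypothetical helper: maps a key of any length to a fixed 36-character string. */
final class KeyNormaliser {

    private KeyNormaliser() {
    }

    /**
     * Hashes the raw key with the JDK's name-based (version 3, MD5) UUID factory.
     * The same input always yields the same output, so lookups stay stable.
     */
    static String normalise(String rawKey) {
        byte[] bytes = rawKey.getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(bytes).toString();
    }

    public static void main(String[] args) {
        // A composite key far longer than a VARCHAR(256) column could hold.
        String longKey = "customers/" + "x".repeat(1000);
        String stored = normalise(longKey);
        System.out.println(stored.length()); // prints 36
    }
}
```

The trade-off in this sketch is that the stored key is no longer human-readable, and MD5 is used here only to shorten keys, not for security.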