
Reproducibility: How Was It Built?

Attestation proves what's running. But a hash is just a number until you can answer: where did that binary come from, and can anyone independently verify it?

This is the R dimension of the KRAB model. It measures reproducibility — how much of the software stack can be traced from source code to deployed binary, and at what level of trust.

The spectrum

R is scored per-layer. Every component in the stack — firmware, OS, libraries, application — gets its own R grade:

| Level | Name | What it means |
| --- | --- | --- |
| R0 | Opaque | No source, no build instructions. The binary is a black box. You trust whoever built it — completely. |
| R1 | Source Available | Source is published and builds are documented. You can audit the code, but you cannot prove the deployed binary was built from it. |
| R2 | Maintainer-Signed | A maintainer cryptographically asserts the binary was built from the published source. Trust shifts to the maintainer's key. |
| R2+ | Threshold Multi-Party Signed | Binary signed by M-of-N independent maintainers. All M must collude to forge the claim. Correspondence remains asserted, not independently verifiable, but collusion resistance is qualitatively stronger than single-key R2. |
| R3 | Provenance-Verified | Signed build provenance from a CI/CD pipeline (e.g., SLSA). The build process is auditable, but the CI system is now in your trust chain. |
| R4 | Deterministic / Reproducible | Anyone can rebuild from source and get the identical hash. No trust in any builder, maintainer, or pipeline required. |

Each level is useful. Each level has limits.

The trust chain shifts; it doesn't vanish

R0 means blind trust. You run the binary because someone gave it to you. This is where most CSP firmware sits today.

R1 gives you auditability. You can read the code. But "the source looks fine" and "the binary matches the source" are different statements. Azure's OpenHCL paravisor is R1: the source is public, but the production builds are not reproducible.

R2 adds a cryptographic claim: someone with a signing key asserts the binary matches the source. This is better — but if the key is compromised, the claim collapses back to R1. R2+ extends this with M-of-N threshold signing: all M maintainers must collude to forge the claim, making single-key compromise insufficient. The source-to-binary correspondence is still asserted rather than independently verifiable — R2+ is not R4 — but collusion resistance is qualitatively stronger.
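The M-of-N rule can be sketched directly. This is a minimal illustration, not a real verifier: `verify_sig` is a toy stand-in for genuine signature verification against a maintainer's key, and all names here are hypothetical.

```python
# Sketch of an M-of-N threshold check for an R2+ release.
# A release is accepted only if at least m *distinct* trusted
# maintainers have signed this exact binary hash, so compromising
# a single key is not enough to forge the claim.

def threshold_verified(binary_hash, signatures, trusted_keys, m):
    """signatures: list of (key, sig) pairs attached to the release."""
    signers = {
        key for key, sig in signatures
        if key in trusted_keys and verify_sig(key, binary_hash, sig)
    }
    return len(signers) >= m

# Toy verify_sig: a "signature" is just key + ":" + hash. A real
# implementation would check an Ed25519/RSA signature instead.
def verify_sig(key, binary_hash, sig):
    return sig == key + ":" + binary_hash
```

Note that the set comprehension deduplicates signers, so one maintainer signing twice still counts once toward the threshold.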

R3 means you have signed build provenance — typically from a CI/CD system following the SLSA framework. SLSA records the source repo, commit hash, build environment, dependencies, and output hash. The chain is auditable, but the CI pipeline itself is now trusted infrastructure. If the pipeline is compromised, the provenance is worthless.
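One concrete R3 check is binding the provenance to the artifact you actually downloaded. The sketch below shows only that digest-binding step, using the subject/digest shape of SLSA-style in-toto statements; real verification would also check the envelope signature and the builder identity, which this deliberately omits.

```python
import hashlib

# Simplified R3 check: does the provenance statement's subject
# digest match the artifact on disk? This catches a swapped binary,
# but says nothing about whether the provenance itself is authentic.

def subject_matches(provenance, artifact_bytes):
    actual = hashlib.sha256(artifact_bytes).hexdigest()
    return any(
        subj.get("digest", {}).get("sha256") == actual
        for subj in provenance.get("subject", [])
    )
```

If the pipeline that produced the statement is compromised, this check still passes — which is exactly the R3 trust boundary described above.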

R4 is the gold standard: deterministic builds. Anyone can clone the source, run the build, and get the exact same binary hash. No maintainer keys, no CI trust, no "just believe me." AWS provides Nix-reproducible OVMF firmware at this level.
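The R4 claim is checkable by anyone: rebuild from source and compare hashes. A minimal sketch of the comparison step, assuming the vendor's published binary and your own rebuild are both on disk (paths are illustrative):

```python
import hashlib

def sha256_file(path):
    """Hash a file in chunks so large images don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def is_reproducible(published_path, rebuilt_path):
    # R4 means bit-for-bit identity: the hashes must match exactly.
    return sha256_file(published_path) == sha256_file(rebuilt_path)
```

The hard part of R4 is not this comparison — it's making the build deterministic in the first place (pinned inputs, no timestamps, no build-path leakage), which is what toolchains like Nix provide.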

Where SLSA fits

SLSA (Supply-chain Levels for Software Artifacts) is the practical tool for R2-R3. It doesn't replace hardware attestation — it complements it.

What SLSA covers: your application binary. Source repo, commit, build environment, dependencies, output hash — all signed and auditable.

What SLSA doesn't cover: the OS kernel, firmware, paravisor, hypervisor, CPU, or TDX module. SLSA operates above the hardware trust boundary.

The powerful move is binding SLSA to hardware attestation: hash the SLSA provenance document and include it in the attestation quote's report_data field. Now the verification chain runs from source code, through the build pipeline, into the binary hash, through the hardware measurement, all the way to the silicon vendor's certificate chain. One unbroken line.
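The binding step itself is small. A sketch under two assumptions: the quote has already been verified back to the silicon vendor's certificate chain, and the workload placed a SHA-512 of the provenance document in `report_data` (TDX `report_data` is 64 bytes, which SHA-512 fills exactly — the choice of hash here is illustrative):

```python
import hashlib

def provenance_bound(provenance_bytes, quote_report_data):
    """Check that the attested report_data commits to this exact
    provenance document. Assumes the quote itself was verified
    separately against the hardware vendor's cert chain."""
    expected = hashlib.sha512(provenance_bytes).digest()  # 64 bytes
    return expected == quote_report_data
```

If this returns true, the SLSA chain (source → pipeline → binary hash) and the hardware chain (measurement → quote → vendor certs) meet at a single commitment.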

For most teams, SLSA 2-3 for the application layer is the practical target. It gives you auditable provenance without requiring deterministic builds for everything.

R is per-layer — and that's the point

Unlike Attestation (which is a single platform ceiling), Reproducibility is scored at every layer. This creates layered profiles:

| Component | R level | What it means |
| --- | --- | --- |
| Application | R4 | Deterministic build, anyone can verify |
| OS image | R3 | SLSA provenance from CI |
| Firmware | R0 | CSP black box |

The system's R profile is the full column, not a single number. A stack with a perfectly reproducible application and libraries sitting on an opaque OS and firmware is written as R[f0/o0/l4/a4]: each slot shows exactly what can and cannot be verified.
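Because the notation is structured, it can be handled as data rather than prose. A minimal parser, assuming the four fixed slots used in this chapter (firmware, OS, libraries, application) and integer levels only (it doesn't handle R2+):

```python
# Map the single-letter slot codes to layer names.
SLOTS = {"f": "firmware", "o": "os", "l": "libraries", "a": "application"}

def parse_r_profile(s):
    """Parse e.g. 'R[f0/o0/l4/a4]' into {'firmware': 0, 'os': 0, ...}."""
    inner = s.removeprefix("R[").removesuffix("]")
    profile = {}
    for part in inner.split("/"):
        slot, level = part[0], int(part[1:])
        profile[SLOTS[slot]] = level
    return profile
```

A parsed profile makes the gaps queryable — for example, flagging any deployment where `profile["firmware"] == 0` without a documented CSP-trust assumption.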

Verification gaps

An R[f0/o0/l4/a4] profile is common on public clouds. It means: "I can prove my application binary is exactly what I intended, but the OS and firmware beneath it? Opaque." On Azure TDX the firmware is source-available (f1), giving a slightly better R[f1/o0/l4/a4] — but still with a blind OS layer.

This isn't necessarily a failure. If your threat model trusts the CSP (and you've made that explicit in your Attestation ceiling), an R[f0/o0/l4/a4] profile with CSP trust explicitly documented is a coherent engineering choice. But the gap must be visible. The notation forces it into the open.

The worst outcome isn't a gap — it's a gap nobody knows about.

Trust assumption

R2 shifts trust to the maintainer's key. R3 shifts trust to the CI/CD pipeline. Only R4 eliminates the build system as a trust dependency. Know what you're trusting at each level.

Practical tip

Start with R4 for your own application (Nix, Bazel, or Go's reproducible builds). Target R3 (SLSA provenance) for dependencies you don't control. Accept R0-R1 for CSP firmware only if your threat model explicitly trusts the provider — and write it down.