Deterministic Replay for AI Agents — Pre-Deploy Validation
Deterministic Replay re-executes historical production decisions from the Tenet Reasoning Ledger against a candidate agent version — a new model, an updated prompt, or a modified policy — before deployment. Synthetic benchmarks test performance on your test set. Deterministic Replay tests performance on your production reality: the edge cases, outliers, and long-tail input distributions that define your actual business. A loan approval agent that passes your benchmark suite may still regress on the specific edge cases that characterize your real applicant population. Deterministic Replay closes this gap.
Why Production Data Beats Synthetic Benchmarks
Production AI agents fail on scenarios you didn't think to include in your benchmark — the edge cases in your actual user base, the input distributions specific to your vertical, the combinations that look normal in aggregate but produce wrong decisions in practice. Deterministic Replay exposes regressions on real production data before they reach production users. Synthetic benchmarks are built by humans who anticipate the scenarios they expect to see. Production data is built by users who generate the scenarios that actually occur. For financial AI agents, this means the rare income and debt configurations that stress-test policy boundaries. For clinical AI agents, this means the comorbidity combinations that complicate triage logic. For insurance underwriters, this means the claim types that sit at the boundary of coverage rules. These are exactly the scenarios that fail silently on synthetic benchmarks and surface as costly errors in production. Tenet Deterministic Replay uses stored context snapshots from your Reasoning Ledger — the exact state your agent processed at decision time — to replay those specific edge cases against any new agent version before you deploy it.
Three Deterministic Replay Use Cases
Pre-deploy model validation: replay the last 30 days of production decisions against a new model checkpoint before routing live traffic to the new version. If the new checkpoint changes outcomes on more than a threshold percentage of prior decisions, or changes high-stakes decision types at any rate, you have a concrete, data-backed reason to delay deployment or investigate the divergence (a deployment-gate sketch based on this workflow follows below).

Prompt change validation: compare the behavioral delta of a prompt update against your real decision history. A seemingly minor clarification to your system prompt may shift agent reasoning on specific input types in ways that eval suites fail to surface. Deterministic Replay quantifies the impact on actual production decisions and surfaces which specific cases diverge.

Policy backtesting: replay historical decisions against a new compliance threshold, a revised policy rule, or an updated regulatory guideline to understand the retroactive impact before it becomes a live regulatory exposure. If your legal team proposes tightening a lending policy, backtesting against 90 days of real decisions shows the exact scope of impact before any change is deployed.
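To make the first use case concrete, here is a minimal sketch of a deployment gate built on a replay run. This is illustrative only: the Python package name, TenetClient, submit_replay, and the result fields are assumed names, not a documented Tenet SDK surface.

```python
from datetime import datetime, timedelta, timezone

from tenet import TenetClient  # hypothetical package and client name

MAX_DIVERGENCE = 0.01                # block deploys that change >1% of outcomes
HIGH_STAKES = {"loan_approval", "credit_limit_increase"}  # illustrative types

client = TenetClient(api_key="...")  # auth details are placeholders

# Replay the last 30 days of production decisions against the candidate version.
now = datetime.now(timezone.utc)
job = client.submit_replay(
    start=now - timedelta(days=30),
    end=now,
    candidate_version="loan-agent-v2.4.0-rc1",
)
result = job.wait()  # block until the replay batch completes

# Gate 1: overall divergence rate must stay under the threshold.
if result.divergence_rate > MAX_DIVERGENCE:
    raise SystemExit(f"blocked: {result.divergence_rate:.2%} of decisions diverged")

# Gate 2: any divergence on a high-stakes decision type blocks the deploy outright.
high_stakes_diffs = [d for d in result.divergent_cases if d.decision_type in HIGH_STAKES]
if high_stakes_diffs:
    raise SystemExit(f"blocked: {len(high_stakes_diffs)} high-stakes decisions changed")

print("replay gate passed; candidate cleared for deployment")
```

Wired into CI before the deploy step, a gate like this turns the threshold policy described above into an enforced check rather than a manual review.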
How Deterministic Replay Works
Tenet stores a full context snapshot for every agent decision: the exact input state, policy context, retrieved documents, and intermediate reasoning steps at decision time. These snapshots are written to the immutable Reasoning Ledger with SHA-256 hashing and Ed25519 signing. When you run Deterministic Replay, Tenet re-executes selected historical decisions against your candidate agent version using the stored snapshots as the exact input, bypassing live data retrieval entirely so the replay is truly deterministic. The comparison output covers the decisions that produce different outcomes, the point in the reasoning chain where each divergence occurs, the percentage of production decisions affected by the change, a severity classification by decision type and business impact, and a side-by-side diff of prior versus candidate reasoning for each divergent case. The replay result is itself stored as a validation artifact in the Reasoning Ledger, providing an immutable record that pre-deployment behavioral testing was conducted against production data, evidence that satisfies EU AI Act Article 9 risk management requirements.
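For intuition on the integrity guarantees, here is a minimal sketch of how a snapshot could be hashed and signed. SHA-256 and Ed25519 are the primitives named above; the snapshot fields, serialization scheme, and use of the Python cryptography package are assumptions for illustration, not the actual ledger implementation.

```python
import hashlib
import json

from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Illustrative snapshot shape; field names are hypothetical.
snapshot = {
    "decision_id": "dec_8f3a91",
    "input_state": {"applicant_income": 84000, "dti": 0.31},
    "policy_context": {"policy_version": "lending-2025.03"},
    "retrieved_documents": ["doc_112", "doc_407"],
    "reasoning_steps": ["extracted income", "checked DTI < 0.36", "approved"],
}

# Canonical serialization so an identical snapshot always hashes identically.
payload = json.dumps(snapshot, sort_keys=True, separators=(",", ":")).encode()
digest = hashlib.sha256(payload).hexdigest()

# Sign the digest so any tampering with a stored snapshot is detectable.
signing_key = Ed25519PrivateKey.generate()
signature = signing_key.sign(digest.encode())

# Verification (e.g., by an auditor holding the public key) raises on mismatch.
signing_key.public_key().verify(signature, digest.encode())
print(f"sha256={digest[:16]}... signature verified")
```

The practical consequence is that a replay consumes byte-identical inputs to the original decision, and anyone holding the public key can confirm those inputs were not altered after the fact.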
Compliance Use Case: Pre-Deployment Validation Evidence
EU AI Act Article 9 requires providers of high-risk AI systems to implement risk management measures, including systematic testing under realistic conditions before deployment. For AI agents in loan approval, medical triage, insurance underwriting, or hiring, replay against production decision history is the closest available approximation to realistic conditions: it uses the actual edge cases, distributions, and input patterns from your real user base rather than synthetic test sets assembled by your development team. The replay report documents that behavioral testing was conducted before deployment, which exact production scenarios were replayed, what percentage of decisions diverged, and how each divergence was assessed and resolved. SR 11-7, the model risk management guidance from US banking regulators, requires model validation to address the model's actual use, including testing in the context of the specific types of decisions the model will support; OCC model validation expectations align with this standard. Deterministic Replay against your own production decision history satisfies this expectation more directly than generic benchmark evaluations.
Detecting Regressions That Evals Miss
Standard evals test the scenarios you anticipated: the canonical cases in your evaluation dataset. Deterministic Replay tests the scenarios your agent actually encountered in production, including the ones you did not anticipate, because production reality always contains cases that developers did not pre-populate into test sets. The long tail of unusual inputs that synthetic benchmarks underrepresent is precisely where consequential regressions accumulate: the rare financial profile that sits at the exact boundary of approval criteria, the clinical note structure that produces ambiguous triage scores, the claim description that activates multiple competing policy rules simultaneously. These are not exotic failures; they are the normal distribution of real-world complexity. Teams using Tenet Deterministic Replay consistently identify behavioral regressions on specific input types that their eval suites missed entirely. When you find that a new model checkpoint changes decisions on 2% of historical loans, specifically loans in the 680-720 credit score range with variable income, you can investigate that pattern before it becomes a disparate impact finding. When your eval shows 99.2% consistency, you may not have tested the 0.8% that matters.
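As a hedged sketch of how the 2% example might be surfaced, the snippet below slices divergent cases into credit-score bands to expose a concentrated pattern. The record shape and field names are assumptions, not a documented Tenet report schema.

```python
from collections import Counter

def score_band(score: int, width: int = 40) -> str:
    """Bucket a credit score into a fixed-width band label, e.g. '680-720'."""
    lo = (score // width) * width
    return f"{lo}-{lo + width}"

# Hypothetical divergent-case records, as a replay report might return them.
divergent_cases = [
    {"decision_type": "loan_approval", "credit_score": 702},
    {"decision_type": "loan_approval", "credit_score": 688},
    {"decision_type": "loan_approval", "credit_score": 754},
]

# Count divergences per band to reveal where the candidate version shifts.
bands = Counter(
    score_band(case["credit_score"])
    for case in divergent_cases
    if case["decision_type"] == "loan_approval"
)

for band, count in bands.most_common():
    print(f"{band}: {count} divergent decisions")
```

The same slicing logic applies to any input characteristic: income type, triage acuity, claim category. The point is that divergences cluster, and the cluster, not the aggregate rate, is what you investigate.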
Integration and Setup
Deterministic Replay requires only that the Tenet Ghost SDK is capturing decisions in production. Once the Reasoning Ledger accumulates decision snapshots (meaningful signal is typically available within the first week of production traffic), you can run Deterministic Replay from the dashboard or via the Tenet REST API. The API workflow, sketched below: select a time range of past decisions, optionally filter to specific decision types or input characteristics, specify your candidate agent endpoint or version identifier, and submit the replay job. Tenet routes the stored context snapshots from the Reasoning Ledger to your candidate agent and captures the outputs for comparison. Results appear in the dashboard within minutes for most batch sizes, showing the divergence rate, a breakdown by decision type, side-by-side reasoning diffs for divergent cases, and a severity classification. The replay artifact is automatically stored as a validation record. No separate replay infrastructure, no new data stores, no additional instrumentation. Ghost SDK writes add under 5ms of overhead using fire-and-forget async writes, and the Reasoning Ledger accumulates automatically as your agent operates in production.
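A minimal sketch of that workflow over HTTP follows. The base URL, endpoint paths, payload fields, and polling shape are illustrative assumptions, not confirmed Tenet API documentation.

```python
import time

import requests

BASE = "https://api.tenet.example/v1"            # placeholder base URL
HEADERS = {"Authorization": "Bearer <API_KEY>"}  # placeholder credential

# 1. Submit a replay job: time range, optional filters, candidate version.
resp = requests.post(
    f"{BASE}/replays",
    headers=HEADERS,
    json={
        "start": "2025-05-01T00:00:00Z",
        "end": "2025-05-31T23:59:59Z",
        "filters": {"decision_type": "loan_approval"},
        "candidate": {"endpoint": "https://agents.internal/loan-agent-rc"},
    },
)
resp.raise_for_status()
job_id = resp.json()["id"]

# 2. Poll until the batch completes (a webhook would also fit here).
while True:
    job = requests.get(f"{BASE}/replays/{job_id}", headers=HEADERS).json()
    if job["status"] in ("completed", "failed"):
        break
    time.sleep(10)

# 3. Inspect the divergence summary stored with the validation artifact.
summary = job["summary"]
print(summary["divergence_rate"], summary["by_decision_type"])
```

Because the replay job reads only from the Reasoning Ledger and calls your candidate endpoint, the same three steps cover all three use cases above; only the filters and the candidate under test change.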