How to Prove AI Agent Decisions for EU AI Act Article 12 Compliance
EU AI Act Article 12 requires automatic logging that enables post-hoc reconstruction of how an AI system operated. Most teams read that as "add logging." Logging alone is not enough. Logging records that something happened. Proof demonstrates what happened, why it happened, and that the record has not been altered since capture. This article explains what Article 12 actually requires, why standard logs and LLM traces do not satisfy it, and how to implement compliant decision audit trails for high-risk AI systems.
What EU AI Act Article 12 Actually Says
Article 12(1): High-risk AI systems shall technically allow for the automatic recording of events (logs) over the lifetime of the system. Article 12(2): Logging capabilities shall ensure a level of traceability appropriate to the system's intended purpose. Article 12(3) adds specific fields for remote biometric identification systems (Annex III, point 1(a)): the period of each use, the reference database against which the input data has been checked, the input data for which the search led to a match, and the identification of the natural persons involved in verifying the results. The operative concept is post-hoc reconstruction — not recording that a decision occurred, but recording enough to re-derive what the system did and why.
Which AI Systems Are In Scope (Annex III)
EU AI Act Annex III defines eight high-risk categories: (1) biometric identification, (2) critical infrastructure management, (3) education and vocational training, (4) employment and workers management, (5) access to essential private and public services — including credit scoring, insurance pricing, and medical triage, (6) law enforcement, (7) migration, asylum, and border control, (8) justice and democratic processes. If your AI agent makes decisions in any of these domains affecting EU residents, Article 12 applies.
Why Logs Do Not Prove Decisions
Standard application logs record events — a request occurred, a function was called, a response was returned. They do not record: the reasoning chain the agent used, the options it weighed, the confidence behind the chosen action, the exact context state at decision time, or whether the record has been modified since capture. Without tamper-evidence, a log proves only that something was written, not what the agent decided. Article 12 requires the latter.
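To make the tamper-evidence gap concrete, here is a minimal sketch (not a production design, all names are illustrative) of a hash chain: each entry's hash covers the previous entry's hash, so any retroactive edit to an earlier record invalidates everything after it. A plain log line has no such property.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry

def append_record(ledger, record):
    """Append a record whose hash chains to the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    payload = json.dumps(record, sort_keys=True)  # canonical serialization
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    ledger.append({"record": record, "prev_hash": prev_hash, "hash": entry_hash})

def verify_chain(ledger):
    """Recompute every hash; any modified entry breaks verification."""
    prev_hash = GENESIS
    for entry in ledger:
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True

ledger = []
append_record(ledger, {"event": "loan_decision", "action": "approve"})
append_record(ledger, {"event": "loan_decision", "action": "deny"})
assert verify_chain(ledger)

ledger[0]["record"]["action"] = "deny"  # retroactive tampering
assert not verify_chain(ledger)         # the chain no longer verifies
```

Hash chaining alone proves internal consistency; pairing it with an asymmetric signature (as described below) additionally proves who wrote the record.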
Eight Fields Required for Post-Hoc Reconstruction
A compliant EU AI Act Article 12 record must contain: (1) Decision intent — the triggering event and objective. (2) Context snapshot — the exact input state, including retrieved data, at decision time. (3) Reasoning chain — how the agent evaluated the situation. (4) Options considered — what alternatives were weighed. (5) Chosen action and confidence score. (6) Outcome — what the execution produced. (7) Provenance — model version, prompt version, agent ID, timestamp. (8) Cryptographic signature — SHA-256 hash + Ed25519 signature proving the record has not been modified.
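The eight fields above can be sketched as a single record builder. This is an illustrative sketch, not a reference schema: all field and parameter names are assumptions, and HMAC-SHA-256 stands in for the Ed25519 signature so the example stays standard-library-only (a real deployment would sign with an Ed25519 private key, e.g. via the `cryptography` package).

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

SIGNING_KEY = b"demo-key"  # illustrative; production uses an Ed25519 private key

def build_decision_record(intent, context, reasoning, options, action,
                          confidence, outcome, model_version, prompt_version,
                          agent_id):
    """Assemble the eight fields, then hash and sign the canonical form."""
    record = {
        "decision_intent": intent,        # (1) triggering event and objective
        "context_snapshot": context,      # (2) input state at decision time
        "reasoning_chain": reasoning,     # (3) how the agent evaluated
        "options_considered": options,    # (4) alternatives weighed
        "chosen_action": {"action": action, "confidence": confidence},  # (5)
        "outcome": outcome,               # (6) what execution produced
        "provenance": {                   # (7) versions, identity, time
            "model_version": model_version,
            "prompt_version": prompt_version,
            "agent_id": agent_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
    }
    canonical = json.dumps(record, sort_keys=True).encode()
    record["integrity"] = {               # (8) hash + signature over fields 1-7
        "sha256": hashlib.sha256(canonical).hexdigest(),
        "signature": hmac.new(SIGNING_KEY, canonical, hashlib.sha256).hexdigest(),
    }
    return record

record = build_decision_record(
    intent="score loan application",
    context={"applicant_id": "A-123", "retrieved_score": 712},
    reasoning=["retrieved bureau score", "compared against threshold 680"],
    options=["approve", "deny", "refer"],
    action="approve", confidence=0.91,
    outcome="loan approved",
    model_version="model-2025-01", prompt_version="v3",
    agent_id="credit-agent-1",
)
assert len(record["integrity"]["sha256"]) == 64
```

Canonical serialization (`sort_keys=True`) matters: the verifier must be able to re-serialize fields 1 through 7 byte-for-byte to check the hash and signature later.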
Implementation with Tenet AI
Tenet captures all eight fields using the TenetClient intent context manager. Initialize once with your API key. Use tenet.intent() to wrap each decision: call intent.snapshot_context() to capture state, intent.decide() to record options and chosen action, and intent.execute() to record the outcome. Every record is automatically SHA-256 hashed and Ed25519 signed at capture time. Records are stored in an append-only ledger with no DELETE path. Retention policy, replay engine, and compliance PDF export are available out of the box.
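The flow described above can be illustrated with a minimal, self-contained stand-in. This is not the real Tenet SDK — the class below is a local stub whose method names follow the description in this article, and whose exact signatures and parameters are assumptions; it hashes each record at capture and appends it to an in-memory list with no delete path.

```python
import hashlib
import json
from contextlib import contextmanager

class _Intent:
    """Captures snapshot, decision, and outcome for one wrapped decision."""
    def __init__(self, record):
        self.record = record

    def snapshot_context(self, **state):
        self.record["context_snapshot"] = state

    def decide(self, options, chosen, confidence, reasoning):
        self.record["decision"] = {"options": options, "chosen": chosen,
                                   "confidence": confidence,
                                   "reasoning": reasoning}

    def execute(self, outcome):
        self.record["outcome"] = outcome

class TenetClientStub:
    """Local stand-in for the described flow: intent -> snapshot -> decide ->
    execute, with a SHA-256 hash sealed at capture time (the real service
    additionally signs with Ed25519 and persists to an append-only ledger)."""
    def __init__(self, api_key):
        self.api_key = api_key
        self.ledger = []  # append-only: no update or delete path

    @contextmanager
    def intent(self, name, objective):
        record = {"intent": name, "objective": objective}
        yield _Intent(record)
        canonical = json.dumps(record, sort_keys=True).encode()
        record["sha256"] = hashlib.sha256(canonical).hexdigest()
        self.ledger.append(record)

tenet = TenetClientStub(api_key="demo")
with tenet.intent("credit_check", objective="score loan application") as intent:
    intent.snapshot_context(applicant_id="A-123", retrieved_score=712)
    intent.decide(options=["approve", "deny", "refer"], chosen="approve",
                  confidence=0.91,
                  reasoning="score above threshold, no adverse flags")
    intent.execute(outcome="loan approved")

assert len(tenet.ledger) == 1 and "sha256" in tenet.ledger[0]
```

The context-manager shape is the important design point: sealing (hash, signature, append) happens automatically when the block exits, so agent code cannot forget to finalize a record.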
LangSmith and Datadog Do Not Satisfy Article 12
LangSmith captures LLM call traces for development debugging — it does not apply cryptographic signing, does not capture context snapshots at the required fidelity, and produces developer-readable output rather than compliance-structured records. Datadog captures infrastructure events — span duration, error rate, memory usage — not decision-level reasoning. Neither tool is designed to produce compliance artifacts. Using them as Article 12 evidence creates regulatory risk: an auditor or regulator who requests post-hoc reconstruction documentation will find the records insufficient.