
How We Log AI Security Incidents: Source Verification, Dating, and What Doesn't Make the Cut

The methodology behind AI Incidents — how we verify sources, date-stamp claims, and decide what's news vs noise in the AI security incident beat.

By Theo Voss · 7 min read

When a reader trusts an incident publication, what they’re actually trusting is the methodology behind the headlines — the editorial decisions about what to publish, what to verify, and what to leave out. That methodology should be visible. This post is ours.

What counts as an AI security incident

We classify an event as an AI security incident if it meets at least two of these criteria:

  1. AI-system-mediated harm: the harm flows through the behavior of an ML model or LLM, not just from a system that happens to use AI internally.
  2. Adversarial or unintended cause: the incident was caused by deliberate attack (prompt injection, data poisoning, model extraction) or unintentional misalignment (hallucinated outputs, reward hacking, safety failures).
  3. Material impact: actual users, customers, or third parties were affected — not just a published vulnerability that was patched before exploitation.
  4. First-party or credible third-party reporting: there’s a primary source we can cite — vendor advisory, court filing, regulator action, peer-reviewed disclosure, or reporting from a publication that demonstrates verification.

If an event meets fewer than two, we may track it internally but won’t publish until the criteria converge. This filters out a lot of “AI did a bad thing on Twitter” incidents that don’t have enough signal to verify.
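For the implementation-minded, here is a minimal sketch of that two-of-four gate in Python. The type and field names are hypothetical, for illustration only, not our actual schema:

```python
from dataclasses import dataclass

# Hypothetical field names; illustrative only, not our production schema.
@dataclass
class CandidateEvent:
    ai_mediated_harm: bool           # 1. harm flows through the model's behavior
    adversarial_or_unintended: bool  # 2. deliberate attack or unintentional misalignment
    material_impact: bool            # 3. real users or third parties affected
    citable_source: bool             # 4. first-party or credible third-party source

def is_ai_security_incident(event: CandidateEvent) -> bool:
    """Classify as an AI security incident when at least two criteria hold."""
    met = [
        event.ai_mediated_harm,
        event.adversarial_or_unintended,
        event.material_impact,
        event.citable_source,
    ]
    return sum(met) >= 2
```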

Source tiers

We rank sources in five tiers. As a rule, an incident publishes only when we have at least one Tier 1 or Tier 2 source; the one exception is flagged below.

Tier 1 — Primary: Vendor advisories, regulatory actions, court filings, organizational incident reports, peer-reviewed papers documenting the event.

Tier 2 — Established secondary: Reporting from publications with demonstrated verification practice (Reuters, Bloomberg, Wired security desk, KrebsOnSecurity, 404 Media, The Markup) where the reporter has independently confirmed the claim.

Tier 3 — Specialty secondary: Reporting from AI-specific publications, research org statements, social media posts from named researchers with reputation at stake.

Tier 4 — User reports: Individual users describing experiences. Triangulated only — a single Twitter thread is not enough; a pattern across multiple independent accounts may be.

Tier 5 — Speculation and aggregator content: AI-summarized news posts, anonymous Reddit claims, unverified screenshots. Not citable.

When we publish from Tier 3 or Tier 4 sources alone (the exception to the rule above), we mark the incident as “verification: partial” and update it if a stronger source emerges.
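In code, the publish gate reads roughly like this. This is a sketch assuming the tier numbering above; the function name and return strings are ours, for illustration:

```python
from enum import IntEnum

class SourceTier(IntEnum):
    PRIMARY = 1                # advisories, filings, regulator actions, papers
    ESTABLISHED_SECONDARY = 2  # independently verified reporting
    SPECIALTY_SECONDARY = 3    # AI-specific press, named researchers
    USER_REPORT = 4            # individual accounts; triangulated only
    SPECULATION = 5            # aggregators, anonymous claims; not citable

def verification_status(source_tiers: list[SourceTier]) -> str:
    """Tier 5 never counts; Tier 1-2 publishes as verified;
    Tier 3-4 alone publishes flagged as partially verified.
    (Tier 4 additionally requires triangulation across accounts.)"""
    citable = [t for t in source_tiers if t is not SourceTier.SPECULATION]
    if not citable:
        return "do not publish"
    if min(citable) <= SourceTier.ESTABLISHED_SECONDARY:
        return "verified"
    return "verification: partial"
```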

Dating

Every incident gets four dates when available:

  1. Occurrence: when the incident actually happened.
  2. Discovery: when it was first detected.
  3. Disclosure: when it became public.
  4. Publication: when our entry went live.

The four dates often differ by weeks or months; conflating them leads to misleading timelines. A vendor disclosure on April 30 about an incident from January 12 is two events, not one.

For undated claims (especially in academic disclosures), we mark the date as approximate and explain the reasoning in the entry.
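As a record, the scheme might look like the following sketch. The field names and the example years are illustrative, not drawn from a real entry:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class IncidentDates:
    occurred: date | None = None    # when the incident happened
    discovered: date | None = None  # when it was first detected
    disclosed: date | None = None   # when it became public
    published: date | None = None   # when our entry went live
    approximate: bool = False       # flagged for undated or estimated claims

# An April 30 disclosure of a January 12 incident keeps both dates
# distinct (the year is arbitrary, for illustration):
example = IncidentDates(occurred=date(2024, 1, 12),
                        disclosed=date(2024, 4, 30))
```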

What we do not publish

  1. Events that meet fewer than two of the classification criteria above.
  2. Claims supported only by Tier 5 sources.
  3. Published vulnerabilities patched before any exploitation, with no material impact.

This is not a comprehensive list. Editorial judgment applies on novel cases.

Corrections

When we get something wrong, we issue a correction with a visible diff at the bottom of the affected post. Silent edits are not allowed. The correction includes:

  1. The date the correction was made.
  2. The original text and the corrected text (the visible diff).
  3. A note on what was wrong and which source showed it.

The correction-log page lists every correction across the publication, oldest first.
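A minimal sketch of one correction-log entry, with hypothetical field names:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Correction:
    corrected_on: date
    original_text: str   # what the post said before; one side of the diff
    corrected_text: str  # what it says now; the other side
    reason: str          # what was wrong and which source showed it

def correction_log(corrections: list[Correction]) -> list[Correction]:
    """The publication-wide correction log, oldest first."""
    return sorted(corrections, key=lambda c: c.corrected_on)
```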

Linking

External links go to the primary source when one exists, and otherwise to the Tier 1 or Tier 2 reporting closest to the event.

We don’t link to aggregators, AI summaries, or auto-generated news sites, even when they have higher search ranking.

Cross-references

Where applicable, every incident is cross-referenced to:

  1. Its entry in the AI Incident Database (AIID).
  2. The matching record in the OECD AI Incidents Monitor.
  3. The relevant MITRE ATLAS case study, where one exists.

The goal is that a reader investigating an incident can rebuild the full evidence trail without leaving documented sources.
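In an entry’s data, that can be as simple as a map from database to record identifier. This is a sketch; the placeholder IDs do not point at real records:

```python
# Hypothetical cross-reference block for one incident entry;
# the "..." identifiers stand in for real record IDs.
cross_references = {
    "aiid": "...",         # AI Incident Database entry
    "oecd_aim": "...",     # OECD AI Incidents Monitor record
    "mitre_atlas": "...",  # related ATLAS case study, if any
}
```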

Update policy

Incidents are not static. We update entries when a stronger source emerges, when new facts materially change the picture, or when a correction is required.

Each update carries a date and a one-line summary at the top of the entry. We do not retroactively edit out earlier framings — the history of how an incident was understood is itself part of the record.
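A sketch of that append-only history, with illustrative names:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Update:
    on: date
    summary: str  # the one-line note shown at the top of the entry

@dataclass
class IncidentEntry:
    body: str                                       # current text of the entry
    history: list[Update] = field(default_factory=list)

def record_update(entry: IncidentEntry, update: Update) -> None:
    """Newest update first; nothing is ever removed from the history."""
    entry.history.insert(0, update)
```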

Why this matters

The AI security beat is full of high-velocity, low-rigor reporting. Twitter threads claim breaches that didn’t happen. AI-summarized news sites republish each other’s errors. A serious reader who wants to understand what’s actually happening needs a publication that does the verification work and shows its sources.

That’s our job. The methodology is the product.

Sources

  1. AI Incident Database (Partnership on AI)
  2. OECD AI Incidents Monitor
  3. MITRE ATLAS Case Studies