AI Incidents & Management

Incident documentation is how the field learns. Aviation built decades of safety practice on mandatory reporting, independent investigation, and the public availability of findings. Cybersecurity has matured through CVE registries, vendor disclosure programs, and incident reporting frameworks that let defenders learn from attacks they did not personally experience. AI agent incident documentation is younger, more uneven, and still partly aspirational. Some categories have substantial documented case bases. Others have early research demonstrations and near-misses but few production incidents yet. Some are forward-positioned, documenting the analytical frame and watchpoints for incident categories the field expects to emerge but has not yet seen at scale.

Beyond the archive itself, AI incident management is also a working discipline covering detection, reporting, response, resolution, and prevention. It is required by regulation such as the EU AI Act and expected under voluntary frameworks and standards such as the NIST AI Risk Management Framework and ISO/IEC 42001, especially for high-risk applications such as robotaxis, humanoid robots, medical AI, and financial systems. Proper incident management reduces liability, strengthens operator trust, and produces audit-ready evidence for regulators.

Types of AI Incidents

Cutting across the agent-category structure, AI incidents fall into a smaller set of failure-mode types that recur across physical, personal and ambient, and software agents. The same five types appear whether the agent is a robotaxi, a smart home assistant, or an enterprise software agent, and operators benefit from recognizing the type independently of the agent in which it occurred.

Incident Type	Description	Examples
Safety failures	Incidents where the agent endangers health, safety, or physical wellbeing	Robotaxi collision with pedestrian, robotic surgery malfunction, humanoid drop or strike incident
Bias and fairness issues	Agent produces discriminatory or systematically unfair outcomes across populations	Hiring algorithm rejecting candidates disproportionately, lending model with disparate impact, biased clinical decision support
Data breaches	Exposure or misuse of training data, telemetry, captured material, or model internals	Training data leak, unauthorized model access, exposure of cabin AI recordings, exfiltration through compromised software agent
Operational failures	System outage, degraded performance, or unexpected behavior that disrupts normal operation	Chatbot downtime, model drift producing errors, fleet-wide service disruption, agent loop or escalation incident
Compliance breaches	Violation of legal or regulatory obligations, including disclosure, consent, and record-keeping requirements	Undisclosed deepfake content, GDPR violations, missing EU AI Act conformity assessment, biometric capture without consent
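
The point that the incident type is recognized independently of the agent category can be illustrated with a small tagging structure. This is a minimal sketch, not a standard schema: the names `IncidentType`, `AgentCategory`, `Incident`, and `by_type` are hypothetical, and tagging the cabin recording example with two types is an illustrative assumption rather than a classification taken from the table.

```python
from dataclasses import dataclass
from enum import Enum


class IncidentType(Enum):
    # The five failure-mode types from the table above.
    SAFETY_FAILURE = "safety failure"
    BIAS_FAIRNESS = "bias and fairness issue"
    DATA_BREACH = "data breach"
    OPERATIONAL_FAILURE = "operational failure"
    COMPLIANCE_BREACH = "compliance breach"


class AgentCategory(Enum):
    PHYSICAL = "physical"
    PERSONAL_AMBIENT = "personal and ambient"
    SOFTWARE = "software"


@dataclass(frozen=True)
class Incident:
    summary: str
    agent_category: AgentCategory
    # A single event may carry more than one type (assumption for this sketch).
    incident_types: frozenset


def by_type(incidents, incident_type):
    """Return incidents tagged with the given type, regardless of agent category."""
    return [i for i in incidents if incident_type in i.incident_types]


incidents = [
    Incident("Robotaxi collision with pedestrian", AgentCategory.PHYSICAL,
             frozenset({IncidentType.SAFETY_FAILURE})),
    Incident("Exposure of cabin AI recordings", AgentCategory.PERSONAL_AMBIENT,
             frozenset({IncidentType.DATA_BREACH, IncidentType.COMPLIANCE_BREACH})),
    Incident("Agent loop or escalation incident", AgentCategory.SOFTWARE,
             frozenset({IncidentType.OPERATIONAL_FAILURE})),
]
```

Querying with `by_type(incidents, IncidentType.DATA_BREACH)` returns the cabin recording entry whichever agent category it was filed under, which is the cross-cutting view the table describes.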

Physical Agent Incidents

Physical agent incidents have the most developed case base. Robotaxi and autonomous vehicle incidents include the Cruise pedestrian drag of October 2023 and the operating suspension that followed, Waymo collision and stranding cases, Tesla Autopilot and FSD incidents across their various classifications, and a long tail of less-publicized fleet operational events. Industrial robot incidents include the small but real population of cobot-related injuries and the few documented fatalities in industrial settings over the past two decades. Delivery and mobile robot incidents are mostly minor and underreported but include early Starship and Knightscope cases. Early humanoid incidents are limited by deployment scale but include several documented near-misses in industrial trials. Physical Agent Incidents covers the case base, the patterns observed across types, and the regulatory and operational responses that followed.

Personal & Ambient Agent Incidents

Personal and ambient agent incidents include voice assistant misfire cases where Alexa, Siri, or Google Assistant took unintended action; cabin AI privacy controversies in connected vehicles where recording practices became public; smart glasses recording incidents involving bystanders who did not consent; and chatbot harm cases, including the Character.AI litigation, Replika emotional dependence reports, and other LLM-attributed harms that reached media attention. Medical device AI incidents are partially captured in FDA adverse event reporting, but coverage of AI-specific failure modes is inconsistent. Personal & Ambient Agent Incidents covers the documented cases, the patterns specific to user-colocated AI systems, and the gaps in reporting that leave significant portions of this category undocumented.

Software Agent Incidents

Software agent incidents in production are quieter than in research, because organizations do not publicize agentic AI failures that touch their customers or operations. The publicly documented cases include the Air Canada chatbot tribunal ruling that held the airline accountable for its agent's promises, several court filings that cited cases fabricated by ChatGPT and drew sanctions, GitHub Copilot output disputes over reproduction of licensed code, and a growing population of prompt injection demonstrations that crossed from research into production-impacting events. Enterprise agentic AI incidents are emerging but mostly remain inside organizations rather than reaching public reporting. Software Agent Incidents covers the public case base, the patterns inferable from research and from incidents that did surface, and the reporting gap between research demonstration and production impact.

OTA and Training Data Incidents

This category is forward-positioned. Publicly documented production incidents involving training data poisoning, OTA pipeline compromise, or model update integrity failures in deployed AI agent fleets are rare. Research demonstrations, by contrast, are abundant: data poisoning attacks against image classifiers, text models, and reinforcement learning systems; demonstrated supply chain attacks on model registries and pretrained model distribution; and analytical work on the OTA attack surface across automotive and IoT contexts. The category serves today as a watchlist for incidents the field expects as deployment scale and adversary sophistication grow, framed by the research demonstrations that already show what is possible. OTA & Training Data Incidents covers the documented research, the adjacent OTA incidents from automotive and IoT that did not involve AI specifically but illuminate the surface, and the watchpoints the field is tracking for when this category begins to populate with production cases.
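
One common control for the model update integrity concern named above is to compare a downloaded artifact against a digest obtained through a separate, trusted channel before installing it. The following is a minimal sketch under that assumption; `verify_model_artifact` and `expected_sha256` are hypothetical names, and a digest check alone does not replace signature verification or provenance controls in a real OTA pipeline.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large model artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_artifact(path, expected_sha256):
    """Return True only if the artifact's SHA-256 matches the expected hex digest.

    expected_sha256 is assumed to come from a trusted, out-of-band source.
    """
    actual = sha256_of_file(path)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

An updater built this way would refuse to install when the function returns False and would log the mismatch as a candidate incident for the detection stage.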

Coordinated Multi-Agent Events

Coordinated multi-agent events at the scale the framework anticipates have not yet been publicly documented. The thesis is that autonomous infrastructure enables coordinated criminal logistics and coordinated misbehavior across many agents at once, with qualitative differences from single-agent events. Early signals include research demonstrations of fleet-coordinated attacks, drone swarm exercises with offensive capability, and analytical work showing how compromise of an orchestration layer reaches every agent it controls. Today the category consists of the framework piece plus these early signals, with the expectation that it will populate once fleet deployments reach the scale at which coordinated misuse becomes operationally attractive. Coordinated Multi-Agent Events covers the analytical framework, the early signals, and the watchpoints for when this category begins to see documented incidents at meaningful scale.

Critical Infrastructure Incidents (AI-Connected)

Most documented attacks on critical infrastructure are not AI-mediated. The narrower category covered here is incidents where AI agents, AI sensors, AI telemetry pipelines, or AI decision-support systems were the vector or the corrupted component. Documented cases at this intersection are few but growing: AI-mediated phishing reaching infrastructure operators, AI-driven reconnaissance against control system environments, and demonstrated tampering with AI forecasting and predictive maintenance systems. The category also surfaces relevant work from the broader ICS security community (Dragos public reporting, CISA advisories, MITRE ATT&CK for ICS) where the analysis illuminates how AI components would extend known attack patterns. Critical Infrastructure Incidents covers the documented AI-connected cases, the cross-references to conventional ICS security work that contextualizes them, and the governance gap between AI regulators and infrastructure regulators that leaves this intersection inadequately covered.

Regulatory Responses

Regulatory action on AI agents has been more substantial than incident reporting alone would suggest, because regulators have moved on the basis of capability and projected risk rather than waiting for documented harm at scale. NHTSA issued a Standing General Order requiring crash reporting for vehicles equipped with automated driving systems and Level 2 driver assistance, and has amended it since. The FTC has pursued enforcement against deceptive AI claims and inadequate AI governance. State attorneys general have settled cases involving AI bias, privacy violations, and consumer protection. The EU AI Act entered into force in August 2024 and applies in phases, with the prohibited-practice provisions taking effect first and most high-risk obligations following later. International coordination through G7, OECD, and Council of Europe instruments has produced policy commitments that shape national action. Regulatory Responses covers the documented enforcement actions, the policy responses to specific incidents, and the patterns of regulatory action that operators should track as the field matures.

The Incident Management Lifecycle

Beyond the archive of what has happened, AI incident management is the operational discipline of handling incidents as they occur. The lifecycle runs from detection through reporting, response, resolution, and prevention, with each stage producing artifacts that feed downstream activity. Effective incident management is continuous rather than ad hoc, with established processes, defined ownership, and integration into the broader risk management and compliance programs the operator runs.

Stage	Activities	Deliverables
Detection	Identify anomalies, errors, user-reported harms, regulator inquiries, and adverse outcomes	Alerts, incident tickets, initial classification
Reporting	Log incident details, notify internal stakeholders, file required regulator notifications	Incident report form, regulator notifications, customer disclosures where required
Response	Contain the incident, take corrective action, escalate as needed, preserve evidence	Corrective action logs, escalation records, evidence preservation
Resolution	Restore normal operation, mitigate ongoing impact, update operational documentation	Resolution plan, updated risk register, post-incident report
Prevention	Update processes, controls, training, and design to reduce recurrence likelihood	Lessons-learned report, updated protocols, control improvements
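
The lifecycle in the table can be sketched as an ordered sequence of stages in which each stage must leave at least one artifact before the incident moves on. This is a minimal illustration, not a prescribed process: `Stage` and `IncidentRecord` are hypothetical names, and the rule that a stage cannot be closed without an artifact is an assumption chosen to reflect the statement that each stage feeds downstream activity.

```python
from enum import Enum


class Stage(Enum):
    DETECTION = 1
    REPORTING = 2
    RESPONSE = 3
    RESOLUTION = 4
    PREVENTION = 5


class IncidentRecord:
    """Tracks one incident through the five lifecycle stages."""

    def __init__(self, incident_id):
        self.incident_id = incident_id
        self.stage = Stage.DETECTION
        self.artifacts = {stage: [] for stage in Stage}

    def attach(self, artifact):
        """Record a deliverable (ticket, notification, log) against the current stage."""
        self.artifacts[self.stage].append(artifact)

    def advance(self):
        """Move to the next stage; refuse if the current stage produced nothing."""
        if not self.artifacts[self.stage]:
            raise ValueError(f"{self.stage.name} has no artifacts yet")
        if self.stage is Stage.PREVENTION:
            raise ValueError("PREVENTION is the final stage")
        self.stage = Stage(self.stage.value + 1)
        return self.stage
```

For example, a record created at detection accepts `attach("incident ticket")` and then `advance()` moves it to reporting, while calling `advance()` on a stage with no artifacts raises an error.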

Cross-Sector Examples

The same incident types and lifecycle apply across sectors, but the specific manifestation and the regulatory drivers vary substantially. Healthcare incidents engage FDA and medical device regulation. Financial services incidents engage fair lending laws, SEC oversight, and model risk management requirements. Mobility incidents engage NHTSA, FMCSA, DOT, UNECE, and the EU AI Act high-risk provisions. Employment and HR incidents engage EEOC, state and municipal AI hiring laws including the New York City bias audit requirement, and disability rights frameworks. The cross-sector view illustrates how the same underlying agent failure can present, be reported, and be remediated differently depending on the sector in which it occurred.

Sector	Incident Example	Regulatory Drivers
Healthcare	AI misdiagnosis leads to delayed treatment	FDA SaMD framework, EU Medical Device Regulation, HIPAA, clinical decision support guidance
Finance	Unexplained denial of loans due to model bias	Fair lending laws, SEC, CFPB, model risk management guidance including SR 11-7
Mobility and transport	Robotaxi fails to respond to emergency vehicles	NHTSA, FMCSA, DOT, UNECE, EU AI Act high-risk provisions
Employment and HR	AI screening excludes candidates with certain accents or backgrounds	EEOC, NYC bias audit law, EU AI Act, state and municipal AI hiring statutes
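
An operator could keep the table's sector-to-driver mapping as data so that the reporting stage starts from a checklist of regimes to review. The sketch below copies the entries from the table; `REGULATORY_DRIVERS` and `drivers_for` are hypothetical names, and the list is a starting point for review, not a determination of which obligations apply to a given incident.

```python
# Sector -> regulatory drivers, copied from the table above.
REGULATORY_DRIVERS = {
    "healthcare": [
        "FDA SaMD framework", "EU Medical Device Regulation", "HIPAA",
        "clinical decision support guidance",
    ],
    "finance": [
        "Fair lending laws", "SEC", "CFPB",
        "model risk management guidance including SR 11-7",
    ],
    "mobility and transport": [
        "NHTSA", "FMCSA", "DOT", "UNECE", "EU AI Act high-risk provisions",
    ],
    "employment and hr": [
        "EEOC", "NYC bias audit law", "EU AI Act",
        "state and municipal AI hiring statutes",
    ],
}


def drivers_for(sector):
    """Return a copy of the regulatory drivers recorded for a sector (case-insensitive)."""
    key = sector.strip().lower()
    if key not in REGULATORY_DRIVERS:
        raise KeyError(f"no regulatory drivers recorded for sector: {sector!r}")
    return list(REGULATORY_DRIVERS[key])
```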
