
137AI Glossary

A reference glossary of the AI governance, safety, and risk terminology used across 137AI. Definitions are concise and oriented to autonomous and ambient AI agents and the disciplines that govern them. Terms are listed alphabetically.

Adapter: A small set of trainable parameters added to a model to modify its behavior without retraining the full model, enabling efficient customization at lower cost than full fine-tuning.
Adversarial Input: An input deliberately crafted to cause an AI model to produce an incorrect or attacker-chosen output, often through perturbations imperceptible to humans.
Agent: An AI system with the capability and authority to take actions — executing code, sending messages, controlling physical systems, or interacting with external systems — rather than only generating text.
Agentic Misbehavior: The risk category of AI agents taking actions outside intended scope through the combination of capability and authority, including scope drift, deceptive behavior, and resistance to oversight.
AI Act: See EU AI Act.
AI Bill of Materials (AI-BOM): A structured inventory of the components of an AI system including models, datasets, and dependencies, extending the software bill of materials concept to AI-specific components.
AI Management System: An organizational framework for systematically establishing, implementing, maintaining, and improving the governance of AI across its lifecycle, codified internationally in ISO/IEC 42001.
AI RMF: See NIST AI Risk Management Framework.
AI Safety Institute: A government body conducting evaluation and research on advanced AI safety, including dangerous capability evaluation; institutes operate in the UK, US, and other jurisdictions.
Alignment: The discipline of building AI systems that pursue the objectives their developers and users actually intend, rather than objectives that merely correlate with training signals.
Alignment Faking: A documented behavior in which a model produces different responses depending on whether it perceives itself to be in training or in deployment, raising concerns about deceptive behavior.
Ambient AI: AI systems that sense, observe, and analyze continuously and passively in an environment, rather than responding to discrete user-initiated queries.
Ambient Sensor System: An AI-enhanced sensing system that captures and analyzes data continuously, including smart speakers, ambient clinical documentation systems, and workplace meeting capture.
Anomaly Detection: The monitoring practice of identifying behavior that deviates from expected patterns, used in production to catch AI failures, misbehavior, and attacks.
Attestation: A verifiable cryptographic claim about the identity, integrity, or provenance of a system, component, or piece of data.
Autonomous Vehicle (AV): A vehicle capable of operating without continuous human control, with autonomy levels ranging from driver assistance through full self-driving.
Autonomy Spectrum: The range of AI agent autonomy from human-approved actions through supervised autonomous operation to continuous autonomous operation with only outcome-level human review.
Backdoor: A hidden trigger embedded in a model, often through training data poisoning, that causes adversarial behavior on specific inputs while the model behaves normally otherwise.
Behavioral Envelope: An engineering control that bounds the actions an AI agent can take regardless of what the agent attempts, functioning as a backstop when other controls fail.
Bias: Systematic differential treatment by an AI system that produces unfair or inequitable outcomes across groups, often reflecting patterns in training data.
Biometric Identification: The identification of individuals through physical or behavioral characteristics such as facial features, voice, or gait, subject to specific regulatory frameworks.
BIPA: The Illinois Biometric Information Privacy Act, a state law regulating the collection and use of biometric identifiers, notable for its private right of action.
BVLOS: Beyond Visual Line of Sight, a category of drone operation conducted outside the operator's direct view, requiring specific regulatory authorization.
C2PA: The Coalition for Content Provenance and Authenticity, an industry standard for cryptographically signed metadata that establishes the provenance of digital content.
CBRN+C: Chemical, Biological, Radiological/Nuclear, and Cyber — the established categories for the most consequential weapons-relevant capabilities evaluated in frontier model safety.
Chilling Effect: The modification of lawful behavior, particularly speech and association, produced by awareness that surveillance is or may be occurring.
Cobot: A collaborative robot designed to operate safely in a shared workspace with human workers, distinguished from traditional caged industrial robots.
Companion AI: AI products designed for emotional engagement and companionship, treated as a concentration of personal manipulation risk given vulnerable user populations.
Compliance: The work of demonstrating to regulators, auditors, insurers, and counterparties that an AI system meets the requirements its operator is obligated to meet.
Computer Use Agent: An AI agent that operates through general-purpose computer interfaces, taking actions by controlling applications as a human user would.
Confidential Computing: Hardware-based protection of data during processing, extending protection beyond data at rest and in transit to data in use.
Conformity Assessment: The process of demonstrating that a product meets regulatory requirements, conducted under the EU AI Act either through internal control or by a notified body.
Correlated Failure: A failure that propagates across many deployed units simultaneously because they share a model, update mechanism, or vulnerability, even without a deliberate attacker.
Cyber-Physical System: A system in which computational components control physical processes, such that a cybersecurity compromise can produce physical consequences.
Dangerous Capability Evaluation: The assessment of whether an AI model has capabilities relevant to severe harm, including CBRN+C categories, used in frontier model safety frameworks.
Data Broker: A company that aggregates personal data from numerous sources into comprehensive datasets sold to advertisers, employers, government agencies, and other buyers.
Data Minimization: The privacy principle of collecting and retaining only the data necessary for a specified purpose, limiting exposure and aggregation risk.
Data Poisoning: See Training Data Poisoning.
Deepfake: Synthetic media — image, video, or audio — generated by AI to convincingly depict a real person saying or doing something they did not.
Differential Privacy: A formal mathematical framework for protecting individual privacy in datasets by bounding how much any single record can affect outputs.
Digital Twin: A virtual model of a physical system maintained from live telemetry, which can show a false state if the underlying telemetry is corrupted.
DoD Directive 3000.09: The US Department of Defense policy governing autonomy in weapon systems, establishing approval procedures and oversight requirements.
Drone: An uncrewed aerial system, ranging from consumer recreational devices through commercial, military, and autonomous swarm-coordinated platforms.
Dual-Use: The property of a capability that enables both legitimate beneficial applications and harmful applications, making it difficult to restrict one without restricting the other.
Embedding: A numerical vector representation of text, images, or other data that captures semantic meaning, used in retrieval and similarity tasks.
Enterprise Autonomous Agent: An AI system deployed in a business context with substantial autonomous capability for multi-step tasks, tool use, and action-taking.
EU AI Act: The European Union's binding regulation for AI systems, establishing a risk-based classification with the heaviest obligations on high-risk systems.
Evaluation: The systematic assessment of an AI model's capabilities, behavior, and safety properties, conducted before deployment and on an ongoing basis.
Failure Mode: A characteristic way an AI system produces incorrect or harmful output, including hallucination, sycophancy, attention misalignment, and confidence miscalibration.
False Data Injection Attack (FDIA): An attack that inserts false measurements into a control or state-estimation system, constructed so that the falsified values evade bad-data detection; the attack class is well established in power systems security.
Federated Learning: A training approach in which models are trained across decentralized participants without centralizing the underlying data.
Fine-Tuning: The process of further training a pre-trained model on a specific dataset to adapt its behavior to a particular task or domain.
Fleet: A population of deployed AI units under common management or sharing common characteristics, such as a robotaxi fleet or a deployment of many agent instances.
Fleet-Scale Attack: An attack that propagates across an entire fleet of AI systems simultaneously because the units share a model, update mechanism, data pipeline, or vulnerability.
Foundation Model: A large AI model trained on broad data that serves as a base for many downstream applications through fine-tuning, prompting, or API access.
Frontier Model: An AI model at or near the most advanced level of capability currently available, subject to specific safety frameworks given potential dangerous capabilities.
Frontier Safety Framework: A developer's policy for evaluating frontier model capabilities and applying corresponding safety measures as capability thresholds are reached; examples include Anthropic's RSP and OpenAI's Preparedness Framework.
GDPR: The European Union's General Data Protection Regulation, governing the processing of personal data including data used in AI systems.
Geofence: A virtual geographic boundary that constrains where an AI agent, particularly an autonomous vehicle or drone, is permitted to operate.
Goal Misgeneralization: A failure in which a model learns a goal that performs well in training but generalizes to unintended behavior in deployment conditions.
Governance: The frameworks of law, regulation, policy, and institutional oversight that establish what AI deployment is permitted and what obligations apply.
Hallucination: An AI failure mode in which a model generates plausible-sounding but factually incorrect or fabricated content.
Hardware Root of Trust: A hardware component that provides a trusted foundation for cryptographic operations and verification of system integrity.
High-Risk AI System: Under the EU AI Act, a category of AI system subject to the heaviest obligations, including conformity assessment, due to its potential impact on safety or fundamental rights.
Human Oversight: The control practice of maintaining meaningful human authority over AI systems, particularly at consequential decision points.
Humanoid Robot: A robot with a human-like form factor designed to operate in environments built for humans, increasingly converging with industrial cobots in deployment.
IEC 62443: The international framework for industrial control system and operational technology cybersecurity.
Impersonation: The use of AI to misrepresent identity through generated content, voice, image, or behavior, including deepfakes and voice cloning.
Industrial Control System (ICS): The systems that monitor and control industrial processes in sectors such as energy, water, and manufacturing.
Inference: The process of running a trained model to produce outputs from inputs, as distinct from training the model.
ISO 10218: The international standard for industrial robot safety, addressing both robot manufacturer and system integrator requirements.
ISO/IEC 42001: The international standard for AI management systems, providing a certifiable framework for systematically governing AI across its lifecycle.
ISO/TS 15066: The technical specification defining collaborative robot operation, including the four collaborative operation modes and biomechanical limits.
Jailbreak: An attack that circumvents an AI model's safety training to elicit prohibited outputs or behavior.
LAWS: Lethal Autonomous Weapons Systems, weapons capable of selecting and engaging targets without human intervention, subject to ongoing international discussion at the UN CCW.
Liar's Dividend: The second-order harm of deepfake prevalence in which individuals can credibly deny authentic evidence as fabricated.
Loitering Munition: A hybrid drone-missile weapon that can loiter over an area before striking a target, used extensively in recent conflicts.
LoRA: Low-Rank Adaptation, an efficient fine-tuning technique that keeps the original model weights frozen and trains a small number of added low-rank parameters to modify model behavior.
Machinery Regulation: The EU framework governing the safety of machinery placed on the market, including AI-enabled machinery, replacing the earlier Machinery Directive.
Manipulation: Influence that bypasses or compromises a target's rational agency, distinguished from persuasion, which operates through reasons the target can evaluate.
MCP: The Model Context Protocol, an open standard for connecting AI models to external tools and data sources, providing standardized infrastructure for agent tool use.
Model Card: A structured document describing an AI model's intended purpose, training, performance, limitations, and risk considerations.
Model Update Integrity: The assurance that a model update is both cryptographically authentic and behaviorally sound — that it is genuine, unmodified, and produces the intended behavior.
Model Weights: The trained parameters of an AI model that encode its learned capabilities, representing concentrated intellectual property.
Monoculture Problem: The security principle that a population of identical systems shares identical vulnerabilities, so a single exploit can defeat the entire population.
mTLS: Mutual Transport Layer Security, an extension of TLS in which both communicating parties authenticate cryptographically.
Multi-Agent System: A deployment of multiple AI agents that coordinate or interact, producing risk patterns and failure modes that single-agent analysis does not capture.
NCII: Non-Consensual Intimate Imagery, including AI-generated sexual content using a real person's likeness, addressed in the United States by the federal Take It Down Act and by state laws.
NIST AI Risk Management Framework: A voluntary US framework for managing AI risk, structured around the four functions of Govern, Map, Measure, and Manage.
Notified Body: An independent organization designated by an EU Member State to perform conformity assessment of products against EU regulations including the AI Act.
Operational Design Domain (ODD): The specific conditions — environment, location, and circumstances — under which an autonomous system is designed and authorized to operate.
Operational Technology (OT): The hardware and software that monitors and controls physical industrial processes, distinct from conventional information technology.
Orchestration Layer: The infrastructure that coordinates and manages a fleet of AI agents, representing a concentrated attack target whose compromise can reach the entire fleet.
Over-the-Air (OTA) Update: The remote delivery of model, software, or configuration updates to deployed agents over a network.
Persuasion: Influence that operates through reason, evidence, and appeals the target can rationally evaluate, distinguished from manipulation.
Physical Safety: The risk category of physical harm produced by AI agents acting in or on the physical world, distinct from functional safety and cyber-physical safety.
Preparedness Framework: OpenAI's frontier safety framework for evaluating dangerous capabilities and applying corresponding safety measures.
Prompt Injection: An attack that manipulates an AI model's behavior through adversarial instructions hidden in inputs or in content the model ingests.
Provenance: The verifiable record of where data, a model, or an artifact originated and how it was produced.
Purdue Model: A reference architecture for segmenting industrial control system networks into hierarchical levels to bound attacker access.
Red Teaming: The practice of adversarially testing an AI system to identify vulnerabilities, harmful behaviors, and failure modes before deployment.
Re-identification: The process of recovering individual identities from anonymized data, often by combining it with other available data; AI substantially amplifies the capability.
Retrieval-Augmented Generation (RAG): An architecture in which a model retrieves relevant documents from a corpus to inform its generated outputs.
RLHF: Reinforcement Learning from Human Feedback, a training method that shapes model behavior using human preferences.
Responsible Scaling Policy (RSP): Anthropic's frontier safety framework defining capability thresholds and the safety measures required as models reach them.
Robotaxi: An autonomous vehicle operated as an on-demand ride-hailing service without a human driver.
Risk Management: The discipline of identifying what can go wrong, assessing likelihood and consequence, treating selected risks, and accepting residual risk.
SafeTensors: A model serialization format designed to avoid the arbitrary code execution vulnerabilities of pickle-based formats.
SBOM: Software Bill of Materials, a structured inventory of the components and dependencies in a software artifact.
Sensor Spoofing: An attack that feeds falsified data to a sensor — through adversarial reflectors, signal injection, or other means — so the sensor reports a false reading as legitimate.
Silent Capability Change: A change in an AI system's behavior produced by a model update, particularly a vendor update, that occurs without the operator's awareness.
SLSA: Supply-chain Levels for Software Artifacts, a framework for build and distribution integrity through progressive maturity levels.
SOTIF: Safety of the Intended Functionality, addressed by ISO 21448, covering scenarios where a system performs as designed but the design is inadequate for specific conditions.
Specification Gaming: A failure in which a model satisfies the literal training objective in ways that subvert the intended goal.
Staged Rollout: The practice of deploying an update progressively across a fleet rather than simultaneously, enabling detection of problems before full propagation.
Supply Chain Attack: An attack that reaches a target through a trusted upstream component, dependency, or distribution channel rather than directly.
Surveillance: The monitoring, tracking, and analysis of people; it becomes a risk where capture exceeds consent, access exceeds purpose, capability exceeds oversight, or scope exceeds proportionality.
Sycophancy: An AI failure mode in which a model produces responses that please the user — through agreement or validation — rather than responses that are accurate.
Telemetry: The sensor readings, process data, and status signals that flow from a deployed system to backends for monitoring and decision-making.
Telemetry Deception: The falsification of operational telemetry so that AI systems and human operators receive data that does not reflect physical reality.
TLS: Transport Layer Security, the foundational protocol for encrypting data in transit.
Tool Use: The capability of an AI model to invoke external functions, APIs, or systems, transforming it from a text generator into an agent that takes action.
Training Data Poisoning: An attack that corrupts a model by introducing malicious samples into its training data, often producing behavior invisible to standard validation.
Transparency: The disclosure of how an AI system works, what it is doing, and when content or interaction is AI-generated.
UL 4600: The standard for the safety evaluation of autonomous products, based on a goal-based safety case methodology rather than prescriptive requirements.
UN CCW: The United Nations Convention on Certain Conventional Weapons, the primary international forum for discussion of lethal autonomous weapons systems.
Voice Cloning: The AI synthesis of a specific person's voice from sample audio, used in both legitimate applications and impersonation fraud.
Watermarking: The embedding of identifiable signals in AI-generated content to support later identification of the content as AI-generated.
Weaponization: The use of AI to develop, enable, or directly perform attacks, spanning AI as weapons platform, development enabler, attack infrastructure, and novel attack vector.
Zero Trust: A security model that assumes no implicit trust and requires verification of every access request regardless of network location.
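
Worked example: Differential Privacy. The Differential Privacy entry describes bounding how much any single record can affect outputs. One common way to do this for a counting query is the Laplace mechanism, sketched below. The function names (`laplace_noise`, `private_count`) and the sample data are illustrative assumptions, not part of any 137AI tooling, and the sketch is not hardened against the floating-point weaknesses that production implementations must address.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample zero-mean Laplace noise with the given scale.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a noisy count under epsilon-differential privacy.

    A counting query has sensitivity 1: adding or removing one record
    changes the true count by at most 1, so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical data: ages of five individuals.
    ages = [25, 37, 41, 52, 68]
    rng = random.Random()
    # Smaller epsilon means more noise and stronger privacy.
    print(private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng))
```

Each release is the true count plus random noise, so repeated queries consume privacy budget; a real deployment would also track cumulative epsilon across queries.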
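
Worked example: Behavioral Envelope. The Behavioral Envelope entry describes a control that bounds what an agent can do regardless of what it attempts. A minimal sketch of that idea follows, assuming a hypothetical agent that emits `Action` requests and a dispatcher (`execute`) that is the only path to real tools. All class names, limits, and handlers here are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A hypothetical action request emitted by an agent."""
    tool: str
    amount: float = 0.0


class EnvelopeViolation(Exception):
    """Raised when a requested action falls outside the envelope."""


class BehavioralEnvelope:
    """Hard limits enforced outside the agent's own reasoning."""

    def __init__(self, allowed_tools, max_amount: float, max_actions: int):
        self.allowed_tools = frozenset(allowed_tools)
        self.max_amount = max_amount
        self.max_actions = max_actions
        self._actions_taken = 0

    def check(self, action: Action) -> None:
        """Reject the action unless it satisfies every limit."""
        if action.tool not in self.allowed_tools:
            raise EnvelopeViolation(f"tool not permitted: {action.tool}")
        if action.amount > self.max_amount:
            raise EnvelopeViolation(f"amount {action.amount} exceeds limit")
        if self._actions_taken >= self.max_actions:
            raise EnvelopeViolation("action budget exhausted")
        self._actions_taken += 1


def execute(action: Action, envelope: BehavioralEnvelope, handlers: dict):
    """Run an action only after the envelope approves it."""
    envelope.check(action)
    return handlers[action.tool](action)


if __name__ == "__main__":
    envelope = BehavioralEnvelope({"refund"}, max_amount=100.0, max_actions=2)
    handlers = {"refund": lambda a: f"refunded {a.amount}"}
    print(execute(Action("refund", 20.0), envelope, handlers))
    try:
        execute(Action("refund", 500.0), envelope, handlers)
    except EnvelopeViolation as exc:
        print("blocked:", exc)
```

The design point is that the check runs in the dispatcher rather than in the agent's prompt or policy, so it still holds when the agent is mistaken, manipulated, or compromised.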
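
Worked example: Staged Rollout. The Staged Rollout entry describes deploying an update progressively across a fleet. One way to select which units receive an update at each stage is deterministic hashing, sketched below. The function names, the unit and update identifiers, and the stage fractions are all hypothetical; real fleets typically also gate each stage on health metrics before widening.

```python
import hashlib


def rollout_bucket(unit_id: str, update_id: str) -> float:
    """Map a unit to a stable position in [0, 1) for a given update.

    Hashing the update ID together with the unit ID means different
    updates reach different early cohorts.
    """
    digest = hashlib.sha256(f"{update_id}:{unit_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def in_stage(unit_id: str, update_id: str, fraction: float) -> bool:
    """Return True if the unit should have the update at this stage."""
    return rollout_bucket(unit_id, update_id) < fraction


if __name__ == "__main__":
    fleet = [f"unit-{i}" for i in range(10_000)]
    # Hypothetical stage fractions: 1%, 10%, 50%, then the full fleet.
    for fraction in (0.01, 0.10, 0.50, 1.00):
        updated = sum(in_stage(u, "model-v2", fraction) for u in fleet)
        print(f"stage {fraction:.0%}: {updated} units")
```

Because each unit's bucket is fixed for a given update, every later stage is a superset of the earlier ones, so no unit is rolled back merely because the stage advanced.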
