AI Controls

AI controls are the engineering mechanisms that turn policy into practice and risk treatment into operational reality. Governance defines what must be true. Compliance verifies that it is true. Controls are the machinery that makes it true. For autonomous and ambient AI agents, controls have to bound a structural attack surface rather than close it, because the surface is the value the system was built to deliver. The work is to make the surface auditable, attributable, constrained, and recoverable when exploited, through cryptographic identity, behavioral envelopes, access scoping, secure boot and signed updates, telemetry integrity, consent and capture controls, runtime monitoring, intervention authority, fleet coordination, and integration with operational technology environments.

Agent Identity and Cryptographic Attestation

Every autonomous agent needs an identity that is hard to forge, easy to verify, and tied to an accountable owner or operator. The identity supports attribution after an incident, authorization for actions that require trust, and the foundation for every other control that depends on knowing which agent is acting. The mechanisms include hardware roots of trust that bind identity to a specific device, cryptographic attestation that lets an agent prove its identity and configuration to a verifier, signed manifests that bind the agent's software and policy to its identity, and visible registration that lets people in proximity confirm what they are interacting with. A robotaxi without a verifiable identity is a vehicle that no investigator can attribute after an incident. A humanoid without attestation is a robot whose claimed configuration cannot be trusted. A software agent without identity is an action source that cannot be held to permission scope. Agent Identity & Cryptographic Attestation covers the identity primitives, the attestation protocols, and the registration and disclosure practices across physical, personal and ambient, and software agents.
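
As a minimal sketch of how a signed manifest can bind an agent's software and policy to its identity, the example below signs and verifies a manifest. It assumes HMAC-SHA256 from the Python standard library as a stand-in for the asymmetric, hardware-backed signatures a real deployment would more likely use. The function names, the manifest fields, and the demo key are all hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(manifest: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash identical bytes."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_manifest(manifest: dict, device_key: bytes) -> str:
    # Assumption: device_key stands in for a key protected by a hardware root of trust.
    return hmac.new(device_key, canonical_bytes(manifest), hashlib.sha256).hexdigest()


def verify_manifest(manifest: dict, signature: str, device_key: bytes) -> bool:
    expected = sign_manifest(manifest, device_key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


# Hypothetical manifest binding identity, operator, software hash, and policy version.
device_key = b"demo-key-not-for-production"
manifest = {
    "agent_id": "agent-0001",
    "operator": "example-operator",
    "software_sha256": hashlib.sha256(b"firmware image bytes").hexdigest(),
    "policy_version": 3,
}
signature = sign_manifest(manifest, device_key)

# Changing any bound field invalidates the signature.
tampered = dict(manifest, policy_version=4)
```

A verifier that holds the expected key can then reject both a modified manifest and a manifest signed by a different device.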

Behavioral Constraints and Safety Envelopes

An agent's capability is broader than its acceptable behavior. A humanoid can apply force; the safety envelope says how much and where. A robotaxi can drive anywhere its planner finds a route; the geofence says where it is allowed to go. A software agent can call any API it has credentials for; the action policy says which calls require approval. Behavioral constraints are enforced at the action layer rather than the model layer because model-layer guarantees are not strong enough to rely on alone. A model trained to avoid an action can still be tricked into taking it through prompt injection, adversarial input, or distribution shift. An action-layer constraint that refuses the prohibited action at the point of execution holds even when the model fails. The constraint categories include force limits, geofences and zone restrictions, protected action classes that the agent will not perform under any instruction, and approval thresholds that escalate to human review. Behavioral Constraints & Safety Envelopes covers the constraint categories, the enforcement architectures, and the patterns for keeping constraints in sync with the agent's evolving capability.
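
The four constraint categories above can be sketched as a single action-layer check that runs after the model proposes an action and before anything executes. This is an illustrative sketch under stated assumptions: the `Action` and `Envelope` types, the rectangular geofence, and the cost-based approval threshold are hypothetical simplifications, not a reference design.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


@dataclass(frozen=True)
class Action:
    kind: str
    force_newtons: float = 0.0
    x: float = 0.0
    y: float = 0.0
    cost: float = 0.0


@dataclass(frozen=True)
class Envelope:
    max_force_newtons: float
    geofence: tuple  # assumed rectangle: (min_x, min_y, max_x, max_y)
    protected_kinds: frozenset
    approval_cost_threshold: float


def check(action: Action, env: Envelope) -> Decision:
    """Evaluate a proposed action against the envelope, independent of the model."""
    if action.kind in env.protected_kinds:
        return Decision.DENY  # protected action class: refused under any instruction
    if action.force_newtons > env.max_force_newtons:
        return Decision.DENY  # force limit
    min_x, min_y, max_x, max_y = env.geofence
    if not (min_x <= action.x <= max_x and min_y <= action.y <= max_y):
        return Decision.DENY  # geofence
    if action.cost >= env.approval_cost_threshold:
        return Decision.NEEDS_APPROVAL  # escalate to human review
    return Decision.ALLOW


env = Envelope(
    max_force_newtons=50.0,
    geofence=(0.0, 0.0, 100.0, 100.0),
    protected_kinds=frozenset({"disable_safety"}),
    approval_cost_threshold=1000.0,
)
```

Because `check` never consults the model, a prompt-injected or adversarially steered proposal is evaluated against the same limits as any other.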

Access and Authorization

Authorization is the discipline of granting the minimum privilege needed for the agent to do its job and no more. Physical agents need access to facilities, doors, and equipment scoped to their task. Personal and ambient agents need access to data, devices, and accounts scoped to the user's authorization. Software agents need API permissions, repository access, and integration scopes that match the user's intent for the current task. Authorization fails when scopes accumulate without review, when permissions granted for one task persist into others, when delegated authority chains do not have clean termination, and when authorization decisions are based on inputs the agent itself controls. The mechanisms include role-based access, attribute-based access, capability-scoped tokens, time-limited credentials, and explicit per-action approval for high-stakes operations. Access & Authorization covers the authorization primitives, the scope management discipline, and the patterns specific to agent contexts where the boundary between user-delegated and agent-discretionary action is harder to draw than in conventional software.
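
One way to picture capability-scoped, time-limited credentials is a token that names its scopes, its task, and its expiry, plus a delegation step that can only narrow what it passes on. The sketch below is hypothetical throughout: `CapabilityToken`, `is_authorized`, and `attenuate` are illustrative names, and a real token would also be signed so the agent cannot edit it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapabilityToken:
    subject: str
    scopes: frozenset
    task_id: str
    expires_at: float  # seconds since epoch


def is_authorized(token: CapabilityToken, scope: str, task_id: str, now: float) -> bool:
    """Allow only unexpired use, for the task the token was issued for, within scope."""
    if now >= token.expires_at:
        return False  # time-limited credential
    if task_id != token.task_id:
        return False  # permissions do not persist into other tasks
    return scope in token.scopes


def attenuate(token: CapabilityToken, scopes, expires_at: float) -> CapabilityToken:
    """Derive a delegated token: never more scopes, never a later expiry."""
    return CapabilityToken(
        subject=token.subject,
        scopes=token.scopes & frozenset(scopes),
        task_id=token.task_id,
        expires_at=min(token.expires_at, expires_at),
    )


token = CapabilityToken(
    subject="agent-7",
    scopes=frozenset({"repo:read", "repo:write"}),
    task_id="task-42",
    expires_at=1000.0,
)
# Requesting an extra scope and a later expiry yields neither.
delegated = attenuate(token, {"repo:read", "repo:admin"}, expires_at=5000.0)
```

Passing `now` in explicitly keeps the authorization decision independent of any clock the agent itself controls.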

Cybersecurity Controls

Cybersecurity controls for AI agents extend conventional information security practice into the cyber-physical and AI-specific layers that conventional practice was not designed for. Secure boot ensures the agent starts in a known-good state. Signed updates ensure that only operator-authorized software reaches the agent. Runtime attestation lets the agent prove its current state to a verifier on demand. Hardware roots of trust provide the foundation that the higher-layer controls depend on. Network segmentation and zero-trust architecture bound lateral movement when one agent or one system is compromised. The controls borrow from the IT security and embedded systems disciplines, with adaptations for the unique requirements of agents that operate continuously in untrusted environments, ship frequent updates, and depend on connectivity for core functionality. Cybersecurity Controls covers secure boot, signed updates, runtime attestation, hardware roots of trust, network segmentation, and the adaptations of conventional cybersecurity practice for agent contexts.
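
To illustrate what "a known-good state" can mean in practice, the sketch below folds each boot component into a running hash, in the style of a TPM platform configuration register extend, and compares the result with an expected value. It is a simplification under stated assumptions: the component byte strings are hypothetical, and a real system would keep the register in hardware and sign the reported value.

```python
import hashlib


def extend(register: bytes, component: bytes) -> bytes:
    """Fold one component's digest into the running measurement."""
    return hashlib.sha256(register + hashlib.sha256(component).digest()).digest()


def measure_boot(components) -> bytes:
    register = bytes(32)  # all-zero starting value
    for component in components:
        register = extend(register, component)
    return register


# Hypothetical boot chain; the verifier stores the measurement it expects.
known_good = [b"bootloader v1", b"kernel v5", b"agent runtime v2"]
expected = measure_boot(known_good)

# A modified component or a reordered chain produces a different measurement.
modified = [b"bootloader v1", b"kernel v5 (patched)", b"agent runtime v2"]
reordered = [b"kernel v5", b"bootloader v1", b"agent runtime v2"]
```

Because each step hashes over the previous register value, the final measurement depends on both the content and the order of everything that ran.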

Telemetry Integrity and Data Provenance

Telemetry is the data flowing from deployed agents back to operators, training pipelines, and decision-support systems. Telemetry integrity ensures that the data arriving at the destination is the data the sensor produced, unaltered and attributable. Data provenance tracks the lineage of data from capture through processing to use, so that the downstream consumer can verify the data's origin and what has been done to it. The mechanisms include cryptographic signing at the sensor, attestation that the sensor was in a known-good state at capture time, message authentication on transit, immutable logging of processing steps, and signed metadata that travels with the data through its lifecycle. The controls bound the data risk surface and provide the evidence base for incident reconstruction when something goes wrong. Telemetry Integrity & Data Provenance covers the signing and attestation primitives, the provenance tracking patterns, and the practical implementations across physical agent fleets and ambient sensor networks.
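
Immutable logging of processing steps is often approximated with a hash chain, where each entry commits to the one before it. The following is a minimal sketch, not a production design: the entry fields and function names are hypothetical, and it assumes the head hash is stored somewhere the log writer cannot alter.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(entry: dict) -> str:
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_step(log: list, step: str, data_sha256: str) -> str:
    """Append a processing step and return the new head hash."""
    prev = entry_hash(log[-1]) if log else GENESIS
    log.append({"step": step, "data_sha256": data_sha256, "prev": prev})
    return entry_hash(log[-1])


def verify_log(log: list, expected_head: str) -> bool:
    """Check every link, then check the head against the independently stored value."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        prev = entry_hash(entry)
    return prev == expected_head


log = []
raw = hashlib.sha256(b"raw sensor frame").hexdigest()
append_step(log, "capture", raw)
append_step(log, "anonymize", hashlib.sha256(b"blurred frame").hexdigest())
head = append_step(log, "train_ingest", hashlib.sha256(b"tensor batch").hexdigest())
```

Editing an earlier entry breaks the link that follows it, and editing the last entry changes the head, so either kind of tampering fails verification.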

Consent and Capture Controls

Personal and ambient agents capture audio, video, biometric signals, and behavioral data as part of normal operation. Consent and capture controls determine when the agent records, what it records, where the data goes, and how long it persists. The controls are partly technical and partly procedural. On the technical side: physical indicators that show when recording is active, on-device processing that keeps raw capture local, selective recording that captures only the events relevant to the agent's task, and retention limits that delete data after its operational purpose is served. On the procedural side: clear disclosure of what the agent records, user authority to review and delete captured data, bystander notification practices in shared environments, and opt-out mechanisms where opt-in cannot be obtained. The controls bound the risk that the agent's captures are harvested as surveillance material, and they provide the foundation for meeting the personal data law obligations the agent operates under. Consent & Capture Controls covers the technical and procedural controls, the patterns across smart home, automotive cabin, wearable, and public infrastructure deployments, and the gaps where current practice falls short of what the controls could provide.
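
Selective recording and retention limits can be sketched as two small checks: one at capture time and one in a periodic purge. The retention periods, the `Capture` fields, and the default of not retaining unknown data kinds are assumptions made for illustration, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    capture_id: str
    kind: str           # e.g. "audio", "video"
    captured_at: float  # seconds since epoch
    task_relevant: bool


# Hypothetical retention periods per data kind, in seconds.
RETENTION_SECONDS = {"audio": 24 * 3600, "video": 7 * 24 * 3600}
DEFAULT_RETENTION_SECONDS = 0  # assumption: kinds without a policy are not retained


def should_record(capture: Capture) -> bool:
    """Selective recording: keep only events relevant to the agent's task."""
    return capture.task_relevant


def purge_expired(store: list, now: float) -> list:
    """Return the captures still inside their retention window."""
    kept = []
    for capture in store:
        limit = RETENTION_SECONDS.get(capture.kind, DEFAULT_RETENTION_SECONDS)
        if now - capture.captured_at < limit:
            kept.append(capture)
    return kept


store = [
    Capture("a1", "audio", 0.0, True),
    Capture("v1", "video", 0.0, True),
    Capture("b1", "biometric", 0.0, True),
]
two_days = 2 * 24 * 3600.0
remaining = purge_expired(store, now=two_days)
```

After two days, only the video capture remains: the audio has outlived its window, and the biometric capture has no retention policy at all.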

Runtime Monitoring and Anomaly Detection

Prevention is incomplete. Some attacks will succeed, some failures will occur, and some misbehavior will emerge that the design did not anticipate. Runtime monitoring is the practice of watching the agent's behavior in production and detecting deviation from expected patterns. The monitoring covers the agent's actions, its decisions, its resource consumption, its sensor inputs, and its communication patterns. Anomaly detection compares observed behavior to baselines built from training and prior production, flagging deviations for review. The mechanisms include behavioral baselines, statistical anomaly detection, machine-learning-based behavioral analysis, and explicit invariants that should hold during normal operation. The control gives operators a chance to catch problems before they escalate and provides the data feed that other controls depend on. Runtime Monitoring & Anomaly Detection covers the monitoring architectures, the detection techniques, the alerting and triage patterns, and the integration with intervention authority for responding to detected anomalies.
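
Two of the mechanisms named above, statistical anomaly detection against a baseline and explicit invariants, can be sketched in a few lines. The z-score test, the threshold of 3.0, the baseline values, and the invariant names are illustrative assumptions; production detectors are usually more elaborate.

```python
import statistics


def zscore(value: float, baseline: list) -> float:
    """Distance of value from the baseline mean, in sample standard deviations."""
    mean = statistics.fmean(baseline)
    stdev = statistics.stdev(baseline)  # requires at least two baseline points
    if stdev == 0:
        return 0.0 if value == mean else float("inf")
    return (value - mean) / stdev


def is_anomalous(value: float, baseline: list, threshold: float = 3.0) -> bool:
    return abs(zscore(value, baseline)) > threshold


def violated_invariants(state: dict, invariants: dict) -> list:
    """Return the names of invariants that do not hold for the observed state."""
    return [name for name, holds in invariants.items() if not holds(state)]


# Hypothetical baseline: API calls per minute seen in prior production.
baseline_calls_per_minute = [10, 12, 11, 9, 10, 11, 12, 10]

# Hypothetical invariants that should hold during normal operation.
invariants = {
    "speed_within_limit": lambda s: s["speed_mps"] <= 15.0,
    "inside_geofence": lambda s: s["inside_geofence"],
}
state = {"speed_mps": 18.0, "inside_geofence": True}
```

The two checks complement each other: the baseline flags behavior that is merely unusual, while an invariant flags behavior that should never occur regardless of how common it becomes.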

Human Oversight and Intervention Authority

Human oversight is the design choice that keeps a human in a position to intervene when the agent encounters a situation it should not handle alone. The choice ranges from human-in-the-loop, where every consequential agent action requires human approval, to human-on-the-loop, where the human supervises and can intervene but does not approve each action, to human-out-of-the-loop, where intervention is only possible after the fact. Different deployments call for different points on this spectrum, depending on the stakes of the agent's actions, the latency tolerance, and the cost of human supervision. Intervention authority is the technical capability that makes oversight meaningful: an emergency stop that physically halts a humanoid, a remote command that pulls a robotaxi over, a kill switch that suspends a software agent's autonomous action, a rollback that reverts an agent's changes. Human Oversight & Intervention Authority covers the oversight modes, the intervention mechanisms, the design patterns for keeping intervention effective at scale, and the failure modes where oversight is nominal but not real.
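
The difference between the oversight modes, and the role of intervention authority, can be made concrete with a small wrapper around a software agent's actions. This is a hedged sketch: `SupervisedAgent`, its methods, and the string-valued actions are hypothetical, and the rollback here only reports what would be undone rather than reverting real side effects.

```python
import threading
from enum import Enum


class OversightMode(Enum):
    IN_THE_LOOP = "human approves each consequential action"
    ON_THE_LOOP = "human supervises and can intervene"
    OUT_OF_THE_LOOP = "intervention only after the fact"


class SupervisedAgent:
    def __init__(self, mode: OversightMode, approver=None):
        self.mode = mode
        self.approver = approver           # callable(action) -> bool, supplied by a human UI
        self._stopped = threading.Event()  # kill switch, safe to set from another thread
        self.executed = []

    def emergency_stop(self) -> None:
        self._stopped.set()

    def act(self, action: str) -> bool:
        if self._stopped.is_set():
            return False  # the kill switch overrides every mode
        if self.mode is OversightMode.IN_THE_LOOP:
            # Fail closed: with no approver configured, nothing is approved.
            if self.approver is None or not self.approver(action):
                return False
        self.executed.append(action)
        return True

    def rollback(self) -> list:
        """Return executed actions in reverse order so a caller can revert them."""
        undone = list(reversed(self.executed))
        self.executed.clear()
        return undone


gated = SupervisedAgent(OversightMode.IN_THE_LOOP, approver=lambda a: a != "delete_database")
supervised = SupervisedAgent(OversightMode.ON_THE_LOOP)
```

Checking the stop flag before the mode check reflects the point that intervention authority has to work even where per-action approval is not required.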

Fleet-Level Coordination Controls

An individual agent has individual controls. A fleet of agents has emergent properties that no individual-agent control addresses. The fleet management plane that orchestrates a robotaxi service, a delivery robot deployment, a humanoid workforce, or a multi-agent software system is itself a system that needs controls. Fleet-level coordination controls include authority partitioning so that no single compromise can command the entire fleet, blast-radius limits so that misbehavior in one agent does not cascade, staged rollout discipline so that updates and policy changes are validated on a subset before reaching the whole, and fleet-wide intervention authority that lets an operator suspend the entire fleet's autonomous operation if a coordinated event is detected. The controls bound the fleet-scale attack surface and provide the architectural foundation that lets operators run large agent deployments without compounding individual-agent risk into fleet-scale catastrophe. Fleet-Level Coordination Controls covers the partitioning patterns, the rollout discipline, the fleet-wide intervention mechanisms, and the coordination control architectures across the agent categories.
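
Staged rollout discipline can be sketched as splitting the fleet into cumulative stages and halting as soon as a stage fails its health check. The stage fractions, the agent identifiers, and the `apply_update` and `is_healthy` callbacks are hypothetical; a real rollout would also wait for a soak period before judging health.

```python
def plan_stages(agent_ids, fractions=(0.05, 0.25, 1.0)):
    """Split the fleet into stages; each fraction is cumulative fleet coverage."""
    ordered = sorted(agent_ids)
    stages, start = [], 0
    for fraction in fractions:
        # Each stage covers at least one new agent and never runs past the fleet.
        end = min(len(ordered), max(start + 1, round(len(ordered) * fraction)))
        if end > start:
            stages.append(ordered[start:end])
        start = end
    return stages


def rollout(stages, apply_update, is_healthy):
    """Update stage by stage; stop before the next stage if any agent is unhealthy."""
    updated = []
    for stage in stages:
        for agent in stage:
            apply_update(agent)
            updated.append(agent)
        if not all(is_healthy(agent) for agent in stage):
            return updated, False  # blast radius limited to the stages touched so far
    return updated, True


fleet = [f"agent-{i:02d}" for i in range(20)]
stages = plan_stages(fleet)

# Simulate an update that breaks one agent in the second stage.
applied = []
updated, ok = rollout(stages, applied.append, lambda agent: agent != "agent-03")
```

In this simulated run the rollout stops after the second stage, so the faulty update reaches 5 of the 20 agents instead of the whole fleet.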

OT/ICS Integration Controls

When AI agents are deployed inside operational technology environments — manufacturing, energy, water, transportation, building management — they connect to industrial control systems that have their own established security discipline. The integration is where governance and controls run into the constraints of the OT environment: deterministic behavior requirements, long lifecycles, limited update windows, network segmentation between IT and OT, and safety-critical operation where misbehavior has physical consequences. Controls for AI in OT environments extend the conventional AI controls into the OT context: behavioral constraints that respect the deterministic requirements of the control loop, telemetry integrity that uses OT-appropriate protocols, network architecture that respects IT-OT segmentation, and intervention authority that aligns with the existing safety instrumented systems. The work also runs in the other direction: OT security controls have to be extended to handle AI components that conventional OT security did not contemplate. OT/ICS Integration Controls covers the integration architectures, the protocol adaptations, the governance arrangements between AI and OT teams, and the emerging practices for safe AI deployment in OT environments.
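
One simple pattern for a behavioral constraint that respects the control loop is to let the AI component propose setpoints while a deterministic limiter bounds what reaches the controller. The sketch below assumes a single scalar setpoint with an absolute range and a per-cycle rate limit; the function name and the numbers are hypothetical, and the limiter complements an existing safety instrumented system rather than replacing it.

```python
def bounded_setpoint(proposed: float, current: float,
                     low: float, high: float, max_step: float) -> float:
    """Clamp an AI-proposed setpoint to a per-cycle rate limit and an absolute range."""
    # Limit how far the setpoint may move in one control cycle.
    step = max(-max_step, min(max_step, proposed - current))
    # Then keep the result inside the engineered operating range.
    return max(low, min(high, current + step))


# The agent asks to jump from 50.0 to 90.0; only a 5.0 step is passed through.
next_setpoint = bounded_setpoint(proposed=90.0, current=50.0,
                                 low=0.0, high=80.0, max_step=5.0)
```

The limiter has no learned components and runs in constant time, so its behavior stays predictable whatever the AI component proposes.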
