The question most enterprises ask before deploying AI agents is "what can this do for us?" The question they should also be asking, and that regulators are increasingly asking on their behalf, is "what can go wrong, who is accountable, and can we prove the agent followed the rules?"
This checklist exists because those questions deserve systematic answers before deployment, not emergency answers after an incident. It covers 67 audit items across eight governance domains, with regulation-specific flags indicating which requirements apply to which jurisdictions and industries.
It is written to be useful regardless of which agent platform you are evaluating. Where specific capabilities are relevant, particularly around explainability and audit trail, we note what a well-architected agent platform should provide. The checklist itself applies to any enterprise AI agent deployment.
How to use this checklist
Work through each section with your compliance team, legal counsel, and the technical team responsible for the deployment. Items marked CRITICAL must be resolved before go-live. Items marked HIGH should be resolved within 30 days of deployment. Items marked IMPORTANT represent best practice and should be planned for the first 90 days.
The regulatory landscape in 2026
The regulatory environment for enterprise AI has shifted substantially in the past eighteen months. The EU AI Act came into full effect. The FCA published its updated guidance on AI in financial services. ISO 42001, the AI management system standard, has become a procurement requirement at a growing number of enterprise buyers. What was previously a voluntary governance exercise is increasingly a legal and commercial requirement.
The critical thing to understand is that AI agent governance is more complex than standard data governance, because agents don't just store data; they make decisions based on it. The question regulators are now asking is not just "did you protect the data?" but "can you explain the decision the AI made using that data?"
GDPR / EU AI Act
EU & UK · All sectors
Right to explanation for automated decisions
Data minimisation in AI training and inference
High-risk AI system requirements (Art. 6)
Human oversight requirements for consequential decisions
PII processing records and DPIAs
FCA / SEC
Financial services · UK & US
Explainability of AI-driven financial decisions
Audit trail for AI-assisted recommendations
Model risk management (SR 11-7)
Fair lending and anti-discrimination requirements
Consumer duty / fiduciary standards for AI outputs
HIPAA / ISO 42001
Healthcare · International
PHI protection in AI inference pipelines
Access controls for AI systems handling health data
AI management system documentation
Risk assessment and incident response for AI
Third-party AI supplier assessments
"Our regulator asked us to demonstrate exactly what context our AI system used to make a credit decision. We couldn't answer the question. That conversation cost us six months."
Chief Compliance Officer · UK financial services firm · 2025
Domain 1: Data governance and PII handling
This is the first and most foundational domain. AI agents access, process, and sometimes store personal data at scale. The data governance requirements for agent deployments go beyond standard data protection, because agents don't just process data; they build associations from it that persist in memory.
Data governance & PII handling
10 items · Covers data minimisation, PII masking, retention, and processing records
Data processing inventory completed for all agent interactions
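Document every category of personal data the agent accesses, processes, or retains. Include data flows between the agent, memory store, connected systems, and LLM provider. GDPR Article 30 requires these records of organisations with 250 or more employees; smaller organisations are also covered where processing is not occasional, is likely to pose a risk to individuals, or involves special-category data.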
GDPR Art. 30 All sectors
Data Protection Impact Assessment (DPIA) completed
AI agents that process personal data at scale and make automated decisions likely qualify as high-risk processing under GDPR Article 35. A DPIA is mandatory before deployment. Document the assessment, the risks identified, and the mitigations applied.
GDPR Art. 35
PII masking active on all agent inference pipelines
Verify that personally identifiable information (names, emails, account numbers, NI/SSN, health data) is masked or redacted before being sent to external LLM providers. Data sent to a third-party LLM API is data leaving your control. Confirm masking is applied at the pipeline level, not reliant on the LLM declining to store it.
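As an illustration of pipeline-level masking, the sketch below redacts a few identifier types before a prompt would be handed to an external provider. The pattern set, the placeholder labels, and the `mask_pii` name are assumptions made for this example; a real deployment would need a vetted PII detection service rather than three regular expressions.

```python
import re

# Illustrative patterns only (assumed for this sketch, not exhaustive).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "NINO": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),
}


def mask_pii(text: str) -> str:
    """Replace each match with a typed placeholder before the text leaves the pipeline."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


prompt = "Customer jane.doe@example.com (SSN 123-45-6789) disputes invoice 4471."
masked = mask_pii(prompt)
print(masked)
```

The point of the design is that masking runs in your own code path, so it holds regardless of how the LLM provider treats the data it receives.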
GDPR HIPAA
Agent memory store data residency confirmed
Verify where the agent's persistent memory is physically stored. For EU-based organisations or those handling EU personal data, confirm the memory store does not transfer data to non-adequate third countries without appropriate safeguards. Self-hosted deployments resolve this; cloud-hosted deployments require confirmation of the data centre location.
GDPR Chapter V
Retention policy defined and enforced for agent memory
Agent memory stores that remember users indefinitely create data retention obligations. Define how long user preferences, interaction history, and behavioral data are retained. Confirm the memory store can execute deletion requests and that the deletion cascades through all connected systems.
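A minimal sketch of what "enforced" and "cascades" could mean in practice: find memory records past the retention period, delete the subject from every connected store, and report any store that still holds them. The 365-day period, the store names, and both function names are hypothetical, and the in-memory dictionaries stand in for real systems.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=365)  # hypothetical policy value


def expired_ids(records, now):
    """Return ids of memory records last seen longer ago than the retention period."""
    return [r["id"] for r in records if now - r["last_seen"] > RETENTION]


def erase(subject_id, stores):
    """Delete the subject from every store; return names of stores that still hold them."""
    for store in stores.values():
        store.pop(subject_id, None)
    return [name for name, store in stores.items() if subject_id in store]


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
records = [
    {"id": "u1", "last_seen": datetime(2025, 1, 15, tzinfo=timezone.utc)},
    {"id": "u2", "last_seen": datetime(2026, 3, 2, tzinfo=timezone.utc)},
]
stores = {
    "memory": {"u1": {"prefers": "email"}, "u2": {"prefers": "phone"}},
    "interaction_log": {"u1": ["2025-01-15 billing query"]},
    "crm": {"u2": {"tier": "gold"}},
}

stale = expired_ids(records, now)
leftovers = [name for sid in stale for name in erase(sid, stores)]
print(stale, leftovers)
```

An empty `leftovers` list is the evidence you would want to retain: it shows the deletion reached every store rather than only the memory store.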
GDPR Art. 17 All sectors
Right to erasure (right to be forgotten) process tested
Verify that when a data subject requests erasure, all their data is deleted from: the agent memory store, the interaction log, the connected CRM/ERP, and any audit trail that contains their personal data. Test this end-to-end before deployment, not in response to a request.
GDPR Art. 17
Data minimisation verified: agents only access what they need
The agent should access the minimum data necessary to complete each task. Review the Integration Hub or API connections and confirm there are no unnecessary data pulls. An agent answering a billing query should not have access to an employee's HR record.
GDPR Art. 5(1)(c) HIPAA minimum necessary
Third-party LLM data processing agreement in place
If the agent sends any personal data to a third-party LLM provider (OpenAI, Anthropic, Google), a Data Processing Agreement (DPA) must be in place and the provider's data retention policies confirmed. Verify the provider does not train on your data by default; many require an explicit opt-out.
GDPR Art. 28
Consent mechanism defined for personalised agent memory
If the agent stores personal preferences and history for individuals (employees or customers), verify the legal basis for processing. For consumer-facing agents, consent or legitimate interest must be documented. For employee-facing agents, review employment contract and works council requirements.
GDPR Art. 6
Subject Access Request (SAR) process defined for agent data
Individuals have the right to see all personal data held about them, including what the agent remembers about them. Define how SARs will be fulfilled, what the agent memory export looks like, and who is responsible for reviewing it before release.
GDPR Art. 15
Domain 2: Decision explainability and audit trail
The most legally consequential domain for regulated industries. If an AI agent influences a financial decision, a hiring decision, a credit decision, or a clinical recommendation, you must be able to explain what the agent knew and how it reasoned. This is not a technical nice-to-have. It is a legal requirement in a growing number of jurisdictions.
Decision explainability & audit trail
9 items · Covers reasoning transparency, audit logging, and human oversight
Every agent decision logged with full context
The audit log must capture what the agent was asked, what context it retrieved, what memory it activated, what rule it applied, and what output it produced. A log that only captures inputs and outputs is insufficient; the reasoning chain must be traceable.
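One way to make "full context" checkable is to give each decision a fixed record shape and reject records with gaps. The sketch below is an assumption about how such a record might look; the `DecisionRecord` fields simply mirror the five elements listed above and are not taken from any particular platform.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical audit record covering the five elements named in this item."""
    request: str
    retrieved_context: list
    activated_memory: list
    rule_applied: str
    output: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def missing_fields(record: DecisionRecord) -> list:
    """Names of fields left empty, i.e. gaps in the reasoning chain."""
    return [k for k, v in asdict(record).items() if v in ("", [], None)]


record = DecisionRecord(
    request="Can I get a refund for order 4471?",
    retrieved_context=["refund-policy-v3", "order-4471"],
    activated_memory=[],  # nothing captured: this should be flagged
    rule_applied="refund_within_30_days",
    output="Refund approved.",
)
gaps = missing_fields(record)
line = json.dumps(asdict(record), sort_keys=True)  # one JSON line per decision
print(gaps)
```

A record that logs only `request` and `output` would fail this check, which is exactly the inputs-and-outputs-only case the item warns against.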
FCA EU AI Act SOX
Explanation mechanism available for automated decisions
For any decision that materially affects an individual (credit decisions, HR recommendations, customer service resolutions), an explanation must be available in plain language. Verify the agent platform can produce this. Vector-based systems typically cannot explain why specific embeddings were retrieved. Graph-based systems can trace the activation path.
GDPR Art. 22 FCA Consumer Duty
Human-in-the-loop defined for high-risk decisions
Map your agent's decision types against the EU AI Act high-risk categories. For decisions in high-risk categories (employment, credit, education, essential services), define the human oversight mechanism: who reviews, at what threshold, and how the review is recorded.
EU AI Act Art. 14
Audit log tamper-proof and retention-compliant
Audit logs must be write-once and tamper-evident. Define the retention period (FCA typically requires 5–7 years for financial decisions; SOX requires 7 years for financial records). Confirm the logs are stored separately from the agent runtime and cannot be modified by application administrators.
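Tamper evidence is commonly achieved by chaining entries with a cryptographic hash, so that altering any earlier entry invalidates everything after it. The sketch below shows that idea with SHA-256; the entry layout and function names are assumptions for illustration, and a hash chain complements rather than replaces write-once storage and separation from the agent runtime.

```python
import hashlib
import json

GENESIS = "0" * 64  # fixed starting value for the first entry's predecessor


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()


def append(log: list, payload: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(log: list) -> bool:
    """Recompute every hash; any edited payload or broken link fails the check."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"decision": "refund approved", "amount": 40})
append(log, {"decision": "escalated", "amount": 900})
intact = verify(log)

log[0]["payload"]["amount"] = 4000  # simulate an administrator editing history
tamper_detected = not verify(log)
print(intact, tamper_detected)
```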
FCA SYSC SOX
Model documentation completed (model card or equivalent)
Document the LLM and memory architecture being used: intended use cases, known limitations, training data characteristics, evaluation results, and guardrails applied. ISO 42001 and the EU AI Act both require system documentation for deployed AI. This document is also what you provide to regulators on request.
ISO 42001 EU AI Act
Regulator access procedure defined
Define the process for responding to a regulatory information request about an AI decision. Who is the named contact? What can be produced in 24 hours versus 5 days? Which team is responsible for extracting audit trail data? Test this process before deployment; do not design it in response to an inquiry.
FCA EU AI Act
Contradiction and inconsistency detection in place
Verify the agent handles conflicting information consistently. If the agent gives different answers to the same question in different contexts, document why. Inconsistent agent outputs are a significant regulatory risk. They suggest the system is unreliable in a way that may be difficult to defend.
All regulated sectors
Agent output review process for material decisions
Define the sampling and review process for agent outputs. For high-volume deployments, a statistical sample reviewed monthly is a minimum. For high-risk decisions (credit, employment), consider mandatory human review before the decision takes effect.
All sectors
Behavioural correction audit trail maintained
When agent behaviour is corrected, either by system administrators or through automated learning, document what changed, who authorised it, when it took effect, and what the previous behaviour was. This is especially important for platforms with Hebbian learning or automatic behavioural adaptation.
ISO 42001 FCA
Domain 3: Access control and authentication
Access control & authentication
8 items · Agent permissions, scope limits, and credential management
Principle of least privilege applied to all agent system connections
Each agent should have access only to the systems and data required for its specific function. A customer service agent should not have write access to financial records. A procurement agent should not have access to employee HR data. Map every system connection and verify the minimum necessary permission level is in use.
All sectors ISO 42001
Scope controls enforced: agent cannot act outside defined boundaries
Verify that hard scope limits are enforced at the infrastructure level, not just the prompt level. Prompt-level restrictions can be bypassed by prompt injection. The agent should be technically incapable of accessing systems outside its defined scope, not just instructed not to.
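The difference between prompt-level and infrastructure-level limits can be shown with a small gateway that every tool call must pass through. The sketch below is a simplified assumption: the `AGENT_SCOPES` table, the agent and system names, and `call_tool` are all hypothetical. What matters is that the check runs in code the model cannot rewrite.

```python
class ScopeViolation(Exception):
    """Raised when an agent attempts an action outside its allowlist."""


# Hypothetical allowlist: (system, action) pairs each agent may use.
AGENT_SCOPES = {
    "billing-agent": {("crm", "read"), ("billing", "read"), ("billing", "write")},
}


def call_tool(agent_id: str, system: str, action: str, handler):
    """Run handler only if the pair is allowlisted; the prompt plays no part in the check."""
    if (system, action) not in AGENT_SCOPES.get(agent_id, set()):
        raise ScopeViolation(f"{agent_id} may not {action} {system}")
    return handler()


ok = call_tool("billing-agent", "billing", "read", lambda: "invoice 4471: paid")

try:
    call_tool("billing-agent", "hr", "read", lambda: "salary record")
    blocked = False
except ScopeViolation:
    blocked = True
print(ok, blocked)
```

Even if an injected instruction convinces the model to request HR data, the request fails at the gateway because the pair is not in the allowlist.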
All sectors
Multi-factor authentication on agent administration console
The governance console, where agent behaviour, scope, and rules are configured, must require MFA for all administrator access. A single compromised credential should not be sufficient to modify agent behaviour or access the audit log.
ISO 27001 All sectors
Agent credential rotation policy defined
API keys and service account credentials used by the agent to access enterprise systems must be rotated on a defined schedule (90 days maximum is standard). Confirm the rotation process does not require agent downtime and that stale credentials are revoked promptly.
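A rotation policy is easier to audit when overdue credentials can be listed mechanically. The following sketch assumes you can export each credential's issue date; the credential names are invented, and the 90-day limit is the figure quoted above.

```python
from datetime import date, timedelta

MAX_AGE = timedelta(days=90)  # the maximum quoted in this item


def overdue(credentials: dict, today: date) -> list:
    """Names of credentials issued more than MAX_AGE ago, sorted for stable reporting."""
    return sorted(name for name, issued in credentials.items() if today - issued > MAX_AGE)


credentials = {
    "crm-api-key": date(2026, 1, 1),
    "erp-service-account": date(2026, 4, 1),
}
late = overdue(credentials, today=date(2026, 5, 1))
print(late)
```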
ISO 27001
Role-based access to agent memory and configuration
Define who can read agent memory, who can modify agent behaviour, and who can access the audit log. These should be separate roles with separate credentials. The team that trains the agent should not be the same team that audits its outputs.
SOX segregation of duties
Prompt injection protection tested and documented
Adversarial users can attempt to override agent instructions by embedding instructions in their input ("ignore your previous instructions and..."). Test the agent against standard prompt injection attacks and document the mitigations in place. This is particularly important for customer-facing deployments.
All sectors
Approval workflow enforced for high-value agent actions
Define the approval threshold above which the agent must seek human confirmation before acting. Purchasing authorisations, contract commitments, and data deletions should not be autonomous above a risk-based threshold. Confirm the threshold is enforced at the infrastructure level.
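As a sketch of threshold enforcement outside the prompt, the function below decides whether an action may run or must wait for a named approver. The threshold value, the `Action` fields, and the return labels are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

APPROVAL_THRESHOLD_GBP = 5_000  # hypothetical risk-based threshold


@dataclass
class Action:
    kind: str
    value_gbp: float
    approved_by: Optional[str] = None  # set only after a human confirms


def authorise(action: Action) -> str:
    """Allow low-value actions; hold anything at or above the threshold until approved."""
    if action.value_gbp < APPROVAL_THRESHOLD_GBP:
        return "execute"
    return "execute" if action.approved_by else "hold_for_approval"


small = authorise(Action("purchase", 120))
large = authorise(Action("purchase", 12_000))
large_approved = authorise(Action("purchase", 12_000, approved_by="finance.lead"))
print(small, large, large_approved)
```

Recording `approved_by` alongside the action also gives the audit trail the reviewer's identity for each held decision.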
SOX FCA
Agent impersonation prevention in place
Verify that the agent cannot be manipulated into impersonating a human, a senior employee, or a regulatory authority. This is both a reputational risk and, in some jurisdictions, a legal one. Confirm the agent identifies itself as an AI when directly asked.
EU AI Act Art. 50
Domain 4: Bias, fairness and discrimination risk
Bias, fairness & discrimination risk
8 items · Protected characteristics, fairness testing, and disparity monitoring
Protected characteristic bias testing completed
Test agent outputs for differential treatment on the basis of protected characteristics: age, gender, ethnicity, disability, religion, sexual orientation, pregnancy. For decision-making agents (HR, credit, customer service), document the testing methodology and results. This is a legal requirement for AI systems making consequential decisions in most jurisdictions.
FCA Consumer Duty EU AI Act
Training data bias assessment documented
If you have fine-tuned the model or provided domain-specific training data, assess whether the training data is representative and whether it introduces systematic bias. Document the assessment and any mitigations applied. Historical data often reflects historical discrimination. This does not disappear from the model.
EU AI Act
Disparity monitoring dashboard in place post-deployment
Define the metrics that will be monitored for differential outcomes across demographic groups. For a customer service agent, track resolution rates, escalation rates, and sentiment scores, broken down by demographic segment where available. For a credit or HR agent, track approval and rejection rates by relevant segment.
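A disparity metric can be as simple as each group's approval rate compared with the best-performing group. The sketch below computes that ratio and flags groups falling below a cut-off. The 0.8 default echoes the four-fifths rule of thumb from US employment-selection guidance and is used here only as a placeholder; the group labels and data are invented.

```python
from collections import defaultdict


def approval_rates(decisions):
    """decisions: iterable of (group, approved) pairs -> approval rate per group."""
    totals, approved = defaultdict(int), defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / totals[g] for g in totals}


def disparity_flags(rates, min_ratio=0.8):
    """Groups whose rate is below min_ratio of the highest group's rate."""
    best = max(rates.values())
    return sorted(g for g, r in rates.items() if best > 0 and r / best < min_ratio)


decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)
rates = approval_rates(decisions)
flags = disparity_flags(rates)
print(rates, flags)
```

A flag is a prompt for investigation rather than proof of discrimination; small samples and legitimate explanatory factors both need to be ruled out.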
FCA All regulated sectors
Escalation path for suspected discriminatory output
Define the process for identifying, escalating, and responding to agent outputs that may constitute discrimination. Who has authority to suspend the agent? What is the review process? What is the notification obligation to affected individuals and, where required, to regulators?
All sectors
Institutional bias in agent memory assessed
For agents with persistent memory (including HippoFabric-backed agents), assess whether the learned preferences and associations could encode or amplify institutional bias. For example: if the agent has learned that certain customer segments receive different treatment, does the memory reinforce this? The ability to inspect and correct the memory graph is important here.
All sectors
Fair lending / equal treatment assessment for financial agents
For agents involved in any credit, insurance, or financial services decisions, conduct a fair lending analysis. Document the testing methodology, the model variables used, and whether any proxy variables for protected characteristics are present. This is an SR 11-7 requirement in the US and a Consumer Duty requirement in the UK.
FCA / SR 11-7
Feedback mechanism for users to report unfair treatment
Provide a clear and accessible way for users who believe they have been treated unfairly by the agent to report it. Define how these reports are reviewed, what the investigation process is, and what remediation looks like. Document that this mechanism exists in your AI system documentation.
EU AI Act Art. 14
Regular fairness re-assessment scheduled
Bias assessments are not one-time events. As agent behaviour evolves through learning and as the user population changes, bias characteristics can shift. Schedule a minimum annual re-assessment, and more frequent assessments in the first 6 months post-deployment.
ISO 42001
Domain 5: Security and infrastructure
Security & infrastructure
8 items · Penetration testing, data encryption, incident response
Penetration test completed on agent and memory infrastructure
Commission a penetration test specifically targeting the AI agent infrastructure: the agent runtime, the memory store, the governance console, and the Integration Hub. Standard infrastructure pen tests often miss AI-specific attack vectors including prompt injection, model extraction, and memory poisoning.
ISO 27001 All sectors
Encryption at rest and in transit confirmed for all agent data
Confirm AES-256 encryption at rest for the agent memory store, audit log, and configuration. Confirm TLS 1.3 in transit for all agent API calls and system integrations. This is the baseline; verify it is actually applied, not just assumed.
HIPAA ISO 27001
Memory poisoning protection tested
For agents with learning capabilities (Hebbian learning, behavioral adaptation), test whether adversarial inputs can corrupt the memory graph. A user who knows the agent learns from corrections could attempt to introduce false associations. Document the protections in place and the testing methodology.
All sectors with learning agents
Incident response plan updated to include AI-specific scenarios
Standard incident response plans do not cover AI-specific incidents: a compromised agent that makes thousands of unauthorised decisions before detection, memory corruption affecting all users, or a discovered bias that requires retroactive decision review. Update your IR plan to include these scenarios.
ISO 42001 All sectors
Agent kill switch (emergency suspension) tested
Verify that the agent can be suspended immediately by a named individual without requiring the engineering team. The suspension should take effect within 60 seconds. Test this in staging before production deployment. The kill switch should be accessible 24/7 and not require system administrator privileges.
EU AI Act Art. 14
SOC 2 Type II report obtained from agent platform vendor
For SaaS agent platforms, obtain and review the vendor's SOC 2 Type II report. Verify it covers the specific services you are using and that the audit period is current. For self-hosted deployments, confirm your infrastructure is covered by your own SOC 2 or equivalent audit.
All sectors
Network segmentation confirmed between agent and sensitive systems
The agent runtime should not have unrestricted network access to production databases and sensitive systems. Define and enforce network segmentation at the infrastructure level. The Integration Hub connections should be the only defined pathways; all other network access should be blocked.
ISO 27001
Backup and recovery tested for agent memory and configuration
The agent memory store represents months or years of institutional knowledge. Verify it is backed up with a tested recovery process. Document the RPO (recovery point objective) and RTO (recovery time objective) for the memory store. A catastrophic memory loss is a significant operational risk.
All sectors
Domain 6: Model governance and change management
Model governance & change management
8 items · Version control, model updates, behavioral change tracking
Model version control and deployment approval process defined
Any change to the underlying LLM, memory architecture, or agent configuration should go through a formal approval process. Document who can approve model changes, what testing is required before deployment, and how rollback is executed if a change causes unexpected behaviour.
ISO 42001 FCA model risk
LLM provider model update notifications monitored
LLM providers update their models regularly, sometimes without prominent notice. A model update can change agent behaviour in ways that violate your governance policies. Subscribe to your LLM provider's model deprecation and update notifications and define the process for reviewing and testing updates before they affect production.
All LLM-backed systems
Behavioural drift monitoring in place
For agents with learning capabilities, behavioural drift (gradual, unintended change in how the agent responds) is a real risk. Define the baseline behaviour metrics and establish automated alerts when outputs deviate beyond a defined threshold. Monthly human review of a random sample is a minimum.
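A baseline-and-threshold alert can be sketched in a few lines: compare each current metric with its baseline and report those that moved further than an agreed tolerance. The metric names, values, and the 0.05 tolerance below are assumptions for illustration; real monitoring would also account for sample size and seasonality.

```python
def drift_alerts(baseline: dict, current: dict, tolerance: float = 0.05) -> dict:
    """Return metrics whose absolute change from baseline exceeds the tolerance."""
    return {
        metric: round(current[metric] - baseline[metric], 4)
        for metric in baseline
        if abs(current[metric] - baseline[metric]) > tolerance
    }


baseline = {"escalation_rate": 0.12, "refusal_rate": 0.03}
current = {"escalation_rate": 0.21, "refusal_rate": 0.04}
alerts = drift_alerts(baseline, current)
print(alerts)
```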
All learning systems
Business rules encoded in governance layer, not just prompts
Critical business rules (approval thresholds, prohibited topics, mandatory disclosures) should be enforced at the infrastructure governance layer, not in the system prompt. Prompt-level rules can be overridden by model updates or adversarial inputs. Infrastructure-level rules cannot.
All sectors
Model performance degradation alerts configured
Define the metrics that indicate model performance degradation: accuracy drop, response quality decline, user escalation rate increase. Configure automated alerts when these metrics breach threshold. Identify the team responsible for investigating and remediating alerts.
ISO 42001
Deprecation plan for current model version
Plan for the eventuality that your current LLM version is deprecated by the provider. What is the migration path to the replacement model? How will you test that the replacement model maintains compliance with your governance requirements? Build this into your technology roadmap now, not reactively.
All LLM-backed systems
Knowledge base update process defined and tested
When the organisational knowledge base changes (new policies, new products, regulatory changes), define the process for updating the agent's knowledge. For vector store systems this requires re-embedding; for graph-based systems it requires brain seeding updates. Both need testing before changes take effect in production.
All sectors
Third-party AI component vendor assessments completed
For every third-party component in your agent stack (LLM API, embedding service, memory platform, integration tools), complete a vendor security and compliance assessment. Understand their subprocessor relationships, their data retention policies, and their own incident notification obligations to you.
GDPR Art. 28 ISO 42001
Domain 7: Human oversight and escalation
Human oversight & escalation
8 items · Escalation triggers, oversight roles, and accountability chains
Named AI Accountable Executive (AI AE) designated
Identify the senior executive accountable for AI governance. The EU AI Act and the UK FCA both expect a named senior individual to be accountable for AI risks in regulated firms. This person doesn't need to be technical. They need to be accountable, informed, and empowered to act.
EU AI Act FCA
Escalation triggers defined and tested
Document the conditions under which the agent escalates to a human. These should include: explicit user request, detected emotional distress, out-of-scope queries, regulatory-sensitive topics, and any decision above the defined approval threshold. Test escalation paths in staging before production.
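Writing the triggers as an ordered rule list makes them testable in staging. The sketch below assumes upstream components have already set flags such as `distress_detected` on each conversation turn; those flag names, the threshold value, and `escalation_reason` are hypothetical.

```python
from typing import Optional

APPROVAL_THRESHOLD = 1_000  # hypothetical value, in your own currency unit


def escalation_reason(turn: dict) -> Optional[str]:
    """Return the first trigger that applies, or None if the agent may proceed."""
    checks = [
        ("user_requested_human", turn.get("user_requested_human", False)),
        ("emotional_distress", turn.get("distress_detected", False)),
        ("out_of_scope", not turn.get("in_scope", True)),
        ("regulatory_sensitive", turn.get("regulated_topic", False)),
        ("above_approval_threshold", turn.get("value", 0) > APPROVAL_THRESHOLD),
    ]
    for reason, triggered in checks:
        if triggered:
            return reason
    return None


routine = escalation_reason({"in_scope": True, "value": 50})
off_topic = escalation_reason({"in_scope": False})
big_ticket = escalation_reason({"value": 5_000})
print(routine, off_topic, big_ticket)
```

Returning the reason, not just a yes or no, lets the escalation itself be logged with the trigger that caused it.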
All customer-facing agents
Staff training completed on AI oversight responsibilities
The people responsible for reviewing agent outputs, managing escalations, and monitoring governance metrics need training. Document what training has been completed, by whom, and when it will be refreshed. AI oversight is a new discipline; do not assume it is covered by existing compliance training.
ISO 42001
Board / Audit Committee briefed on AI governance framework
The board has fiduciary responsibility for AI risk in the same way it has fiduciary responsibility for cyber risk. Ensure the board or audit committee has been briefed on: what AI agents are being deployed, what decisions they make, what governance controls are in place, and what the residual risks are.
SOX FCA Senior Managers Regime
Customer communication about AI decision-making published
Where agents make decisions that affect customers, customers must be informed that AI is involved and what their rights are. Update privacy notice, terms and conditions, and any relevant product documentation. The EU AI Act requires disclosure when AI is used for consequential decisions about individuals.
EU AI Act Art. 50 FCA Consumer Duty
Works council / employee consultation completed (where required)
In Germany, France, the Netherlands, and other EU jurisdictions, deploying AI systems that monitor or make decisions about employees typically requires works council consultation. Take legal advice on your obligations before deploying HR or workforce management agents in affected jurisdictions.
EU jurisdictions
Complaints handling process updated for AI-related complaints
Update your formal complaints handling process to include AI-related complaints: incorrect automated decisions, perceived bias, data handling concerns. Define how AI-related complaints are triaged, who investigates them, what the resolution timeframe is, and how lessons learned feed back into governance improvements.
All consumer-facing deployments
AI governance review scheduled: minimum annual
AI governance is not a one-time exercise. Schedule a formal annual review of all AI agent deployments against this checklist. The review should include: updated bias testing, security assessment, regulatory change impact, and business rule currency check. Document the review and retain it for a minimum of 5 years.
ISO 42001 All sectors
Domain 8: Operational readiness and go-live criteria
Operational readiness & go-live criteria
8 items · Pre-launch gates, monitoring, and first 90 days
Go/no-go governance sign-off completed
Obtain formal sign-off from: Legal (data protection and regulatory compliance), Compliance (governance framework and controls), IT Security (infrastructure and access controls), and the AI Accountable Executive. Document this sign-off with the date and the outstanding items accepted as residual risk.
All deployments
Staged rollout plan defined: pilot before full deployment
Define the pilot population (suggested: 5–10% of intended user base), the pilot duration (minimum 4 weeks), and the criteria for proceeding to full deployment. The pilot should generate enough data to validate governance assumptions while limiting exposure if unexpected behaviour emerges.
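One common way to select a stable pilot population is to hash each user identifier into a bucket from 0 to 99 and admit buckets below the pilot percentage. The sketch below uses SHA-256 for that purpose; the `in_pilot` name and the user-id format are assumptions, and the approach presumes identifiers are stable over the pilot period.

```python
import hashlib


def in_pilot(user_id: str, percent: int) -> bool:
    """Deterministically place a user in one of 100 buckets; admit the first `percent`."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


users = [f"user-{i}" for i in range(10_000)]
pilot = [u for u in users if in_pilot(u, 10)]
share = len(pilot) / len(users)
print(f"pilot share: {share:.1%}")
```

Because membership depends only on the identifier, the same users stay in the pilot across sessions, and raising the percentage later only adds users rather than reshuffling them.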
All deployments
Monitoring dashboard live and assigned
Before go-live, the monitoring dashboard must be live and a named individual assigned to review it daily in the first 30 days. The dashboard should show: accuracy metrics, escalation rate, user satisfaction, compliance flags, and any anomalies in agent behaviour.
All deployments
User-facing documentation and disclosures published
Before any user interacts with the agent, publish: privacy notice update (AI data processing disclosed), terms of service update (AI decision-making disclosed), help documentation explaining how to escalate, and how to request a human review of an AI decision.
GDPR / EU AI Act FCA
Rollback procedure tested and documented
Define what rollback means for your deployment: returning to human-only handling, reverting to a previous model version, or suspending specific agent capabilities. Test the rollback procedure in staging. Define who can authorise a rollback and what the SLA for completing one is.
All deployments
First 90-day review schedule agreed
Schedule formal governance reviews at 30, 60, and 90 days post-launch. Each review should cover: compliance status against this checklist, any incidents or near-misses, bias monitoring results, and whether the go-live governance assumptions remain valid. Assign owners for each review now.
ISO 42001
External legal counsel briefed on deployment
Inform your external legal counsel (data protection, employment, and financial services where relevant) that the agent is being deployed. They should have reviewed the governance framework and be on notice to advise if the regulatory landscape changes in ways that affect your deployment.
Regulated sectors
AI governance policy approved and published internally
Formalise your AI governance approach in a policy document. This should cover: what AI systems are deployed, what decisions they make, what controls are in place, who is accountable, and how employees and customers can raise concerns. The policy should be approved by the board and published internally. This is what you show a regulator who asks about your AI governance.
ISO 42001 FCA
What a well-architected agent platform provides out of the box
Working through this checklist with different agent platforms reveals significant variation in how much governance capability is built in versus how much must be engineered on top. Several of the most demanding checklist items (explainability, audit trail, scope control, PII masking, behavioural correction tracking) are either architectural properties of the platform or expensive afterthoughts requiring significant custom engineering.
The governance case for HippoFabric's graph-based memory architecture is specific. Graph memory is structurally inspectable in a way vector memory is not. You can traverse the activation path that led to a specific agent output. You can identify which weighted connections were active in a specific decision. You can modify specific behavioural rules without retraining the model. For items 2.01 through 2.09 in this checklist, the explainability and audit trail domain, this architectural property reduces the compliance engineering burden significantly.
What Cortex provides for this checklist
The architecture conclusion
Governance is significantly easier with an architecture that is explainable by design. The compliance cost of deploying a black-box agent in a regulated environment (the custom audit logging, the explanation layer, the scope enforcement middleware) often exceeds the cost of the agent platform itself. Architecture choice is a governance decision, not just a technical one.
