AIUC-1 Security Controls
Enforce AIUC-1 Domain B requirements — detect adversarial inputs, prevent endpoint scraping, filter malicious prompts, restrict unauthorized agent actions, and limit output over-exposure.
About This Policy Template
A compliance pack for AIUC-1 Domain B covering controls B002, B004–B006, and B009. It detects prompt injection attacks (direct, role-play, and encoded) and jailbreak attempts (B002); identifies automated scraping and model extraction patterns (B004); filters harmful and harassing content in inputs (B005); flags unauthorized AI agent actions and tool invocations and detects sensitive credentials in agent contexts (B006); and limits output over-exposure, including system information disclosure and prompt extraction (B009). Intended for any organization that exposes AI systems to external users or operates autonomous agents.
Policy Rules (13)
Critical Severity (6)
Agent Scope Violation
Detect AI agent actions exceeding authorized scope or boundaries (AIUC-1 B006)
Harmful Content in Input
Detect user inputs requesting harmful, dangerous, or illegal content (AIUC-1 B005)
Jailbreak Attempt via Role Play
Detect jailbreak attempts using role-play, persona switching, or social engineering (AIUC-1 B002)
Prompt Injection Attempt
Detect prompt injection patterns designed to override system instructions (AIUC-1 B002)
System Prompt Extraction
Detect AI output that reveals system prompt or instructions (AIUC-1 B009)
Unauthorized Tool Invocation
Detect AI agent invoking tools or functions not authorized for its role (AIUC-1 B006)
High Severity (5)
Encoded or Obfuscated Injection
Detect encoded or obfuscated content that may contain adversarial injection (AIUC-1 B002)
Personal Attack or Harassment in Input
Detect harassment, personal attacks, or abusive language in user input (AIUC-1 B005)
Rapid-Fire Request Pattern
Detect automated scraping or model extraction patterns (AIUC-1 B004)
Sensitive Data in Agent Context Window
Detect sensitive credentials or connection strings in agent context (AIUC-1 B006)
System Information Disclosure
Detect AI output disclosing internal system information (AIUC-1 B009)
Medium Severity (2)
Excessive Output Length
Detect excessively long or verbose AI output indicating over-exposure (AIUC-1 B009)
Verbose Data Enumeration
Detect unnecessarily detailed data enumeration in AI output (AIUC-1 B009)
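For readers who want to audit coverage, the rule list above can be restated as data and grouped by severity or by AIUC-1 control. The following is a minimal sketch only: the tuple layout and the names `RULES`, `by_severity`, `by_control`, and `rules_for` are illustrative assumptions, not the template's actual storage format.

```python
from collections import Counter

# (severity, rule name, AIUC-1 control), transcribed from the rule list above.
# The tuple layout is a hypothetical representation chosen for this sketch.
RULES = [
    ("critical", "Agent Scope Violation", "B006"),
    ("critical", "Harmful Content in Input", "B005"),
    ("critical", "Jailbreak Attempt via Role Play", "B002"),
    ("critical", "Prompt Injection Attempt", "B002"),
    ("critical", "System Prompt Extraction", "B009"),
    ("critical", "Unauthorized Tool Invocation", "B006"),
    ("high", "Encoded or Obfuscated Injection", "B002"),
    ("high", "Personal Attack or Harassment in Input", "B005"),
    ("high", "Rapid-Fire Request Pattern", "B004"),
    ("high", "Sensitive Data in Agent Context Window", "B006"),
    ("high", "System Information Disclosure", "B009"),
    ("medium", "Excessive Output Length", "B009"),
    ("medium", "Verbose Data Enumeration", "B009"),
]

# Number of rules per severity level and per control.
by_severity = Counter(severity for severity, _, _ in RULES)
by_control = Counter(control for _, _, control in RULES)


def rules_for(control: str) -> list[str]:
    """Return the names of all rules mapped to the given AIUC-1 control."""
    return [name for _, name, ctrl in RULES if ctrl == control]


if __name__ == "__main__":
    for control in sorted(by_control):
        print(control, rules_for(control))
```

Grouping by control makes it easy to see that B009 carries the most rules (four) while B004 is covered by a single rule.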
Enforcement by Integration
What happens when a violation is detected, based on the enforcement mode and integration type.
| Integration | Block | Approval | Warn | Monitor |
|---|---|---|---|---|
| Version Control (GitHub, GitLab, Bitbucket) | Fail check run / merge request status | Pending check run (held for review) | Neutral check run / comment on PR | Pass check run (silent) |
| Email (Gmail) | Quarantine label, plus violation label (outbound) | Quarantine label (held for review) | Add warning label | Log only |
| Email (Outlook) | Move to quarantine folder, plus flag (outbound) | Move to quarantine (held for review) | Flag + categorize | Log only |
| Messaging (Slack, Teams) | Post violation warning in channel | Post 'held for review' warning | Post warning in channel | Log only |
| Storage (Google Drive, Dropbox, OneDrive) | Move file to quarantine folder | Quarantine file (held for review) | Log only | Log only |
| AI Proxy (OpenAI, Anthropic, Gemini, MCP, Agent) | Block request (return 403) | Hold request and return review ID | Allow request + audit trail | Log only |
| API (REST API) | Return BLOCK outcome (client decides) | Return APPROVAL_REQUIRED + poll URL | Return WARN outcome | Log only |
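Because the REST API row leaves the final decision to the client, a caller needs some logic to map each returned outcome to an action. The sketch below shows one way that could look. Only the outcome names `BLOCK`, `APPROVAL_REQUIRED`, and `WARN` come from the table; the response shape, the `outcome` and `poll_url` field names, and the `Decision` type are assumptions made for illustration and are not a documented API contract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    proceed: bool                   # whether the client should continue with the action
    note: str                       # short explanation suitable for client-side logs
    poll_url: Optional[str] = None  # set only when a review is pending


def decide(response: dict) -> Decision:
    """Map a policy-check response to a client-side decision.

    Assumes a hypothetical response dict with an "outcome" key and,
    for held requests, a "poll_url" key.
    """
    outcome = response.get("outcome")
    if outcome == "BLOCK":
        # The API only reports the outcome; this client chooses to stop.
        return Decision(False, "blocked by policy")
    if outcome == "APPROVAL_REQUIRED":
        # Do not proceed yet; keep the poll URL so the review can be checked later.
        return Decision(False, "held for review", response.get("poll_url"))
    if outcome == "WARN":
        return Decision(True, "allowed with warning")
    # Monitor mode is "log only" on the server side, so nothing blocks the client here.
    return Decision(True, "no blocking outcome reported")
```

A caller would then branch on `decide(resp).proceed` and, when `poll_url` is set, check that URL until the review resolves. Treating unknown outcomes as allowed is a design choice of this sketch; a stricter client could fail closed instead.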
Version History
1 version published
Initial release