OpenAI (PII Redaction Guardrails)
Description
This use case shows how to govern LLM usage with decision-grade controls before any prompt leaves your network.
An internal KnowledgeOps agent may call OpenAI only when guardrails around capabilities, trust, model allow-lists, data classification, PII, region, and budget are satisfied.
Iron Book issues a one-shot, purpose-bound token and evaluates an open policy (Rego/OPA) per call; permitted calls proceed to OpenAI, denied calls return explicit reasons; both are fully auditable via Iron Book's Audit Log.
The script runs two scenarios:
- Allow: internal, no PII → model permitted → within budget → region US → OpenAI call proceeds.
- Deny: PII detected (email, SSN) → classification “internal” → blocked with reason.
Extend with a redaction scenario by, for example, setting purpose="redaction" and adjusting the policy to allow PII only for redaction workflows.
Business Problem & Value
Enterprises love the speed of LLMs but face real risks:
- PII/PHI exposure: prompts can accidentally leak regulated data.
- Model governance: teams quietly switch to unapproved models.
- Runaway spend: token costs are hard to cap at the call site.
- Auditability: approvals and denials need defensible, consistent, compliant evidence.
Traditional IAM can grant access to an API, but not to a specific model under specific business constraints per request. Iron Book closes that gap.
Why Iron Book vs. "Just More OAuth Scopes"?
OAuth proves an app can reach OpenAI; it doesn’t prove that this specific agent, with this capability, under this business context, and within budget, should be allowed to call this model right now. Iron Book adds:
- Agent-level least privilege (capabilities).
- Per-call, one-shot authorization with open, explainable policy.
- Behavioral trust + context (classification, PII, region, spend) in the same decision.
- Unified audit across agents, apps, and clouds.
Iron Book Components
- Verifiable agent identity (DID/VC) + capability descriptors (e.g., openai_infer).
- One-shot tokens bound to audience, action, resource, nonce, and expiry.
- Behavior/context-aware policy (Rego/OPA) that evaluates every call.
- Decision-grade audit: who/what acted, inputs summarized, policy version, reason, etc.
- Vendor-neutral: use with OpenAI today; add Bedrock, AOAI, Vertex with the same pattern.
High-Value Controls Enforced
The policy in this example allows an OpenAI call only when:
- Capability: the agent declares `openai_infer`.
- Trust threshold: `input.trust >= 60` (behaviorally adjustable; comes from Iron Book's trust engine).
- Model allow-list: `model` is one of `gpt-4o-mini`, `gpt-4o`, `o4-mini`.
- Classification: the prompt is `public`, `internal`, or `deidentified`.
- PII rule: deny if `pii_detected == true` (the example uses a simple local detector).
- Region: `"US"` in the demo (extend for your data residency).
- Budget: estimated cost (¢) ≤ remaining daily budget (¢).
You can expand this with purpose-of-use, quiet hours, project codes, model version pins, tenant, business unit, approval ticket IDs, etc., all as simple policy inputs.
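For example, here is a minimal sketch of how two such inputs could become policy conditions. It assumes hypothetical `ticket_id` and `purpose` context fields and a helper rule named `business_context_ok`; none of these are part of the demo policy.

```rego
# Sketch: a helper rule you could add to the policy and then require inside the
# main allow rule (e.g., by adding `business_context_ok` to its body).
allowed_purposes = {"assistant", "summarization"}

business_context_ok if {
    # Hypothetical: every call must reference an approval ticket
    input.context.ticket_id != ""

    # Hypothetical: only specific purposes-of-use are permitted
    allowed_purposes[input.context.purpose]
}
```

You would pass the extra fields in the decision context (just as the script passes its own fields) and add `business_context_ok` as one more line in the allow rule's body.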
Quick Start
pip install ironbook-sdk openai pydantic
python openai_iron.py
SDK Implementation
"""
Secure OpenAI API calls with Iron Book
Enterprise scenario:
- An internal "KnowledgeOps" agent can call OpenAI models only when:
- The agent has capability 'openai_infer'
- Trust score >= 60
- The chosen model is in an allowlist
- Data classification ∈ {'public','internal','deidentified'}
- If PII is detected, the call is denied (extend the policy to allow PII only when purpose is 'redaction')
- Estimated call cost is within remaining daily budget
- Region is allowed (e.g., 'US')
What this script does:
1) Registers an agent in Iron Book with capability 'openai_infer'
2) Uploads a Rego/OPA policy that encodes the above guardrails
3) Defines `secured_openai_infer()` that:
- runs a simple local PII check on the prompt
- estimates token/cost for the model
- obtains a one-shot Iron Book token and requests a policy decision
- if allowed, calls OpenAI Responses API; if denied, raises
4) Runs two demos:
- Allowed normal inference
- Denied (PII present)
5) Feel free to add more demos, or to use this as a template for your own production policy:
- Add more capabilities to the agent
- Edit models on the allowlist
- Edit regions on the allowlist
- Edit classifications on the allowlist
- Add more PII patterns to the PII detection
- Add more cost guardrails
- Etc.
Rego policy build helper: https://play.openpolicyagent.org
"""
import os
import re
import math
import asyncio
from typing import Dict, Any, Optional
from pydantic import BaseModel, Field
# Iron Book SDK (async)
from ironbook_sdk import (
IronBookClient, RegisterAgentOptions, GetAuthTokenOptions,
UploadPolicyOptions, PolicyInput
)
# OpenAI SDK (sync)
from openai import OpenAI
# ------------------------------------------------------------------------------
# Configuration
# ------------------------------------------------------------------------------
# Read keys from the environment when available; fall back to placeholders for the demo
IRONBOOK_API_KEY = os.getenv('IRONBOOK_API_KEY', 'REPLACE ME')
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY', 'REPLACE ME')
IRONBOOK_AUDIENCE = 'https://api.openai.com' # default audience for OpenAI
AGENT_NAME = "knowledgeops-agent"
CAPABILITIES = ["openai_infer"]
# Simple per-model $/1K token pricing (illustrative; keep conservative)
# Adjust to your internal cost table if needed.
MODEL_PRICING_PER_1K = {
"gpt-4o-mini": {"input": 0.15, "output": 0.60}, # $/1k tok (example)
"gpt-4o": {"input": 2.50, "output": 5.00},
"o4-mini": {"input": 0.30, "output": 1.20},
}
# Default budget (in cents) for demo purposes
DEFAULT_DAILY_BUDGET_REMAINING_CENTS = 500 # $5.00 remaining
# ------------------------------------------------------------------------------
# Rego Policy: OpenAI guardrails
# ------------------------------------------------------------------------------
POLICY_CONTENT = """
default allow = false
# Constant sets
allowed_models = {"gpt-4o-mini", "gpt-4o", "o4-mini"}
allowed_classifications = {"public", "internal", "deidentified"}
allow if {
input.action == "infer"
input.resource == "openai://responses"
# Agent has the appropriate capability
input.capabilities[_] == "openai_infer"
# Trust threshold
input.trust >= 60
# Model and classification must be allowed
allowed_models[input.context.model]
allowed_classifications[input.context.data_classification]
# PII must not be present
input.context.pii_detected != true
# Region constraint
input.context.region == "US"
# Budget guardrail
input.context.estimated_cost_cents <= input.context.daily_budget_remaining_cents
}
"""
# ------------------------------------------------------------------------------
# Simple PII detection (demo only; replace with real DLP/PII detection)
# ------------------------------------------------------------------------------
PII_PATTERNS = [
re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), # US SSN-like
re.compile(r"\b\d{10}\b"), # 10-digit numbers (phones, etc.)
re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), # emails
]
def detect_pii(text: str) -> bool:
for pat in PII_PATTERNS:
if pat.search(text or ""):
return True
return False
# ------------------------------------------------------------------------------
# Token & cost estimation helpers (rough)
# ------------------------------------------------------------------------------
def estimate_tokens(text: str) -> int:
# Back-of-the-envelope: 1 token ~ 4 chars (varies widely in practice)
return max(1, math.ceil(len(text) / 4))
def estimate_cost_cents(model: str, prompt_tokens: int, max_output_tokens: int) -> int:
prices = MODEL_PRICING_PER_1K.get(model)
if not prices:
# unknown model: treat as expensive to be safe
prices = {"input": 5.0, "output": 10.0}
input_cost = (prompt_tokens / 1000.0) * prices["input"]
output_cost = (max_output_tokens / 1000.0) * prices["output"]
total_usd = input_cost + output_cost
return int(round(total_usd * 100)) # cents
# ------------------------------------------------------------------------------
# Iron Book bootstrap (async)
# ------------------------------------------------------------------------------
async def ib_bootstrap() -> Dict[str, Any]:
iron = IronBookClient(api_key=IRONBOOK_API_KEY)
# Note: You don't have to re-register the agent every time in the Production scenario
agent_vc = await iron.register_agent(RegisterAgentOptions(
agent_name=AGENT_NAME,
capabilities=CAPABILITIES
))
# Note: You don't have to re-upload the policy every time in the Production scenario
policy = await iron.upload_policy(UploadPolicyOptions(
agent_did=agent_vc["agentDid"],
config_type="opa",
policy_content=POLICY_CONTENT,
metadata={"name": "openai_guardrails", "version": "1.0"}
))
return {"iron": iron, "agent_vc": agent_vc, "policy": policy}
# ------------------------------------------------------------------------------
# Iron Book authorize (async)
# ------------------------------------------------------------------------------
async def ib_authorize_openai_call(agent_did: str, vc: str, policy_id: str, context: Dict[str, Any]) -> None:
iron = IronBookClient(api_key=IRONBOOK_API_KEY)
# Acquire a one-shot token for the OpenAI audience
token_data = await iron.get_auth_token(GetAuthTokenOptions(
agent_did=agent_did,
vc=vc,
audience=IRONBOOK_AUDIENCE
))
access_token = token_data.get("access_token")
if not access_token:
raise RuntimeError("Failed to obtain Iron Book access token")
# Ask for the policy decision
decision = await iron.policy_decision(PolicyInput(
agent_did=agent_did,
policy_id=policy_id,
token=access_token,
action="infer",
resource="openai://responses",
context=context
))
allowed = getattr(decision, "allow", False) if hasattr(decision, "allow") else decision.get("allow", False)
if not allowed:
reason = getattr(decision, "reason", None) if hasattr(decision, "reason") else decision.get("reason", "Denied by policy")
raise PermissionError(f"Iron Book policy denied OpenAI call: {reason}")
# ------------------------------------------------------------------------------
# Secured OpenAI call (sync wrapper)
# ------------------------------------------------------------------------------
class InferenceRequest(BaseModel):
prompt: str = Field(..., description="User prompt text.")
model: str = Field(default="gpt-4o-mini")
max_output_tokens: int = Field(default=200)
data_classification: str = Field(default="internal") # 'public'|'internal'|'deidentified'
purpose: str = Field(default="assistant") # 'assistant'|'redaction'|...
region: str = Field(default="US")
daily_budget_remaining_cents: int = Field(default=DEFAULT_DAILY_BUDGET_REMAINING_CENTS)
def secured_openai_infer(ib_ctx: Dict[str, Any], req: InferenceRequest) -> str:
# 1) Local prompt inspection for PII (signal only; source of truth is policy)
pii = detect_pii(req.prompt)
# 2) Rough cost estimate
prompt_tokens = estimate_tokens(req.prompt)
est_cost_cents = estimate_cost_cents(req.model, prompt_tokens, req.max_output_tokens)
# 3) Build policy context
context = {
"model": req.model,
"region": req.region,
"data_classification": req.data_classification,
"purpose": req.purpose,
"pii_detected": pii,
"estimated_cost_cents": est_cost_cents,
"daily_budget_remaining_cents": req.daily_budget_remaining_cents
}
# 4) Ask Iron Book for a decision (and fetch a one-shot token) — will raise on deny
asyncio.run(ib_authorize_openai_call(
agent_did=ib_ctx["agent_vc"]["agentDid"],
vc=ib_ctx["agent_vc"]["vc"],
policy_id=ib_ctx["policy"]["policyId"],
context=context
))
# 5) If allowed, call OpenAI
client = OpenAI(api_key=OPENAI_API_KEY)
response = client.responses.create(
model=req.model,
input=req.prompt,
max_output_tokens=req.max_output_tokens,
store=False,
)
# Normalize output string (Responses API shape may evolve)
try:
# new SDK: response.output_text available in many builds
return getattr(response, "output_text", None) or str(response)
except Exception:
return str(response)
# ------------------------------------------------------------------------------
# Demo
# ------------------------------------------------------------------------------
def main():
# Bootstrap Iron Book (register agent as a new agent, upload policy)
ib_ctx = asyncio.run(ib_bootstrap())
print("\n=== 1) Allowed normal inference (no PII) ===")
try:
out = secured_openai_infer(ib_ctx, InferenceRequest(
prompt="Summarize our Q3 internal roadmap into three bullet points for executives (no sensitive data).",
model="gpt-4o-mini",
data_classification="internal",
purpose="assistant",
region="US",
max_output_tokens=150
))
print("✅ ALLOW\n", out[:400], "...\n")
except Exception as e:
print("❌ DENY:", e, "\n")
print("=== 2) Denied due to PII being present ===")
try:
out = secured_openai_infer(ib_ctx, InferenceRequest(
prompt="Patient email is [email protected] and SSN is 123-45-6789. Draft a welcome letter.",
model="gpt-4o-mini",
data_classification="internal",
purpose="assistant",
region="US",
max_output_tokens=150
))
print("UNEXPECTED ALLOW:\n", out[:400], "...\n")
except Exception as e:
print("✅ Expected DENY:", e, "\n")
if __name__ == "__main__":
main()
Solution Architecture
Flow (per request):
- Agent identity & capability: The KnowledgeOps agent is registered in Iron Book with a single general-purpose capability `openai_infer`.
- Policy upload: A Rego policy expressing the guardrails is uploaded and versioned.
- Local checks (signal): The script runs lightweight PII detection and a cost estimate (you may plug in a real DLP later).
- One-shot token: The agent requests a short-lived token with `audience=https://api.openai.com`.
- Policy decision: The script calls Iron Book's `policy_decision()` with custom context (model, region, classification, PII flag, costs, budget, etc.).
- Allow → OpenAI: On allow, the script calls `client.responses.create(...)`.
- Deny → Reasoned failure: On deny, the script surfaces the exact reason (e.g., "PII detected").
- Audit: Every allow/deny is logged with agent DID, policy version, context snapshot, reason, trust score, etc.
Key Implementation Elements
1) Agent Capability
Agent name: knowledgeops-agent
Primary capability: openai_infer
In production, you’ll typically have one agent per app/service or per critical workflow; capabilities let you keep least privilege.
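A rough sketch of that pattern, reusing the same SDK calls as the script above (the agent and capability names here are illustrative):

```python
from typing import Any, Dict
from ironbook_sdk import IronBookClient, RegisterAgentOptions

# Sketch: one agent per workflow, each carrying a single narrowly scoped capability.
async def register_workflow_agents(iron: IronBookClient) -> Dict[str, Any]:
    agents = {}
    for agent_name, capability in [
        ("knowledgeops-reports-agent", "openai_infer_reports"),
        ("knowledgeops-support-agent", "openai_infer_support"),
    ]:
        agents[agent_name] = await iron.register_agent(RegisterAgentOptions(
            agent_name=agent_name,
            capabilities=[capability],
        ))
    return agents
```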
2) Policy (Rego/OPA)
The policy evaluates a single, simple `allow` rule (the demo uses the `allow if { ... }` style) with conditions for capability, trust, model/classification allow-lists, PII, region, and budget. It's uploaded through `upload_policy()` and referenced by `policyId` at decision time.
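If you add the redaction scenario mentioned in the description, one way to express it is a second allow rule alongside the demo rule in POLICY_CONTENT; Rego ORs multiple allow rules, so either one can permit the call. A sketch, not the shipped policy:

```rego
# Sketch: companion rule that tolerates PII only for redaction workflows.
allow if {
    input.action == "infer"
    input.resource == "openai://responses"
    input.capabilities[_] == "openai_infer"
    input.trust >= 60
    allowed_models[input.context.model]
    allowed_classifications[input.context.data_classification]

    # PII is present, but the declared purpose is redaction
    input.context.pii_detected == true
    input.context.purpose == "redaction"

    input.context.region == "US"
    input.context.estimated_cost_cents <= input.context.daily_budget_remaining_cents
}
```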
3) Context Inputs
The script builds and passes:
- `model`, `region`, `data_classification`, `purpose` (optional in this demo).
- `pii_detected` (bool) from the local detector.
- `estimated_cost_cents`, `daily_budget_remaining_cents` (numeric).

Add anything else you need (e.g., `project`, `ticket_id`, `environment`, `business_unit`); the policy can reference them immediately.
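For example, the context built in secured_openai_infer() might be extended like this (the extra keys below are illustrative; use whatever your policy actually checks):

```python
# Sketch: the same context dict as in secured_openai_infer(), extended with
# extra business inputs the policy can reference immediately.
context = {
    "model": req.model,
    "region": req.region,
    "data_classification": req.data_classification,
    "purpose": req.purpose,
    "pii_detected": pii,
    "estimated_cost_cents": est_cost_cents,
    "daily_budget_remaining_cents": req.daily_budget_remaining_cents,
    # Illustrative additions; only useful once the policy checks them:
    "project": "knowledgeops",
    "ticket_id": "CHG-1234",
    "environment": "prod",
    "business_unit": "customer-support",
}
```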
4) One-Shot Token
`get_auth_token()` issues a short-lived, one-shot JWT tied to the agent DID/VC, audience, and purpose. Even if intercepted, it quickly expires and is unusable outside the intended call.
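If you want to see those bindings during development, you can decode the token's claims locally without verifying it. A sketch assuming PyJWT (not in the Quick Start install) and standard JWT claim names; the exact claims depend on Iron Book's token format:

```python
import jwt  # PyJWT; add it only for local debugging

def peek_token_claims(access_token: str) -> dict:
    # Decode WITHOUT verifying the signature: inspection only, never for auth.
    # Expect standard JWT fields (aud, exp, jti); Iron Book-specific claim names
    # may differ, so treat the output as a debugging aid rather than a contract.
    return jwt.decode(access_token, options={"verify_signature": False})
```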
5) Decision & Audit
`policy_decision()` returns allow/deny and a reason. Iron Book records:
- agent DID, token/jti, action/resource, policy id with version.
- evaluated context snapshot (sanitized), agent's trust score.
- allow/deny result + reason (useful for support and audits).
Customization Checklist
You can easily extend this demo's features for production use:
- Models: enforce approved model families and versions, and make this multi-provider: duplicate the same guardrails for Bedrock, Azure OpenAI, and Vertex; only the audience, resource, and model sets change.
- Capabilities: split usage by team or workflow (`openai_infer_reports`, `openai_infer_support`, …).
- Classification: drive it from your DLP or gateway; map to "public/internal/secret/regulatory".
- PII/PHI: replace the demo regex with your enterprise DLP or a first-party PII service.
- Budget: read the remaining budget from your cost tracker; deny or downgrade the model dynamically (see the sketch after this list).
- Region: expand to multi-region ("US", "EU", "CA") based on user or tenant location.
- Purpose-of-use: e.g., `assistant`, `redaction`, `summarization`, `RCA`; allow or deny per purpose.
- Evidence bundles: export decision logs to SIEM and attach to control mappings (SOC 2/PCI/HIPAA/AI Act).
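As a small example of the budget item above, here is a hedged sketch of a wrapper around the script's secured_openai_infer() that retries a denied call on a cheaper model (the fallback model name is illustrative and must itself be on the policy's allow-list):

```python
# Sketch: retry a denied call on a cheaper model. In production, inspect the deny
# reason first so you only downgrade on budget denials (not PII or region denials).
def infer_with_downgrade(ib_ctx: Dict[str, Any], req: InferenceRequest,
                         fallback_model: str = "gpt-4o-mini") -> str:
    try:
        return secured_openai_infer(ib_ctx, req)
    except PermissionError:
        # pydantic v2: model_copy(); on v1 use req.copy(update=...)
        cheaper = req.model_copy(update={"model": fallback_model})
        return secured_openai_infer(ib_ctx, cheaper)
```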
This pattern converts LLM usage from “best-effort hygiene” into provable governance, with least privilege, adaptive risk controls, and audits that hold up.
Use this file as a template, plug in your policy inputs, and you’ll have a reusable trust layer for all AI calls across your estate. Reach out to us via [email protected] if you'd like some help.