AI Defense &
Guardrails Engineering
Design and deploy production-grade guardrails, content filters, input validation pipelines, and safety systems for LLM-powered applications. Build defenses that hold up under real adversarial pressure.
Eight weeks. Four build cycles.
Fortnightly modules pair lecture with shipped components. Each block ends in a lab graded on realism, not slide coverage.
AI Threats & Defense Foundations
- Mapping the LLM attack surface in real products (chat, agents, RAG, tools)
- Trust boundaries, data flows, and where controls actually belong
- Grounding defensive design in OWASP LLM Top 10 and incident patterns
- Latency, cost, and false-positive constraints when deploying filters
- Baseline architecture patterns: gateway vs inline vs async review
Threat-model a production-style LLM service, enumerate abuse cases, and prioritize controls with measurable acceptance criteria.
Input Validation, Prompt Hardening & Content Filtering
- Structural and semantic checks on user and retrieved inputs
- Prompt templates, delimiters, and hardening patterns that survive iteration
- Designing tiered moderation: lexical, embedding, and classifier stages
- Handling multilingual and encoded abuse without brittle blocklists
- Operationalising policies: versioning, appeals, and human-in-the-loop
Implement a validation and filtering pipeline for a chat API; tune thresholds against a labelled abuse set and document trade-offs.
Guardrails Architecture, Output Sanitization & Monitoring
- Orchestrating routers, tool allow-lists, and structured output enforcement
- Post-generation checks: schema validation, PII redaction, and citation hygiene
- Streaming responses safely: partial output handling and kill switches
- Metrics, traces, and alerts tuned for policy violations—not generic APM noise
- Feedback loops: analyst review, incident playbooks, and regression tests
Extend a guardrail microservice with output sanitization, OpenTelemetry signals, and dashboards that surface policy breaches in context.
Defense-in-Depth, Adversarial Testing & Capstone
- Layering controls so single bypasses do not collapse the whole system
- Red-team handoff: repro cases, severity rubrics, and fix verification
- Stress testing filters under encoding tricks, multi-turn, and tool misuse
- Capstone: end-to-end defense design for a vulnerable reference platform
- Hardening roadmap: rollout, canarying, and continuous evaluation
Run a guided adversarial test campaign against your capstone stack, file defects with traces, patch, and re-run until acceptance gates pass.
Walk out with shippable defenses.
- Design layered defenses for LLM-backed products with explicit trust boundaries and control points
- Ship input validation, moderation, and routing patterns that balance safety with latency and UX
- Implement output controls, schema enforcement, and redaction suitable for regulated or sensitive contexts
- Instrument policy-aware telemetry and build incident workflows that engineers will actually use
- Execute structured adversarial tests against guardrails and iterate from measurable results
- Deliver a capstone architecture brief and reference implementation reviewers can audit
What you need coming in.
- Strong software or security engineering experience in production environments
- Fluency with HTTP APIs, auth basics, and at least one scripting or typed language
- Prior exposure to integrating or operating LLM features (in-house or vendor APIs)
- Comfort reading structured logs, traces, or metrics—not mandatory to be a data scientist
- Backend or platform engineers owning LLM integrations
- Application security engineers reviewing AI features
- Staff engineers defining safety architecture for product teams
- Trust & safety or policy engineers partnering with infra
- Technical programme managers bridging risk and engineering
Harden the systems everyone relies on.
Eight weeks of structured, lab-first guardrails engineering—demo webinar and optional bootcamp session included. Next cohort details go to the waitlist first.
100+ professionals already on the waitlist. Seats are filling fast.