Ops · Intermediate · Observability · Monitoring · AIOps

AI Observability & AIOps

Monitoring AI Systems Before They Fail

Most AI systems degrade silently through drift, quality regressions, cost spikes, and latency spikes. This workshop is the AIOps playbook for catching those failures before customers do.

Duration
2 hours
Mode
Live online
Audience
CTOs · Engineers · Infra engineers
Schedule
Quarterly · next dates announced via newsletter

01 · What you'll learn

Concrete outcomes by the end

  • Wire the four signals every AI system needs (cost, latency, quality, drift)
  • Detect quality regressions before users complain
  • Catch cost runaways inside the same hour they start
  • Build a per-request audit trail without leaking PII
  • Page on the right thing — not the noise
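As a rough illustration of the per-request audit trail outcome, the sketch below builds one audit record that carries three of the four signals (cost, latency, quality) while keeping direct identifiers out of the log; drift would be computed across many such records. It is not the workshop's own material. The field names, the `AUDIT_KEY` secret, and the email-only redaction rule are all assumptions made for the example, and real PII handling needs far broader coverage than one regex.

```python
import hashlib
import hmac
import json
import re
import time

# Hypothetical secret for the example; load it from a secret manager in practice.
AUDIT_KEY = b"replace-me"

# Deliberately narrow: matches simple email addresses only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def pseudonymize(value: str) -> str:
    """Keyed hash: the same user maps to the same token, but the raw ID is never stored."""
    digest = hmac.new(AUDIT_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def redact(text: str) -> str:
    """Replace email addresses with a placeholder before the text is logged."""
    return EMAIL_RE.sub("[EMAIL]", text)


def audit_record(user_id, prompt, model, latency_ms, cost_usd, quality_score=None):
    """Build one audit entry per request (field names are illustrative)."""
    return {
        "ts": time.time(),
        "user": pseudonymize(user_id),
        "model": model,
        "prompt_redacted": redact(prompt),
        "prompt_chars": len(prompt),
        "latency_ms": latency_ms,
        "cost_usd": cost_usd,
        "quality_score": quality_score,
    }


record = audit_record("user-42", "Email me at jane@example.com", "model-a", 812.0, 0.0031)
print(json.dumps(record, indent=2))
```

Using a keyed hash rather than a plain hash means someone with access to the logs alone cannot confirm a guessed user ID without also holding the key.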

02 · Agenda

What we cover

  1. The four signals

    Hour 1

    Cost, latency, quality, drift — how to instrument each without instrumentation regret.

  2. Quality + drift

    Hour 1

    Online and offline evals, golden-set regression, drift detection.

  3. Cost + ops

    Hour 2

    Per-tenant cost ledgers, anomaly alerts, budget gates.

  4. Alerting that doesn't burn

    Hour 2

    Page on outcomes, not metrics. SLO patterns for AI systems. Runbook templates.
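To make the "Cost + ops" agenda item more concrete, here is a minimal sketch of a per-tenant cost ledger with a budget gate and a simple hourly anomaly check. It is an illustration under stated assumptions, not the workshop's implementation: the `CostLedger` class, the `is_cost_anomaly` helper, and the 3x trailing-mean threshold are all hypothetical choices, and a production ledger would need persistence and time-windowed budgets.

```python
from collections import defaultdict


class CostLedger:
    """In-memory ledger of spend per tenant (hypothetical; no persistence)."""

    def __init__(self, budgets_usd, default_budget_usd=0.0):
        self.budgets = dict(budgets_usd)
        self.default_budget = default_budget_usd
        self.spent = defaultdict(float)

    def record(self, tenant, cost_usd):
        """Add the cost of a completed request to the tenant's running total."""
        self.spent[tenant] += cost_usd

    def allow(self, tenant, estimated_cost_usd):
        """Budget gate: refuse a request that would push the tenant over budget."""
        budget = self.budgets.get(tenant, self.default_budget)
        return self.spent[tenant] + estimated_cost_usd <= budget


def is_cost_anomaly(hourly_costs, current_hour_cost, factor=3.0):
    """Flag the current hour if it exceeds `factor` times the trailing mean."""
    if not hourly_costs:
        return False
    baseline = sum(hourly_costs) / len(hourly_costs)
    return current_hour_cost > factor * baseline


ledger = CostLedger({"acme": 10.0})
ledger.record("acme", 9.5)
print(ledger.allow("acme", 0.4))              # True: 9.9 stays within the 10.0 budget
print(ledger.allow("acme", 0.6))              # False: 10.1 would exceed it
print(is_cost_anomaly([1.0, 1.2, 0.8], 4.5))  # True: 4.5 is above 3x the 1.0 baseline
```

Checking the gate before the call and recording actual cost after it keeps the two concerns separate: the gate works on an estimate, while the ledger stays accurate.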

03 · Who should attend

The right audience

CTOs · Engineers · Infra engineers · Founders

04 · Prerequisites

Come prepared

  • You operate at least one AI service in production
  • Familiar with logging / metrics tooling (any vendor)

05 · Speaker

Hosted by

Pankaj Kharkwal

Founder, Pankh AI

Pankaj builds production AI systems for businesses and runs Pankh AI. He has shipped agents, RAG pipelines, and observability stacks for companies that needed AI to actually work — not just demo.

06 · Outcomes

Why people attend

After this workshop you leave with a concrete artefact you built live and a playbook you can put to use the following week. The cohort chat stays open so you can ask follow-up questions while you ship.

07 · FAQ

Common questions

Vendor-specific or vendor-agnostic?

Vendor-agnostic patterns. We demo with App Insights / OpenTelemetry / Prometheus, but the design generalizes.

Will you cover LLM-as-judge?

Yes — with the honest tradeoffs. We cover when it's safe to use, and when human review is non-negotiable.

More workshops

Keep going

AI Observability & AIOps

₹2,999 · 2 hours