aitp-playground
Run AITP scenario demonstrations end-to-end with real LLM-powered agents
Python FastAPI service that runs Agent Identity & Trust Protocol (AITP) scenario demonstrations end-to-end with real LLM-powered agents. Each scenario spins up a handful of agents that establish cryptographic identity, complete an AITP handshake, and do real LLM work under verifiable, scoped, revocable trust — so you can watch the protocol behave (and fail closed) instead of reading about it.
cp .env.example .env # optional: set OPENAI_API_KEY for real LLM output
docker compose up --build # service on :8000 → open http://localhost:8000/dashboard
How it works:
- Loads scenario packs from scenarios/ (intra-org, cross-org, cross-cloud): declarative YAML, no code.
- Spawns each scenario's agents as their own Python subprocesses (CrewAI / LangChain / LangGraph / custom), each on its own port with its own identity.
- Each agent uses the aitp-py SDK to build its identity and run the 4-message AITP handshake with peers.
- The runner drives capability calls, delegation, revocation, and more between agents, and surfaces a live event stream, narration, metrics, and a web dashboard.
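The "own subprocess, own port" step can be pictured with a few lines of standard-library Python. This is a minimal sketch, not the playground's hosting code: `find_free_port`, `spawn_agent`, and `STUB_WORKER` are hypothetical names, and passing the agent name and port on the command line is an assumption, since the real launch contract is not described here.

```python
from __future__ import annotations

import socket
import subprocess
import sys


def find_free_port() -> int:
    """Ask the OS for a currently unused TCP port on localhost."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))  # port 0 lets the OS choose
        return sock.getsockname()[1]


def spawn_agent(name: str, worker_code: str) -> tuple[int, subprocess.Popen]:
    """Start one agent worker in its own interpreter with its own port.

    ASSUMPTION: name and port are passed as argv; the real workers may
    receive them differently (environment variables, config files, ...).
    """
    port = find_free_port()
    proc = subprocess.Popen(
        [sys.executable, "-c", worker_code, name, str(port)],
        stdout=subprocess.PIPE,
        text=True,
    )
    return port, proc


# Stand-in worker: a real one would serve HTTP and run the AITP handshake.
STUB_WORKER = "import sys; print(f'{sys.argv[1]} listening on {sys.argv[2]}')"
```

The port is released before the child binds it, so another process could grab it in between; a production allocator would need to handle that race.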
This is a demo harness, not production. All AITP protocol logic lives
in aitp-py; this repo contains no envelope signing, JCS, or
handshake state. See docs/aitp-integration.md
for exactly where that boundary sits.
What it demonstrates
One service, ~20 scenarios, each isolating one AITP behavior:
| Area | Scenarios show… |
|---|---|
| Identity | pinned Ed25519/P-256 keys and OIDC (RFC-AITP-0002) ID-token binding |
| Handshake & TCTs | the 4-message mutual handshake; per-call capability authorization |
| Trust gating | a call with no/insufficient TCT is rejected (403), then succeeds after handshake; grant intersection |
| Delegation | single-hop and multi-hop delegation chains with scope narrowing (RFC-AITP-0006 / 0011) |
| Revocation | fail-closed local revocation and propagation through the Control Plane's signed list (RFC-AITP-0008) |
| Lifecycle | key rotation (0007), in-band TCT renewal + a verification cache (0005), session bundles (0010), SPKI pinning |
| Discovery | static localhost, did:web, and Control Plane registry — each with graceful fallback |
| Control Plane | optional enrollment, webhooks, trust-anchor provisioning, delegation-tree observability |
| Resilience | operator-injected faults (manifest_404, peer_offline) that the run survives with structured outcomes |
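The trust-gating and delegation rows both come down to set arithmetic over capability scopes. The sketch below is illustrative only and does not use the aitp-py API: `intersect_grants`, `narrow_chain`, `authorize`, and the capability strings are hypothetical, and it assumes a grant can be modeled as a plain set of capability names, which simplifies whatever the RFCs actually define.

```python
from __future__ import annotations

from typing import Iterable


def intersect_grants(requested: set[str], offered: set[str]) -> set[str]:
    """Effective grant: only the capabilities both sides agree on."""
    return requested & offered


def narrow_chain(root: set[str], hops: Iterable[set[str]]) -> set[str]:
    """Walk a delegation chain; each hop may keep or drop scope, never add."""
    current = set(root)
    for index, hop in enumerate(hops, start=1):
        extra = hop - current
        if extra:
            # Fail closed: a hop that widens scope invalidates the chain.
            raise PermissionError(f"hop {index} widens scope: {sorted(extra)}")
        current = set(hop)
    return current


def authorize(capability: str, grant: set[str] | None) -> int:
    """HTTP-style status: 403 when no grant covers the call, else 200."""
    if not grant or capability not in grant:
        return 403
    return 200


# Before the handshake there is no grant, so the call is rejected.
before = authorize("research.search", None)
# After the handshake the grant is the intersection of ask and offer.
grant = intersect_grants(
    {"research.search", "research.summarize"},
    {"research.search", "write.draft"},
)
after = authorize("research.search", grant)
```

Here `before` is 403 and `after` is 200, mirroring the "rejected, then succeeds after handshake" behavior in the table.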
Everything is optional and degrades cleanly: no LLM key → deterministic
stubs (handshakes still run); no Control Plane → static fallback; an SDK
wheel built without --features experimental → the advanced scenarios
report "feature not available" instead of crashing. Check
GET /capabilities to see what your wheel exposes.
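A client could apply the same "report, don't crash" idea by probing GET /capabilities before picking a scenario. This is a sketch under assumptions: the response shape is not documented in this README, so the `features` key (feature name mapped to a boolean) and the helper names `fetch_capabilities` and `check_features` are hypothetical.

```python
from __future__ import annotations

import json
from urllib.request import urlopen


def fetch_capabilities(base_url: str = "http://localhost:8000") -> dict:
    """GET /capabilities and decode the JSON body."""
    with urlopen(f"{base_url}/capabilities", timeout=5) as resp:
        return json.load(resp)


def check_features(capabilities: dict, required: list[str]) -> dict:
    """Return a structured outcome instead of raising on a missing feature.

    ASSUMPTION: capabilities["features"] maps feature name -> bool.
    """
    features = capabilities.get("features", {})
    missing = [name for name in required if not features.get(name, False)]
    if missing:
        return {"status": "feature not available", "missing": missing}
    return {"status": "ok", "missing": []}
```

With the service running, `check_features(fetch_capabilities(), [...])` would tell you up front whether an advanced scenario is worth launching.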
Documentation
The reader-facing docs are under docs/ (also published to the
docs site) — start with architecture.md:
- Architecture — components, runtime topology, where AITP lives.
- Getting started — install, env, first scenario run, endpoint cheatsheet, CLI.
- Scenarios — YAML schema, workflow step types, authoring guide.
- AITP integration — where the SDK is called; identity, handshake, TCT, delegation, revocation, and the post-v0.1 surfaces (OIDC, renewal, bundles, pinning, multi-hop).
- Observability — SSE events, narration, Prometheus metrics, the dashboard, run persistence.
- Control plane — the optional CP: discovery, enrollment, revocation, webhooks, trust anchors.
- Capabilities — which SDK features the installed wheel exposes, graceful degradation, conformance harness.
Deeper internals and ops mechanics — for hacking on the repo, not on the
docs site — live under
internal_docs/:
the runner engine, the agent-worker pattern, LLM providers, Docker, and the
test suite.
Sibling repos — the source of truth for everything the playground only orchestrates. The docs here link out to these rather than restating them:
- agentidentitytrustprotocol: the normative AITP RFCs and registries.
- aitp-rs: reference Rust runtime; ships the Python SDK from bindings/aitp-py/. Start with its Python SDK guide.
- aitp-control-plane: the optional Control Plane the playground can talk to; see its API docs.
Quick start
Two paths.
Docker (no host toolchain)
The Dockerfile is multi-stage and builds the aitp SDK from the
sibling Rust source for you. The compose files set the build context
to the parent directory so the sibling repo is visible.
cp .env.example .env
$EDITOR .env # set OPENAI_API_KEY=sk-... (optional for stub runs)
# Just run the service:
docker compose up --build
# Or run the full LLM end-to-end test suite (three scenarios, real
# OpenAI, real AITP trust). Exit code of the `tests` container is the
# result.
docker compose -f docker-compose.test.yml up --build --abort-on-container-exit
First image build is ~5 minutes on Apple Silicon (Rust cold compile); subsequent rebuilds are seconds thanks to BuildKit cache mounts.
Native (requires Rust + maturin once)
# 1. Build the aitp-py extension into your active venv (one-time).
# Add `--features experimental` to enable the post-v0.1 surfaces
# (TCT renewal, session bundles, SPKI pinning, the TCT verification
# cache, multi-hop delegation verify). Scenarios needing a feature the
# wheel was built without degrade cleanly — check GET /capabilities to
# see what the installed wheel exposes.
cd ../aitp-rs/bindings/aitp-py
maturin develop --release --features experimental
# 2. Install the service.
cd ../../../aitp-playground
uv sync # or: pip install -e .
# 3. Run.
uv run uvicorn aitp_playground.main:app --reload --port 8000
# 4. Trigger a scenario.
curl -X POST http://localhost:8000/runs \
-H "Content-Type: application/json" \
-d '{"scenario_ref":"intra-org/research-and-write@1.0.0",
"inputs":{"topic":"AI agent trust protocols"}}'
# 5. Watch live events (SSE) or poll:
curl -N http://localhost:8000/runs/<run_id>/events
curl http://localhost:8000/runs/<run_id> | jq .
Agent extras (CrewAI / LangChain / LangGraph + the OpenAI/Anthropic clients) are optional; without them the agents fall back to deterministic stubs and AITP handshakes still run end-to-end. Install when you want real LLM output:
pip install -e ".[all-agents]"
See docs/getting-started.md for the full env reference and the endpoint cheatsheet.
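The curl steps for triggering and watching a run translate to standard-library Python. This is a sketch: the endpoints and request body come from the commands above, but the `run_id` field in the POST response is an assumption, and the event parsing relies on the standard SSE framing (`data:` lines ended by a blank line) rather than anything specific to this service. The helper names are hypothetical.

```python
from __future__ import annotations

import json
from typing import Iterable, Iterator
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"


def build_run_request(scenario_ref: str, inputs: dict, base_url: str = BASE_URL) -> Request:
    """Build the POST /runs request without sending it."""
    body = json.dumps({"scenario_ref": scenario_ref, "inputs": inputs}).encode("utf-8")
    return Request(
        f"{base_url}/runs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def iter_sse_data(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # SSE strips one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            yield "\n".join(buffer)  # blank line closes the event
            buffer = []


def run_and_watch(scenario_ref: str, inputs: dict) -> None:
    """Trigger a run, then print event payloads until the stream closes."""
    with urlopen(build_run_request(scenario_ref, inputs), timeout=10) as resp:
        run = json.load(resp)
    run_id = run["run_id"]  # ASSUMPTION: field name not confirmed here
    with urlopen(f"{BASE_URL}/runs/{run_id}/events") as events:
        for payload in iter_sse_data(events):
            print(payload)
```

Against a running service, `run_and_watch("intra-org/research-and-write@1.0.0", {"topic": "AI agent trust protocols"})` would mirror steps 4 and 5.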
Repo map
aitp-playground/
├── docs/ # reader-facing docs (published to the docs site)
├── internal_docs/ # contributor & build docs (not published)
├── src/aitp_playground/ # FastAPI service — no AITP protocol logic here
│ ├── api/ # routes: /runs /scenarios /agents /capabilities /metrics /dashboard /cp/* /webhooks
│ ├── registry/ # YAML pack loader + index + templates
│ ├── runner/ # scenario engine + run store (+ optional SQLite) + SSE
│ ├── hosting/ # subprocess spawn, identity, port alloc, adapters
│ ├── trust/ # peer resolver + did:web + per-run OIDC issuer
│ ├── observability/ # metrics + event narrator
│ ├── cp_client/ # optional Control Plane client
│ ├── capabilities.py # SDK feature probe (GET /capabilities)
│ └── conformance.py # RFC fixture catalog + readiness
├── agents/ # agent subprocess workers
│ ├── base/ # shared aitp_server / bootstrap / telemetry / llm
│ ├── researcher/ # CrewAI worker
│ ├── writer/ # LangChain worker
│ └── analyzer/ # LangGraph worker
├── scenarios/ # YAML scenario packs (registry on disk)
└── tests/ # unit / integration / scenario / e2e
Tests
# Default unit suite — fast, in-process.
uv run pytest tests/unit/
# Runner integration — spawns real subprocesses, no LLM keys needed.
AITP_E2E=1 uv run pytest tests/integration/test_runner.py -v
# Protocol e2e — delegation/revocation/rotation/etc. under real trust,
# still no LLM keys (best run inside the Docker stack).
AITP_PROTOCOL_E2E=1 uv run pytest tests/integration/test_protocol_e2e.py -v
# Live LLM end-to-end (one-command via Docker, see above).
Full details: internal_docs/testing.md.
License
See LICENSE.