Getting started
Local dev loop, common commands, and a first scenario run.
Prerequisites
- Python 3.11+ (matches pyproject.toml's requires-python).
- uv (recommended) or pip.
- The sibling aitp-rs/bindings/aitp-py SDK built into your active venv. The Dockerized flow builds it for you; for native dev you'll build it once with maturin.
Prefer Docker? Skip to docker.md:
docker compose -f docker-compose.test.yml up --build --abort-on-container-exit
runs the service plus the e2e suite end-to-end with no host toolchain.
One-time SDK install
cd ../aitp-rs/bindings/aitp-py
maturin develop --release # builds the Rust extension into the active venv
Verify:
python -c "import aitp; print(aitp.__version__ if hasattr(aitp,'__version__') else 'ok')"
Install the service
cd aitp-playground
uv sync # or: pip install -e .
Optional agent extras (only needed if you want real LangChain/CrewAI/LangGraph running on the host):
pip install -e ".[all-agents]" # crewai + langchain + langgraph + LLM clients
# or per agent:
pip install -e ".[researcher]"
pip install -e ".[writer]"
pip install -e ".[analyzer]"
Without the extras the agents fall back to deterministic stubs: handshakes and TCTs still happen, only the LLM output is canned. Great for fast iteration on the runner.
Configure
Copy .env.example → .env and edit. The only key the service strictly
needs for real LLM output is OPENAI_API_KEY (or ANTHROPIC_API_KEY if
you set LLM_PROVIDER=anthropic). Everything else has a sensible default.
| Var | Default | Purpose |
|---|---|---|
| PORT | 8000 | uvicorn bind port |
| HOST | 0.0.0.0 | uvicorn bind host |
| SCENARIOS_DIR | ./scenarios | Where the registry walks |
| REGISTRY_CACHE_TTL_MS | 0 | 0 = reload every lookup (hot reload while authoring) |
| AGENT_BASE_PORT | 8100 | First port handed to spawned agents |
| AGENT_PYTHON | python3 | Interpreter for agent subprocesses |
| PLAYGROUND_BASE_URL | http://localhost:8000 | Where agents POST telemetry |
| CP_BASE_URL | (empty) | Optional Control Plane base URL (control-plane.md) |
| CP_API_KEY | (empty) | Optional CP bearer |
| CP_TIMEOUT_MS | 5000 | Per-request timeout for CP calls |
| LLM_PROVIDER | openai | openai or anthropic |
| OPENAI_API_KEY / ANTHROPIC_API_KEY | (empty) | Required for real LLM output |
| OPENAI_MODEL | gpt-4o-mini | Override |
| ANTHROPIC_MODEL | claude-sonnet-4-6 | Override |
| RUN_HISTORY_DB | (empty) | When set, persist runs + events to this SQLite file so they survive a restart. Empty = in-memory only. |
| LOG_LEVEL | INFO | Standard logging level |
See llm-providers.md for provider details.
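To make the defaults in the table concrete, here is a minimal sketch of how a subset of these variables could be read in Python. It is not the service's actual settings code: PlaygroundSettings, load_settings, and required_llm_key are hypothetical names, and treating a blank value the same as an unset one is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class PlaygroundSettings:
    """Hypothetical holder for a subset of the documented variables."""
    port: int
    host: str
    scenarios_dir: str
    registry_cache_ttl_ms: int
    agent_base_port: int
    cp_base_url: Optional[str]
    cp_timeout_ms: int
    llm_provider: str
    run_history_db: Optional[str]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> PlaygroundSettings:
    """Read variables from env, applying the defaults listed in the table."""

    def text(name: str, default: str) -> str:
        # Assumption: a blank value is treated like an unset one.
        return env.get(name, "").strip() or default

    def optional(name: str) -> Optional[str]:
        # "(empty)" defaults in the table are modelled as None here.
        return env.get(name, "").strip() or None

    provider = text("LLM_PROVIDER", "openai").lower()
    if provider not in ("openai", "anthropic"):
        raise ValueError(f"LLM_PROVIDER must be openai or anthropic, got {provider!r}")

    return PlaygroundSettings(
        port=int(text("PORT", "8000")),
        host=text("HOST", "0.0.0.0"),
        scenarios_dir=text("SCENARIOS_DIR", "./scenarios"),
        registry_cache_ttl_ms=int(text("REGISTRY_CACHE_TTL_MS", "0")),
        agent_base_port=int(text("AGENT_BASE_PORT", "8100")),
        cp_base_url=optional("CP_BASE_URL"),
        cp_timeout_ms=int(text("CP_TIMEOUT_MS", "5000")),
        llm_provider=provider,
        run_history_db=optional("RUN_HISTORY_DB"),
        log_level=text("LOG_LEVEL", "INFO"),
    )


def required_llm_key(settings: PlaygroundSettings) -> str:
    """Name of the API key variable needed for real LLM output."""
    if settings.llm_provider == "anthropic":
        return "ANTHROPIC_API_KEY"
    return "OPENAI_API_KEY"
```

Calling load_settings({}) yields the documented defaults, which makes it easy to see what an empty .env would give you.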
Run the service
uv run uvicorn aitp_playground.main:app --reload --port 8000
Hit health to confirm:
curl -s http://localhost:8000/healthz
# {"status":"ok"}
First scenario run
curl -s -X POST http://localhost:8000/runs \
-H "Content-Type: application/json" \
-d '{"scenario_ref":"intra-org/research-and-write@1.0.0",
"inputs":{"topic":"AI agent identity"}}'
# {"run_id":"<uuid>","status":"pending","scenario_ref":"intra-org/research-and-write@1.0.0"}
Watch it run:
# Poll the final state:
curl -s http://localhost:8000/runs/<uuid> | jq .
# Or stream events live (SSE):
curl -N http://localhost:8000/runs/<uuid>/events
Cancel a stuck run:
curl -s -X POST http://localhost:8000/runs/<uuid>/cancel
Useful endpoints
| Endpoint | What |
|---|---|
| GET /healthz | Liveness |
| GET /capabilities | Installed aitp wheel + which experimental features it exposes (capabilities.md) |
| GET /packs | List loaded scenario packs |
| GET /scenarios | List all scenarios with refs |
| GET /scenarios/{pack}/{scenario}@{version} | Full scenario YAML, parsed (+ template list) |
| POST /runs | Start a run (async; returns run_id immediately). Body accepts template to run a variant. |
| GET /runs | List recent runs (in-memory by default; RUN_HISTORY_DB makes them durable) |
| GET /runs/{id} | Full run record incl. outputs and events |
| GET /runs/{id}/status | Just status + event count |
| GET /runs/{id}/events | SSE event stream (replay + live) |
| GET /runs/{id}/narrate | Human-readable narration of the event log (text/plain) |
| GET /runs/{id}/cp-deliveries | CP webhook deliveries this run has received (requires a prior cp_subscribe_webhook step) |
| POST /webhooks/cp/{run_id} | Receiver Control Plane POSTs to during webhook fan-out (HMAC-verified) |
| POST /runs/{id}/cancel | Kill agent subprocesses, mark cancelled |
| GET /agents | List currently-running agent processes |
| GET /metrics | Prometheus metrics (observability.md) |
| GET /dashboard | Single-page trust console (HTML) |
| GET /cp/* | Read-only Control Plane observability projections (control-plane.md) |
| POST /internal/telemetry | Sink for agents, not for external use |
OpenAPI is at http://localhost:8000/docs while the server runs.
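If you'd rather script the start-and-poll flow than paste curl commands, the sketch below does the same thing with the standard library, using POST /runs, GET /runs/{id}/status, and GET /runs/{id}. Several things here are assumptions: the helper names are hypothetical, the status response is assumed to carry a status field, and the terminal status names completed and failed are guesses (this page only shows pending, running, and cancelled).

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Optional

BASE_URL = "http://localhost:8000"
# Assumption: "completed" and "failed" are guessed names; check /docs.
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})

JsonCall = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def http_json(method: str, url: str, body: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Send a request with an optional JSON body and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def start_run(scenario_ref: str, inputs: Dict[str, Any], call: JsonCall = http_json) -> str:
    """POST /runs and return the run_id from the response."""
    reply = call("POST", f"{BASE_URL}/runs", {"scenario_ref": scenario_ref, "inputs": inputs})
    return reply["run_id"]


def wait_for_run(
    run_id: str,
    call: JsonCall = http_json,
    interval_s: float = 1.0,
    max_polls: int = 120,
) -> Dict[str, Any]:
    """Poll the status endpoint, then fetch the full record once the run settles."""
    for _ in range(max_polls):
        status = call("GET", f"{BASE_URL}/runs/{run_id}/status", None)
        if status.get("status") in TERMINAL_STATUSES:
            return call("GET", f"{BASE_URL}/runs/{run_id}", None)
        time.sleep(interval_s)
    raise TimeoutError(f"run {run_id} did not settle after {max_polls} polls")
```

With the server running, wait_for_run(start_run("intra-org/research-and-write@1.0.0", {"topic": "AI agent identity"})) returns the full run record. The call parameter exists so the HTTP layer can be swapped for a fake in tests.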
Scenario authoring CLI
For dev work without spinning up the API:
uv run python -m aitp_playground.cli list
uv run python -m aitp_playground.cli validate
uv run python -m aitp_playground.cli validate scenarios/intra-org/research-and-write
uv run python -m aitp_playground.cli dry-run intra-org/research-and-write@1.0.0 \
--inputs '{"topic":"test"}'
dry-run validates the inputs against the scenario schema and prints
the trust mode, agent list, and workflow steps without spawning
anything. Useful for catching typos before you wait for spawns.
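When the inputs get larger than a single key, hand-quoting JSON on the command line becomes its own source of typos. The sketch below builds the same dry-run invocation from a Python dict so the JSON is always well-formed. The helper names are hypothetical, and it assumes uv is on PATH and that you run it from the aitp-playground directory.

```python
import json
import subprocess
from typing import Any, Dict, List


def dry_run_argv(scenario_ref: str, inputs: Dict[str, Any]) -> List[str]:
    """Build the argv for the documented dry-run command."""
    return [
        "uv", "run", "python", "-m", "aitp_playground.cli",
        "dry-run", scenario_ref,
        # json.dumps handles quoting; no shell is involved, so no escaping.
        "--inputs", json.dumps(inputs),
    ]


def dry_run(scenario_ref: str, inputs: Dict[str, Any]) -> subprocess.CompletedProcess:
    """Run the CLI and capture its output for inspection by the caller."""
    return subprocess.run(
        dry_run_argv(scenario_ref, inputs),
        capture_output=True,
        text=True,
        check=False,
    )
```

The returned CompletedProcess exposes stdout, stderr, and returncode; how the CLI reports invalid inputs is not specified here, so inspect all three rather than relying on one.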
The CLI has a few more subcommands:
uv run python -m aitp_playground.cli new intra-org/my-scenario@1.0.0 # scaffold on disk
uv run python -m aitp_playground.cli lint # cross-scenario refs + step graph
uv run python -m aitp_playground.cli trace intra-org/research-and-write@1.0.0 \
--inputs '{"topic":"AI agents"}' # run against a live server + narrate
uv run python -m aitp_playground.cli conformance # RFC fixture catalog + wheel readiness
trace drives a run against a running playground and streams the
narration; conformance is covered in capabilities.md.
Common dev loops
| You're working on… | Run |
|---|---|
| Scenario YAML | Edit + dry-run; if good, hit POST /runs. With REGISTRY_CACHE_TTL_MS=0 the registry reloads on every lookup, so no restart is needed. |
| Runner engine | uv run pytest tests/integration/test_runner.py -v (uses stubs). |
| Agent worker | Restart the uvicorn server so it re-spawns subprocesses with your changes. |
| Real LLM behavior | Set OPENAI_API_KEY and either run a scenario or AITP_LLM_E2E=1 uv run pytest tests/integration/test_llm_e2e.py. |
See testing.md for the full test layout.
Troubleshooting
- AITP_BOOTSTRAP_FILE not set: an agent is being launched without the env var. Almost always means you ran an agent script directly instead of letting the supervisor spawn it.
- Subprocess exits before AITP_AGENT_READY: supervisor prints the child's stderr. Most common cause: missing optional dep or broken import in the agent's crew.py / chain.py / graph.py.
- tct rejected: ... : the SDK's verify_tct failed. Look in the event log for the prior trust.established to confirm the jti/grants the caller actually holds.
- Run hangs in running forever: usually an agent didn't bind its port within startupTimeoutMs (default 30s). Check the playground logs for the captured stdout/stderr from the child.
- /runs/{id} returns outputs: {}: the run failed before reaching any workflow step. Check error and the event log; the failing step is usually right before run.failed.
When in doubt, the event log is authoritative — every state change in
the runner produces a RunEvent.
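Since the failing step is usually the event right before run.failed, a small helper can pull it out of a run record. This is only a sketch: the event shape is an assumption (a list of dicts with a type key), event_before_failure is a hypothetical name, and the real RunEvent fields may differ, so adjust the key names to whatever GET /runs/{id} returns.

```python
from typing import Any, Dict, List, Optional


def event_before_failure(
    events: List[Dict[str, Any]],
    failure_type: str = "run.failed",
) -> Optional[Dict[str, Any]]:
    """Return the event immediately preceding the first failure event.

    Assumes each event is a dict with a "type" key (hypothetical shape).
    Returns None if there is no failure event or nothing precedes it.
    """
    for index, event in enumerate(events):
        if event.get("type") == failure_type:
            return events[index - 1] if index > 0 else None
    return None


# Made-up sample data for illustration; "step.started" is a hypothetical type.
sample_events = [
    {"type": "trust.established"},
    {"type": "step.started", "step": "research"},
    {"type": "run.failed"},
]
suspect = event_before_failure(sample_events)
```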