Runbook
Operational reference for running Wakeplane in production or staging environments.
Operator warning: Wakeplane has no authentication or RBAC. Bind it to localhost, a trusted subnet, VPN, Tailscale, or a reverse-proxied private network. Do not expose it directly to the public internet. See Security before deploying.
Startup

    WAKEPLANE_DB_PATH=/var/lib/wakeplane/data.db \
    WAKEPLANE_HTTP_ADDR=:8080 \
    WAKEPLANE_WORKER_ID=wrk_prod_01 \
    wakeplane serve

Verify startup:

    curl http://localhost:8080/healthz  # {"ok":true}
    curl http://localhost:8080/readyz   # {"ok":true,"storage":"ok"}

Health Endpoints
| Endpoint | Purpose | Probe type |
|---|---|---|
| GET /healthz | Process is alive | Liveness |
| GET /readyz | Database is reachable | Readiness |
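A deploy script can gate on the readiness probe instead of sleeping for a fixed interval. The sketch below is one way to do that in POSIX shell. `wait_until` is a hypothetical helper (not part of Wakeplane), and it assumes the probe command exits non-zero while the daemon is not ready.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper, not part of Wakeplane. Re-runs COMMAND until it
# exits 0 or ATTEMPTS runs have failed. The body is a subshell, so the
# loop variables do not leak into the caller.
wait_until() (
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    if "$@" >/dev/null 2>&1; then
      exit 0                      # probe succeeded
    fi
    if [ "$i" -ge "$attempts" ]; then
      exit 1                      # out of attempts
    fi
    i=$((i + 1))
    sleep "$delay"
  done
)
```

For example, `wait_until 30 1 curl -fsS http://localhost:8080/readyz` retries once per second for up to 30 attempts. This relies on /readyz answering with a non-2xx status while storage is unreachable, which this page does not confirm; if it always returns 200, inspect the response body instead.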
Shutdown
Send SIGINT or SIGTERM. The daemon emits structured shutdown logs, and logs timeout warnings if the drain exceeds the deadline.
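When stopping a manually started daemon, it helps to wait for the process to exit rather than assume the drain finished. A minimal sketch, assuming you know the daemon's PID and run as the same user; `stop_and_wait` and any limit you pass to it are illustrative, not Wakeplane features or defaults.

```shell
# stop_and_wait PID LIMIT_SECONDS
# Hypothetical helper, not part of Wakeplane. Sends SIGTERM, then polls
# once per second until the process is gone. Returns 0 once it has exited
# and 1 if it is still running after LIMIT_SECONDS.
stop_and_wait() (
  pid=$1
  limit=$2
  waited=0
  kill -TERM "$pid" 2>/dev/null            # ask for a graceful shutdown
  while kill -0 "$pid" 2>/dev/null; do     # signal 0 only tests existence
    if [ "$waited" -ge "$limit" ]; then
      exit 1                               # still draining; check the logs
    fi
    sleep 1
    waited=$((waited + 1))
  done
  exit 0
)
```

A non-zero return means the process outlived the limit, which is a cue to read the shutdown logs before taking stronger action. Under a process supervisor such as systemd, the supervisor's own stop command already sends SIGTERM and waits, so a helper like this is mainly useful outside one.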
Metrics
Scrape GET /v1/metrics.
| Metric | Alert condition | Meaning |
|---|---|---|
| runs_failed_total | Increasing | Executions failing |
| dead_letters_total | > 0 | Runs exhausted all retries |
| claimed_but_expired_total | > 0 | Workers dying mid-execution or lease TTL too short |
| runs_due | Growing over time | Dispatcher not keeping up |
| runs_retry_queued | Growing over time | Retries accumulating |
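The alert conditions in the table can be checked by hand by comparing two scrapes taken some time apart. The sketch below assumes the endpoint returns plain `name value` lines without labels (Prometheus text style), which this page does not confirm; `check_metrics` is a hypothetical helper. Two samples only approximate "growing over time", so treat the output as a hint rather than a paging signal.

```shell
# check_metrics PREV_FILE CURR_FILE
# Hypothetical helper, not part of Wakeplane. Each file holds one scrape,
# assumed to contain unlabeled "name value" lines; "#" comments are skipped.
check_metrics() {
  awk '
    /^#/ || NF < 2      { next }                    # comments, blank lines
    FILENAME == ARGV[1] { prev[$1] = $2; next }     # older scrape
                        { curr[$1] = $2 }           # newer scrape
    END {
      # Counters that should stay at zero.
      if (curr["dead_letters_total"] + 0 > 0)
        print "ALERT dead_letters_total > 0"
      if (curr["claimed_but_expired_total"] + 0 > 0)
        print "ALERT claimed_but_expired_total > 0"
      # Values that should not rise between the two scrapes.
      n = split("runs_failed_total runs_due runs_retry_queued", names, " ")
      for (i = 1; i <= n; i++) {
        m = names[i]
        if ((m in prev) && (m in curr) && curr[m] + 0 > prev[m] + 0)
          print "ALERT " m " increased: " prev[m] " -> " curr[m]
      }
    }
  ' "$1" "$2"
}
```

A typical use is to save one scrape with `curl -fsS http://localhost:8080/v1/metrics > prev.txt`, save a second one to `curr.txt` a few minutes later, and then run `check_metrics prev.txt curr.txt`. No output means none of the conditions matched.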
Status Interpretation
GET /v1/status exposes scheduler timing, worker counts, and run counts.
Common Failures
Runs Stuck In running
Recovery is automatic on next startup through lease-expiry handling.
Runs Stuck In claimed
Runs whose claimed lease has expired are reset to pending.
Dead Letters Accumulating
Inspect target configuration, executor logs, and run receipts.
Database Locked Errors
Ensure no other process is writing to the SQLite file.
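On Linux or macOS, `lsof` can show which processes currently have the database file open. A minimal sketch, assuming `lsof` is installed and that you run it as a user allowed to see the daemon's process; `db_holders` is a hypothetical helper, and the fallback path is the documented default for WAKEPLANE_DB_PATH.

```shell
# db_holders [DB_PATH]
# Hypothetical helper, not part of Wakeplane. Prints the PID of every
# process that has the SQLite file open, one per line. With the daemon
# running, its own PID is expected in the list.
db_holders() (
  db=${1:-${WAKEPLANE_DB_PATH:-./wakeplane.db}}
  # -t prints bare PIDs; "--" ends option parsing before the file name.
  lsof -t -- "$db" 2>/dev/null | sort -u
)
```

Running `db_holders /var/lib/wakeplane/data.db` and passing each PID to `ps -p` identifies the processes. Note that `lsof` lists readers as well as writers, so an extra PID is a lead to investigate rather than proof of a conflicting writer.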
Schedule Not Firing

    wakeplane schedule get <id>
    wakeplane run list

Environment Reference
| Variable | Default | Description |
|---|---|---|
| WAKEPLANE_DB_PATH | ./wakeplane.db | SQLite database file path |
| WAKEPLANE_HTTP_ADDR | :8080 | HTTP listen address |
| WAKEPLANE_WORKER_ID | wrk_local | Worker identity string in lease records |
| WAKEPLANE_SCHEDULER_INTERVAL_SECONDS | 5 | Planner loop tick interval |
| WAKEPLANE_DISPATCHER_INTERVAL_SECONDS | 2 | Dispatcher loop tick interval |
| WAKEPLANE_LEASE_TTL_SECONDS | 30 | Worker lease TTL for crash recovery |
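When debugging timing issues, it can help to print the values the daemon would run with in the current environment. The sketch below resolves each variable against the defaults in the table using shell parameter expansion. It assumes an empty variable is treated like an unset one, which may not match the daemon's behavior; `effective_config` is a hypothetical helper.

```shell
# effective_config
# Hypothetical helper, not part of Wakeplane. Prints each documented
# variable with its current value, or the documented default when the
# variable is unset or empty.
effective_config() {
  printf '%s\n' \
    "WAKEPLANE_DB_PATH=${WAKEPLANE_DB_PATH:-./wakeplane.db}" \
    "WAKEPLANE_HTTP_ADDR=${WAKEPLANE_HTTP_ADDR:-:8080}" \
    "WAKEPLANE_WORKER_ID=${WAKEPLANE_WORKER_ID:-wrk_local}" \
    "WAKEPLANE_SCHEDULER_INTERVAL_SECONDS=${WAKEPLANE_SCHEDULER_INTERVAL_SECONDS:-5}" \
    "WAKEPLANE_DISPATCHER_INTERVAL_SECONDS=${WAKEPLANE_DISPATCHER_INTERVAL_SECONDS:-2}" \
    "WAKEPLANE_LEASE_TTL_SECONDS=${WAKEPLANE_LEASE_TTL_SECONDS:-30}"
}
```

Run it in the same environment the daemon starts from (for example, the same service unit or shell profile) so the output reflects what the process actually sees. If claimed_but_expired_total is non-zero, comparing the printed WAKEPLANE_LEASE_TTL_SECONDS with typical run durations is a reasonable first check.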