Operations & troubleshooting
A Paas deployment is an ordinary Docker Compose stack (docker compose ps shows its shape), so most of your
operational primitives are familiar.
Where to look first
| Symptom | Service to tail |
|---|---|
| git push hangs / refuses | git-server |
| Push accepted but no Deployment row | api (look for /internal/git/post-receive) |
| Deployment stuck Queued | builder |
| Deployment built but app not live | orchestrator |
| App live but URL 404/502 | proxy-controller then nginx |
| TLS not active | cert-manager |
| Database stuck Provisioning | orchestrator (the DatabaseProvisioner runs there) |
docker compose logs -f api orchestrator builder proxy-controller cert-manager
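The table above can be folded into a small helper so you don't have to remember which service goes with which symptom. This is a minimal sketch: the symptom keywords (push, queued, tls, and so on) and the function names are hypothetical shorthand, not part of Paas; only the service names come from the table. The 15-minute window and 200-line tail are arbitrary defaults.

```shell
# Map a symptom keyword to the service(s) worth tailing first.
# The keywords are made up for this sketch; the services come from the table.
paas_services_for() {
  case "$1" in
    push)     echo "git-server" ;;
    no-row)   echo "api" ;;
    queued)   echo "builder" ;;
    not-live) echo "orchestrator" ;;
    http)     echo "proxy-controller nginx" ;;
    tls)      echo "cert-manager" ;;
    database) echo "orchestrator" ;;
    *)        echo "api orchestrator builder proxy-controller cert-manager" ;;
  esac
}

# Follow recent logs for the services matching a symptom.
paas_tail() {
  # Word splitting of the service list is intentional here.
  docker compose logs -f --since 15m --tail 200 $(paas_services_for "$1")
}
```

For example, paas_tail http follows proxy-controller and nginx together, in the order the table suggests checking them.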
Health endpoint
GET /health on the API returns {"ok":true} if the API is up. The
orchestrator/builder/proxy/cert services don't expose HTTP themselves — use
docker compose ps to see if they're Up.
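A combined check might look like the sketch below. It assumes the API is reachable at the URL in PAAS_API_URL; that variable and the localhost:8080 default are hypothetical, so substitute whatever address your installation uses. The body check only looks for "ok":true, as described above.

```shell
# Return 0 when a /health response body reports ok:true.
# Whitespace is stripped first so {"ok": true} also matches.
health_body_ok() {
  printf '%s' "$1" | tr -d ' \t\r\n' | grep -q '"ok":true'
}

# Probe the API, then show the state of every Compose service.
# PAAS_API_URL and its default are assumptions, not Paas settings.
paas_health() {
  body=$(curl -fsS --max-time 5 "${PAAS_API_URL:-http://localhost:8080}/health") || {
    echo "api: unreachable"
    return 1
  }
  health_body_ok "$body" || { echo "api: unexpected body: $body"; return 1; }
  echo "api: ok"
  # The other services expose no HTTP endpoint, so rely on their container state.
  docker compose ps
}
```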
Common failure modes
"I pushed and nothing happened"
- Did the push succeed without an error from the post-receive hook? If you saw [paas] queueing deploy: …, the hook fired. Check: docker compose logs api | grep post-receive.
- If the API responded non-200, the body is in the API logs.
- If you saw no [paas] output, the bare repo's hook isn't installed. That happens when sshd's AuthorizedKeysCommand is failing (check git-server logs) and a key fell back to a default shell. Re-add your SSH key in the dashboard.
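The checklist above can be applied mechanically to captured push output. The sketch below is illustrative only: the function name and the three labels it prints are hypothetical, and the middle case ([paas] output without a queueing line) is not covered by the text above, so it is flagged for manual inspection.

```shell
# Read captured git push output on stdin and report what the hook did.
classify_push_output() {
  out=$(cat)
  if printf '%s\n' "$out" | grep -q '\[paas\] queueing deploy'; then
    echo "hook-fired"       # next: docker compose logs api | grep post-receive
  elif printf '%s\n' "$out" | grep -q '\[paas\]'; then
    echo "hook-ran-other"   # not covered above; read the [paas] message itself
  else
    echo "hook-missing"     # next: check git-server logs, re-add the SSH key
  fi
}
```

Since git prints remote hook output on stderr, a typical invocation would be git push 2>&1 | classify_push_output.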
"Deployment is stuck in Building"
Look at the builder logs. Common causes: the host is out of disk (check with docker system df),
or your Dockerfile requires network access to a registry that's blocked.
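To rule out the disk quickly, a check along these lines may help. It is a sketch under assumptions: /var/lib/docker is Docker's default data root (yours may differ), the 90% threshold is arbitrary, and both function names are made up for illustration.

```shell
# Extract the capacity figure (as a bare integer) from df -P output on stdin.
disk_used_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when the filesystem holding Docker's data is nearly full.
# $1: path to check (default /var/lib/docker), $2: threshold in percent.
check_docker_disk() {
  pct=$(df -P "${1:-/var/lib/docker}" | disk_used_pct)
  if [ "$pct" -ge "${2:-90}" ]; then
    echo "disk ${pct}% full: inspect with 'docker system df'"
    return 1
  fi
  echo "disk ${pct}% full"
}
```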
"App is Live but I get 502s"
- Inspect: docker compose exec nginx cat /etc/nginx/conf.d/<host>.conf. The upstream block should list at least one IP.
- If it lists IPs but they're unreachable, check the paas-apps network: docker network inspect paas-apps. Both nginx and the user container must be members.
- If the upstream block has only server 127.0.0.1:1 down;, the orchestrator has no Healthy containers for the current Release. Look at container_instances: docker compose exec postgres psql -U paas paas -c 'select status, failure_reason from container_instances order by created_at desc limit 10;'.
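The first and last checks above come down to counting usable server lines in the generated config. The following is a rough sketch, assuming the file uses ordinary nginx upstream syntax with one server entry per line; the function name is hypothetical.

```shell
# Count usable "server" entries in an nginx config read from stdin.
# Entries flagged "down" (such as the 127.0.0.1:1 placeholder) are skipped,
# and "server {" virtual-host blocks are ignored because they don't end in ";".
count_live_upstreams() {
  awk '$1 == "server" && /;[ \t]*$/ && !/[ \t]down([ \t;]|$)/ { n++ }
       END { print n + 0 }'
}
```

Feed it the config with docker compose exec -T nginx cat /etc/nginx/conf.d/<host>.conf | count_live_upstreams. A result of 0 means only the placeholder is present, which points at the orchestrator rather than nginx.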
"The cert is stuck Pending forever"
- Run docker compose logs cert-manager; the underlying ACME error is surfaced there.
- Verify port 80 is reachable from the public internet (curl -v http://<host>/.well-known/acme-challenge/test).
- Confirm PAAS_ACME_STAGING=true while debugging; staging has higher rate limits.
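The port 80 check can be scripted as below. This is only a sketch: the function names and verdict labels are hypothetical, and it must be run from a machine outside your network for the result to reflect what the ACME server sees. It relies on curl printing 000 for %{http_code} when no HTTP response arrives.

```shell
# Interpret the HTTP status of a probe to /.well-known/acme-challenge/test.
# Any non-5xx answer (even 404) means port 80 reached a web server.
acme_probe_verdict() {
  case "$1" in
    ""|000) echo "unreachable" ;;
    5??)    echo "server-error" ;;
    *)      echo "reachable" ;;
  esac
}

# Probe the challenge path on a host and print a one-line verdict.
probe_acme_path() {
  code=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' \
    "http://$1/.well-known/acme-challenge/test")
  echo "$1: HTTP $code ($(acme_probe_verdict "$code"))"
}
```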
Backing up
The state-of-record is the control-plane Postgres in volume pg-data,
plus the repos (bare git repos), paas-secrets (master key + ACME
account), and certs (issued certificates) volumes. A serviceable backup
plan:
docker compose exec -T postgres pg_dump -U paas paas | gzip > paas-$(date +%F).sql.gz
docker run --rm -v paas-secrets:/src -v "$PWD":/dst alpine tar czf /dst/secrets-$(date +%F).tgz -C /src .
docker run --rm -v repos:/src -v "$PWD":/dst alpine tar czf /dst/repos-$(date +%F).tgz -C /src .
The local Docker registry (registry-data) is not backed up — images
are reproducible from repos + Dockerfile.
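For scheduled runs, the backup commands above could be wrapped as follows. Treat this as a sketch, not a supported tool: the function names are hypothetical, the destination must be an absolute path (Docker would otherwise read it as a volume name), and the volumes are mounted read-only as an extra precaution that the original commands do not take.

```shell
# Build a dated archive name: backup_name repos tgz -> repos-YYYY-MM-DD.tgz
# An explicit date may be passed as the third argument.
backup_name() {
  echo "$1-${3:-$(date +%F)}.$2"
}

# Archive one named volume. $1: volume, $2: archive prefix, $3: absolute dest dir.
backup_volume() {
  docker run --rm -v "$1":/src:ro -v "$3":/dst alpine \
    tar czf "/dst/$(backup_name "$2" tgz)" -C /src .
}

# Dump Postgres and archive the secrets and repos volumes into $1 (default: $PWD).
paas_backup() {
  dst=${1:-$PWD}
  docker compose exec -T postgres pg_dump -U paas paas |
    gzip > "$dst/$(backup_name paas sql.gz)" &&
    backup_volume paas-secrets secrets "$dst" &&
    backup_volume repos repos "$dst"
}
```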
Restoring on a new host
# (Install Paas as usual, then immediately, before users sign in:)
docker compose -f docker-compose.yml down
for v in paas-secrets pg-data repos; do docker volume create "$v"; done
docker run --rm -v paas-secrets:/dst -v "$PWD":/src alpine tar xzf /src/secrets-….tgz -C /dst
docker run --rm -v repos:/dst -v "$PWD":/src alpine tar xzf /src/repos-….tgz -C /dst
docker compose up -d postgres
gunzip < paas-….sql.gz | docker compose exec -T postgres psql -U paas paas
docker compose up -d
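Before restoring, it can be worth confirming that the archives survived the transfer to the new host. A minimal sketch of such a check follows; the function name is hypothetical, and it only verifies that the files are readable, not that their contents are complete.

```shell
# Return 0 only if every given file is a valid gzip stream, and every
# .tgz among them also lists cleanly as a tar archive.
verify_backups() {
  for f in "$@"; do
    gzip -t "$f" 2>/dev/null || { echo "corrupt gzip: $f"; return 1; }
    case "$f" in
      *.tgz) tar tzf "$f" >/dev/null 2>&1 || { echo "bad tar: $f"; return 1; } ;;
    esac
  done
  echo "ok: $# file(s) verified"
}
```

Run it against the three backup files (the two .tgz archives and the .sql.gz dump) and continue with the restore only if it prints ok.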
Maintenance mode
There's no first-class maintenance mode in v1. To take an app offline
temporarily, set replicas=0 (the upstream block will go to the "down" placeholder
and nginx will return 502). To take the whole platform offline:
docker compose stop nginx
This leaves the dashboard / API unreachable but preserves all state.
Resetting everything
docker compose down -v # WARNING: deletes ALL state including user apps
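Because the command above is irreversible, you may want to put a confirmation step in front of it on shared hosts. The wrapper below is a hypothetical sketch; the phrase and function names are invented for illustration and are not part of Paas.

```shell
# Return 0 only when the typed confirmation matches the phrase exactly.
confirmed() {
  [ "$1" = "delete all paas state" ]
}

# Ask for the phrase on stdin, then wipe containers and volumes.
paas_reset() {
  printf 'Type "delete all paas state" to continue: '
  IFS= read -r answer || return 1
  confirmed "$answer" || { echo "aborted"; return 1; }
  docker compose down -v
}
```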