Ready Is Not the Same as Started

The rolling deploy looked clean. A new pod started. Kubernetes saw the healthcheck pass — php -v returned zero — and began routing traffic to the new container.

For the next forty seconds — out of a possible sixty — that container was polling for the database.

Requests that landed on it during that window got errors. Not many — the window was short — but enough to show up as noise in the monitoring. The kind of noise that gets dismissed as a transient network issue and filed nowhere. The deploy succeeded. The pod eventually became ready. The mechanism that caused it was still there, waiting for the next deploy.

The entrypoint script does five things before FrankenPHP starts: copy a version file, verify the vendor directory, wait up to sixty seconds for the database, run pending migrations, and install assets and set filesystem permissions. In Docker Compose, that startup delay is invisible. In Kubernetes, the gap becomes traffic.

The gap between started and ready

Kubernetes decides whether to send traffic to a pod by watching its readiness probe. A pod whose readiness probe passes receives requests. A pod whose readiness probe fails is removed from the load balancer rotation until it recovers. This is the mechanism that makes rolling deploys safe: Kubernetes doesn’t cut over to a new pod until that pod says it’s ready.

The compose.yaml defines a healthcheck on every service:

healthcheck:
    test: [ "CMD", "php", "-v" ]
    interval: 30s
    timeout: 10s
    retries: 3
    start_period: 10s

php -v succeeds the moment the PHP binary is present, which is true from the first millisecond of container life. The start_period: 10s is a grace period during which failed checks don’t count against retries, but php -v never fails, so the grace period changes nothing: the first check that runs passes. Meanwhile the entrypoint polling loop runs for up to sixty seconds before FrankenPHP even starts. The container reports healthy while the application is still waiting for the database.

The Dockerfile has a better signal:

HEALTHCHECK --start-period=60s CMD curl -f http://localhost:2019/metrics || exit 1

Port 2019 is Caddy’s admin endpoint, embedded directly in FrankenPHP. Its /metrics route is Prometheus-compatible and only responds once Caddy’s HTTP stack is fully initialized and PHP workers are accepting connections. php -v exits in fifty milliseconds regardless of what the application is doing: it checks the binary, not the server. :2019/metrics only answers when the server is actually serving. It is also not an endpoint added just for the probe: every service in the platform already has it scraped by Prometheus, so the signal is live regardless of any healthcheck configuration.
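The same signal can be wired into the Compose healthcheck in place of php -v. A minimal sketch, assuming curl is available in the image (the Dockerfile HEALTHCHECK already relies on it); interval, timeout, and retries are carried over from the existing block, and the 60s start_period mirrors the Dockerfile:

```yaml
healthcheck:
    # Same command as the Dockerfile HEALTHCHECK; CMD-SHELL is needed for "|| exit 1".
    test: [ "CMD-SHELL", "curl -f http://localhost:2019/metrics || exit 1" ]
    interval: 30s
    timeout: 10s
    retries: 3
    # Failures during the entrypoint's database wait are not counted.
    start_period: 60s
```

This only changes what Compose reports as healthy. It has no effect on Kubernetes.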

That’s closer. But in Kubernetes, the HEALTHCHECK instruction is ignored entirely. Kubernetes uses its own probe configuration. Without explicit probe definitions in the Kubernetes manifests, there are no readiness checks — and a pod is considered ready the moment its container starts.

Which means: pod starts, entrypoint begins polling, Kubernetes routes traffic, application is not yet serving. Requests arrive at a container that isn’t ready to handle them.

Three signals, three questions

Kubernetes separates container lifecycle into three distinct questions, each with its own probe type:

startupProbe: “Has the application finished starting?” Runs repeatedly until it passes. Liveness and readiness probes are held back until then, which prevents the liveness probe from killing a container that’s legitimately slow to initialize. For a container whose entrypoint can take sixty seconds, this is the right tool.

readinessProbe: “Is the application ready to handle requests?” Fails and passes throughout the container’s life. When it fails, the pod is removed from the load balancer. This is what makes a rolling deploy safe.

livenessProbe: “Is the application still alive?” If it fails, Kubernetes restarts the container. Meant to catch hung processes, not slow startups.

The sixty-second polling loop belongs in the startupProbe’s patience, not in application code:

startupProbe:
    httpGet:
        path: /metrics
        port: 2019
    failureThreshold: 12    # 12 attempts × 5s = 60s max
    periodSeconds: 5

Once the startupProbe passes, a readinessProbe on the same endpoint takes over, telling Kubernetes when the pod is safe to receive traffic, and a livenessProbe watches for hung processes. But the startupProbe is the one that absorbs the slow start. The entrypoint polling loop becomes redundant: its job was to keep the container alive while the database caught up. Without it, the application attempts to connect, fails, and the container exits. Kubernetes restarts the container with increasing back-off, and the startupProbe begins again on each restart until the database responds and the application starts cleanly. The retry responsibility moves from inside the entrypoint to the orchestrator, which is exactly where it belongs.
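The other two probes can point at the same endpoint. A sketch only: the periods and thresholds are illustrative assumptions, not values from the platform, and it assumes port 2019 is reachable on the pod IP rather than bound to localhost, which the existing Prometheus scraping suggests:

```yaml
readinessProbe:
    httpGet:
        path: /metrics
        port: 2019
    periodSeconds: 5
    failureThreshold: 2     # 2 failures × 5s: out of rotation after about 10s
livenessProbe:
    httpGet:
        path: /metrics
        port: 2019
    periodSeconds: 10
    failureThreshold: 3     # 3 failures × 10s: container restarted after about 30s
```

Keeping the liveness threshold looser than the readiness one means a briefly unresponsive pod is taken out of rotation first and restarted only if it stays unresponsive.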

The migration problem

The polling loop is the most visible issue, but the migrations create a subtler one.

With a rolling deploy and two replicas, Kubernetes starts a new pod while the old one still serves traffic. Both pods run the same entrypoint. Both reach doctrine:migrations:migrate.

Doctrine’s migration table tracks which migrations have already executed, so a completed migration won’t run twice. But if two pods start simultaneously and both see a pending migration, both attempt to run it at the same time. Whether that’s safe depends on the migration: additive schema changes are usually fine; destructive ones less so. And you don’t get to choose which ones run on a deploy that didn’t expect to coordinate. --all-or-nothing wraps migrations in a transaction and rolls back everything if one fails — it’s about atomicity within a single run, not coordination across processes.

The cleaner approach separates the two concerns into two init containers: one that waits for the database, one that runs migrations. The main container starts only after both complete:

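initContainers:
    # Exits non-zero while the database is unreachable. Kubernetes re-runs a
    # failed init container until it succeeds, so this acts as the wait.
    - name: wait-for-db
      image: authentication:latest
      command: ["php", "bin/console", "dbal:run-sql", "-q", "SELECT 1"]
    # Runs only after wait-for-db has succeeded.
    - name: migrate
      image: authentication:latest
      command: ["php", "bin/console", "doctrine:migrations:migrate", "--no-interaction", "--all-or-nothing"]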

Both init containers reuse the application image. That’s not waste: they need the same PHP binary and the same environment wiring to reach the database and resolve the migration classes. A lighter purpose-built image would reduce startup overhead, but would require maintaining a separate PHP installation in sync with the main image.

Even with init containers, multiple pods starting simultaneously (initial deploy, after a node failure, or under autoscaling pressure) will each attempt to run migrations. Solving that properly, through a Helm pre-upgrade hook, a maxSurge: 0 strategy, or a separate migration Job, is a topic in itself. What matters here is that the entrypoint is the wrong place to host that decision: it can’t coordinate across pods, and it ties migration execution to application startup in a way that’s hard to untangle later. The question of which approach fits this codebase, and why the entrypoint hasn’t been replaced, gets its own treatment in the next article in this series.
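To make the Job option concrete, here is a rough sketch of a one-shot migration Job that doubles as a Helm pre-upgrade hook. It is an illustration, not the approach chosen for this codebase: the Job name and backoffLimit are hypothetical, and the image and command are copied from the init container above.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
    name: authentication-migrate          # hypothetical name
    annotations:
        # Only meaningful when deployed through Helm.
        "helm.sh/hook": pre-install,pre-upgrade
        "helm.sh/hook-delete-policy": before-hook-creation
spec:
    backoffLimit: 3                       # assumption: give up after 3 retries
    parallelism: 1                        # one migration process per release
    completions: 1
    template:
        spec:
            restartPolicy: Never
            containers:
                - name: migrate
                  image: authentication:latest
                  command: ["php", "bin/console", "doctrine:migrations:migrate", "--no-interaction", "--all-or-nothing"]
```

Because the Job runs once per release rather than once per pod, the number of replicas no longer determines how many processes attempt the migration.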

Factor XII of the twelve-factor methodology — admin processes run in the same environment as the application — is satisfied either way. The question is whether “same environment” means “same entrypoint script” or “same image, separate process”. In Kubernetes, the latter is safer.

What the entrypoint’s real job is

Strip out the database wait (now a startupProbe or init container), the migrations (now an init container or Job), and the assets install (a build-time operation that belongs in the Dockerfile), and the entrypoint has one remaining job: start the application.

exec docker-php-entrypoint "$@"

Factor IX of the twelve-factor app asks for fast startup and graceful shutdown. A container whose startup takes sixty seconds because it’s waiting for external dependencies is not fast. It means rolling deploys are slow, recovery after a crash is slow, and horizontal scale-out creates a sixty-second gap before each new pod contributes.

Fast startup is not just a nice-to-have. It’s what makes the rest of the cloud model work. When a pod can start in seconds, the orchestrator can scale aggressively and recover quickly. When it takes a minute, you add headroom everywhere — longer probe timeouts, larger deployment windows, more conservative scaling policies — and the system becomes rigid.

The Docker Compose tax

The entrypoint accumulates these responsibilities for a reason. In Docker Compose, there is no init container concept. There is no startupProbe. Services declare depends_on, but without health conditions, that’s just startup ordering — not readiness. The entrypoint fills the gap.
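For comparison, Compose can express readiness rather than mere ordering when the dependency defines its own healthcheck. A sketch under stated assumptions: the service names, the PostgreSQL image, and the pg_isready check are illustrative and not taken from the actual compose.yaml.

```yaml
services:
    authentication:
        depends_on:
            database:
                # Wait for the dependency's healthcheck, not just its start.
                condition: service_healthy
    database:
        image: postgres:16                # assumption: PostgreSQL
        healthcheck:
            test: [ "CMD-SHELL", "pg_isready -U postgres" ]
            interval: 5s
            timeout: 3s
            retries: 12
```

That covers the database wait only. Migrations and asset installation still have to run somewhere.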

This is not a design flaw. It’s a reasonable adaptation to the constraints of Docker Compose. The script works. It handles edge cases (the database timeout, unrecoverable errors, missing migrations directory). Someone tested it.

The issue is the assumption that the same script works equally well in Kubernetes. It runs. The application eventually starts. But it bypasses the probe system that makes Kubernetes deployments reliable, and it puts migration responsibility in a place where coordination across pods is difficult to reason about.

Several of the changes in this series (media storage, secrets in image layers, log handlers, service dependencies, CI environment parity, cache adapters) were changes to application code or configuration. This one is different. It requires the infrastructure to gain awareness of what “ready” means for this application, and it requires the entrypoint to give up responsibilities it currently owns.

That’s a harder conversation. But the startupProbe is waiting for it.
