<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Cache on Guillaume Delré</title><link>https://guillaumedelre.github.io/tags/cache/</link><description>Recent content in Cache on Guillaume Delré</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sat, 16 May 2026 15:00:00 +0000</lastBuildDate><atom:link href="https://guillaumedelre.github.io/tags/cache/index.xml" rel="self" type="application/rss+xml"/><item><title>The Cache That Was Lying to Us</title><link>https://guillaumedelre.github.io/2026/05/16/the-cache-that-was-lying-to-us/</link><pubDate>Sat, 16 May 2026 15:00:00 +0000</pubDate><guid>https://guillaumedelre.github.io/2026/05/16/the-cache-that-was-lying-to-us/</guid><description>Part 6 of 8 in &amp;quot;Symfony to the Cloud: Twelve Factors, Thirteen Services&amp;quot;: How a single config line blocked horizontal scaling across 13 Symfony microservices, and what the twelve-factor app had to say about it.</description><category>symfony-to-the-cloud</category><content:encoded><![CDATA[<p>The first time we ran two replicas of the same Symfony service behind a load balancer, everything looked fine. Health checks passed. Traffic split cleanly. Response times were good.</p>
<p>Then someone noticed the rate limiter was acting strange. Hit the API five times, get blocked. Send the very next request, get through. Depending on which pod answered, you were a different person.</p>
<p>That was the cache talking. One config line, replicated across thirteen services, was blocking horizontal scaling entirely.</p>
<h2 id="one-config-file-thirteen-times">One config file, thirteen times</h2>
<p>We were preparing a platform of thirteen Symfony microservices to move to Kubernetes. The stack was already in good shape: FrankenPHP for the HTTP server, multi-stage Dockerfiles, a GitLab CI that pushed tagged images to a cloud registry. The pieces were there. We just needed to verify nothing would break when we started scaling pods horizontally.</p>
<p>A good checklist for that kind of audit is the <a href="https://12factor.net" target="_blank" rel="noopener noreferrer">twelve-factor app methodology</a> — twelve principles for building software that runs cleanly in cloud environments. Most factors were already covered without us doing anything deliberate about it.</p>
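<p>Factor VII (port binding) came for free. FrankenPHP ships the Caddy web server and the PHP runtime as a single binary, so the container exposes its own HTTP endpoint with no Apache or Nginx to bolt on. The image is self-contained, which is exactly what the factor requires. Even the health check talks to the embedded server directly, here through Caddy&rsquo;s admin endpoint:</p>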
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-dockerfile" data-lang="dockerfile"><span style="display:flex;"><span><span style="color:#66d9ef">HEALTHCHECK</span> --start-period<span style="color:#f92672">=</span>60s CMD curl -f http://localhost:2019/metrics <span style="color:#f92672">||</span> exit <span style="color:#ae81ff">1</span>
</span></span></code></pre></div><p>Factor II (dependencies) was handled by <code>composer.json</code> and the Dockerfile extensions. Factor X (dev/prod parity) was covered enough for our scope: same image, same backing services locally and in CI, which is the part that actually matters for what we were auditing.</p>
<p>Then I got to Factor VI.</p>
<h2 id="the-problem-with-it-works-on-one-server">The problem with &ldquo;it works on one server&rdquo;</h2>
<p>Factor VI says processes must share nothing. Nothing written to disk between requests, nothing in local memory that another instance can&rsquo;t see. If you need to persist state, put it in a backing service — a database, a cache cluster, a queue. The process itself stays disposable.</p>
<p>I opened <code>authentication/config/packages/cache.yaml</code>. Then <code>content/config/packages/cache.yaml</code>. Then <code>media/config/packages/cache.yaml</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">framework</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">cache</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">app</span>: <span style="color:#ae81ff">cache.adapter.filesystem</span>
</span></span></code></pre></div><p>Thirteen services. Thirteen times, word for word.</p>
<p>Every instance of every service was writing its cache to the local filesystem. Which meant every pod had its own private cache, invisible to every other pod. When the load balancer sent a request to pod A, it got pod A&rsquo;s cached version of reality. Pod B had built its own. They might have been generated at different times, from different source data, or one of them might not have been built yet at all.</p>
<p>The rate limiter was the most visible symptom because it had a counter. But the same divergence affected every piece of data we were caching: serializer metadata, route collections, Doctrine result caches. Two users sending identical requests could get different responses depending on which node happened to pick up the connection.</p>
<h2 id="redis-was-already-there">Redis was already there</h2>
<p>This is the part that stings a little. Redis was already in the stack. Every service had it configured via SncRedisBundle:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#75715e"># config/packages/snc_redis.yaml — present on all 13 services</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">snc_redis</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">clients</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">default</span>:
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">type</span>: <span style="color:#e6db74">&#39;phpredis&#39;</span>
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">alias</span>: <span style="color:#e6db74">&#39;default&#39;</span>
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">dsn</span>: <span style="color:#e6db74">&#39;%env(IN_MEM_STORE__URI)%&#39;</span>
</span></span></code></pre></div><p>Factor IV of the twelve-factor app says backing services should be attached resources, interchangeable through configuration. Redis was exactly that: reachable via an environment variable, ready to be swapped for a managed instance in the cloud. The plumbing was done. We just weren&rsquo;t using it for the application cache.</p>
<p>Some services even had it right for specific pools. The rate limiter in the authentication service:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">pools</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">rate_limiter.cache</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">adapter</span>: <span style="color:#ae81ff">cache.adapter.redis</span>
</span></span></code></pre></div><p>Which explains the inconsistency we saw first. The rate limit <em>count</em> went to Redis (shared across pods). The cache backing the rate limit <em>check</em> went to the filesystem (local to the pod). Two sources of truth, one invisible to the other.</p>
<p>The fix was a two-line change per service:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">framework</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">cache</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">app</span>: <span style="color:#ae81ff">cache.adapter.redis</span>
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">default_redis_provider</span>: <span style="color:#ae81ff">snc_redis.default</span>
</span></span></code></pre></div><p>Thirteen files. Thirteen identical changes. The kind of fix that makes you feel like you should have caught it earlier, except it&rsquo;s perfectly invisible when you&rsquo;re running a single instance.</p>
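<p>A variant worth sketching, for services that would rather not route the cache through the SncRedisBundle client: Symfony&rsquo;s <code>default_redis_provider</code> also accepts a Redis DSN directly. This reuses the <code>IN_MEM_STORE__URI</code> variable from the SncRedisBundle config and assumes its value is a <code>redis://</code> DSN that the cache component can parse, which is an assumption rather than something we verified on every service.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#75715e"># config/packages/cache.yaml (illustrative sketch, not the actual file)</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">framework</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">cache</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">app</span>: <span style="color:#ae81ff">cache.adapter.redis</span>
</span></span><span style="display:flex;"><span>        <span style="color:#75715e"># assumption: the variable holds a redis:// DSN</span>
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">default_redis_provider</span>: <span style="color:#e6db74">&#39;%env(IN_MEM_STORE__URI)%&#39;</span>
</span></span></code></pre></div><p>Either way the provider comes from the environment, so the same image can point at a local Redis container or a managed instance without a rebuild.</p>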
<h2 id="what-needs-to-move-to-redis">What needs to move to Redis</h2>
<p>The filesystem cache violated Factor VI (processes carry local state they shouldn&rsquo;t) and Factor VIII (you can&rsquo;t scale out without sharing that state). They&rsquo;re the same problem seen from two angles: VI describes what&rsquo;s wrong, VIII describes what you can&rsquo;t do because of it.</p>
<p>With a shared cache backend, a second pod is safe. The two pods build the same cache, see the same invalidations, agree on the same rate limits. You can add a third pod under load and remove it when traffic drops. The orchestrator handles it; the application doesn&rsquo;t need to know.</p>
<p>Without it, horizontal scaling is a liability. More pods means more divergence, more &ldquo;works on my machine&rdquo; bugs that are impossible to reproduce locally because local only runs one container.</p>
<p>Sessions had the same problem, and potentially a worse one. Twelve of the thirteen services were using <code>session.storage.factory.native</code>, which writes sessions to the local filesystem by default. A user whose request lands on pod A gets a session stored on pod A. Their next request goes to pod B: session gone, they&rsquo;re logged out. Only one service had <code>RedisSessionHandler</code> configured.</p>
<p>The partial mitigation is that most of the platform runs stateless JWT-based APIs, so session usage is limited. But &ldquo;limited&rdquo; isn&rsquo;t &ldquo;zero&rdquo;. The services that do create sessions — authentication flows, temporary state during OAuth handshakes — have a user-visible failure mode waiting for the second pod. Either those sessions get moved to Redis, or the code that creates them gets removed. Leaving them as-is is a decision that waits for the first user whose session disappears without explanation.</p>
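<p>For the services that keep their sessions, the move could look roughly like this. It is a sketch following the pattern from the Symfony session documentation. It assumes the <code>snc_redis.default</code> client service is a phpredis client that <code>RedisSessionHandler</code> accepts, and the file layout is illustrative.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#75715e"># config/services.yaml: register the handler on top of the existing Redis client</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">services</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">Symfony\Component\HttpFoundation\Session\Storage\Handler\RedisSessionHandler</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">arguments</span>:
</span></span><span style="display:flex;"><span>            - <span style="color:#e6db74">&#39;@snc_redis.default&#39;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e"># config/packages/framework.yaml: tell the session layer to use it</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">framework</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">session</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">handler_id</span>: <span style="color:#ae81ff">Symfony\Component\HttpFoundation\Session\Storage\Handler\RedisSessionHandler</span>
</span></span></code></pre></div><p>With the handler in Redis, a session created on pod A is readable from pod B, which is the whole point.</p>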
<h2 id="the-other-kind-of-state">The other kind of state</h2>
<p>Redis fixes the cross-pod problem. FrankenPHP introduces a different one worth knowing about.</p>
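<p>In the standard PHP-FPM model, every request starts from a clean slate. FPM reuses its pool of worker processes, but PHP tears down all request state once the response is sent: every in-memory object, every computed result dies with it. From the application&rsquo;s point of view, the process is stateless by construction.</p>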
<p>FrankenPHP has a worker mode that doesn&rsquo;t follow that model. In worker mode, a single PHP process boots once, loads the kernel, wires the container, and handles multiple successive requests without restarting. Request throughput improves: no autoloader cold start, no container rebuild per request, fewer allocations. The tradeoff is that the PHP process now has a lifecycle that spans requests.</p>
<p>For cache, this adds a wrinkle. An <code>array</code> adapter or APCu pool accumulates entries across requests on the same worker. A cache invalidation pushed to Redis reaches the other pods immediately — but doesn&rsquo;t clear what&rsquo;s sitting in a worker&rsquo;s in-process memory. Two requests on the same pod can see different things: one hits a warm in-memory entry, the next triggers a Redis fetch after the in-process entry expires.</p>
<p>The platform keeps worker mode disabled (<code>APP__WORKER_MODE__ENABLED=false</code>). It&rsquo;s available — the infrastructure is there, the flag is wired — but it&rsquo;s not active. The performance gain didn&rsquo;t justify the audit. Every cache pool would need to be verified against worker-mode semantics; every place where state leaks between requests would become a potential bug.</p>
<p>The conservative position: keep PHP stateless at the process level even when the runtime doesn&rsquo;t require it. Factor VI&rsquo;s shared-nothing principle applies not just to the filesystem — it applies to the process itself.</p>
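<p>If worker mode were ever switched on, one mechanism Symfony offers for that audit is the <code>kernel.reset</code> tag: the named method of each tagged service is called between requests so the service can drop whatever it accumulated. A minimal sketch, where <code>App\Cache\RequestScopedCache</code> and its <code>reset()</code> method are hypothetical names, not classes from the platform:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#75715e"># config/services.yaml</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">services</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># hypothetical service that memoizes values in a PHP array</span>
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">App\Cache\RequestScopedCache</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">tags</span>:
</span></span><span style="display:flex;"><span>            - { <span style="color:#f92672">name</span>: <span style="color:#ae81ff">kernel.reset</span>, <span style="color:#f92672">method</span>: <span style="color:#ae81ff">reset</span> }
</span></span></code></pre></div><p>Services that implement <code>Symfony\Contracts\Service\ResetInterface</code> get this tag through autoconfiguration. The tag only helps for state you know about, which is why the audit itself would still be needed.</p>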
<h2 id="what-was-already-working">What was already working</h2>
<p>To be fair to the codebase: the Symfony Scheduler was already using Redis for distributed locks:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span>$schedule<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">lock</span>($this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">lockFactory</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">createLock</span>(<span style="color:#e6db74">&#39;schedule_purge&#39;</span>));
</span></span></code></pre></div><p>In a multi-pod environment, you don&rsquo;t want five instances running the same purge job simultaneously. The lock prevents it. Redis makes the lock visible across pods. Whoever wrote the scheduler knew exactly what they were doing.</p>
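<p>The snippet doesn&rsquo;t show where the lock factory gets its store. One way it could be wired, sketched here under the assumption that the same environment variable holds a DSN the Lock component understands:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#75715e"># config/packages/lock.yaml (illustrative sketch)</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">framework</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#75715e"># assumption: a redis:// DSN, so locks live in Redis and not on the pod</span>
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">lock</span>: <span style="color:#e6db74">&#39;%env(IN_MEM_STORE__URI)%&#39;</span>
</span></span></code></pre></div><p>With a Redis store behind the default lock factory, a lock taken by one pod is visible to all the others. A file-based or semaphore store would only protect against concurrency inside a single pod.</p>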
<p>The same reasoning just hadn&rsquo;t propagated to the cache configuration — probably because when you&rsquo;re running a single instance, <code>cache.adapter.filesystem</code> is invisible. It works, it&rsquo;s fast, it requires zero configuration. The problem only appears at two.</p>
<h2 id="the-four-questions">The four questions</h2>
<p>Factor VI catches a lot of applications off guard during a cloud migration. Not because developers don&rsquo;t know about stateless processes (they usually do), but because the filesystem is always there, and the problem stays hidden until you try to run a second instance.</p>
<p>Before scaling a Symfony service horizontally, four questions are worth answering:</p>
<ul>
<li>Where does the application cache go? (<code>cache.adapter.filesystem</code> needs to become <code>cache.adapter.redis</code>)</li>
<li>Where do sessions go? (<code>session.storage.factory.native</code> needs Redis — or remove sessions entirely if you&rsquo;re JWT-only)</li>
<li>Does anything write to <code>var/</code> at runtime that another pod would need to read?</li>
<li>Is there anything in your code path that needs to be mutually exclusive across pods? (if yes, that&rsquo;s a job for the <a href="https://symfony.com/doc/current/components/lock.html" target="_blank" rel="noopener noreferrer">Symfony Lock component</a> backed by Redis, not a local mutex)</li>
</ul>
<p>If the answers all point to shared backing services, you&rsquo;re ready. If any of them points to the local filesystem, production will find the pod that built its cache three hours ago and serve it to the user who least expects it.</p>
]]></content:encoded></item></channel></rss>