<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Doctrine on Guillaume Delré</title><link>https://guillaumedelre.github.io/tags/doctrine/</link><description>Recent content in Doctrine on Guillaume Delré</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sun, 17 May 2026 15:00:00 +0000</lastBuildDate><atom:link href="https://guillaumedelre.github.io/tags/doctrine/index.xml" rel="self" type="application/rss+xml"/><item><title>Eleven Out of Twelve</title><link>https://guillaumedelre.github.io/2026/05/17/eleven-out-of-twelve/</link><pubDate>Sun, 17 May 2026 15:00:00 +0000</pubDate><guid>https://guillaumedelre.github.io/2026/05/17/eleven-out-of-twelve/</guid><description>Part 8 of 8 in &amp;quot;Symfony to the Cloud: Twelve Factors, Thirteen Services&amp;quot;: Eleven factors resolved cleanly. The twelfth: Doctrine migrations in the entrypoint, waiting on a governance question that code alone can&amp;#39;t answer.</description><category>symfony-to-the-cloud</category><content:encoded><![CDATA[<p>The <code>composer.json</code> in each service had this in its <code>post-install-cmd</code> section:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-json" data-lang="json"><span style="display:flex;"><span><span style="color:#e6db74">&#34;post-install-cmd&#34;</span><span style="color:#960050;background-color:#1e0010">:</span> [
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;bin/console cache:clear --env=prod&#34;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#e6db74">&#34;bin/console doctrine:migrations:migrate --no-interaction&#34;</span>
</span></span><span style="display:flex;"><span>]
</span></span></code></pre></div><p><code>post-install-cmd</code> runs during <code>composer install</code>, which in the production Dockerfile runs during the image build. There is no database available during a Docker build. The migration command either failed silently, or connected to nothing, or was skipped by Doctrine when it couldn&rsquo;t find a schema to compare against. In any case, it didn&rsquo;t migrate anything.</p>
<p>This is a clean violation of <a href="https://12factor.net/admin-processes" target="_blank" rel="noopener noreferrer">Factor XII</a>: admin processes (migrations, one-off scripts, console tasks) should run in the same environment as the application, against the actual production data. Running them at build time inverts the relationship. The image shouldn&rsquo;t know about the database. The database should be there when the image needs it.</p>
<h2 id="the-move-to-the-entrypoint">The move to the entrypoint</h2>
<p>The migration command moved from <code>composer.json</code> to <code>docker-entrypoint.sh</code>. The shift looks small on a diff. The implications are not.</p>
<p>The entrypoint runs when the container starts, not when the image is built. The database is reachable. The entrypoint waits for it — up to 60 seconds, one attempt per second — before doing anything:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-sh" data-lang="sh"><span style="display:flex;"><span>ATTEMPTS_LEFT_TO_REACH_DATABASE<span style="color:#f92672">=</span><span style="color:#ae81ff">60</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">until</span> <span style="color:#f92672">[</span> $ATTEMPTS_LEFT_TO_REACH_DATABASE -eq <span style="color:#ae81ff">0</span> <span style="color:#f92672">]</span> <span style="color:#f92672">||</span> <span style="color:#ae81ff">\
</span></span></span><span style="display:flex;"><span>  DATABASE_ERROR<span style="color:#f92672">=</span><span style="color:#66d9ef">$(</span>php bin/console dbal:run-sql -q <span style="color:#e6db74">&#34;SELECT 1&#34;</span> 2&gt;&amp;1<span style="color:#66d9ef">)</span>; <span style="color:#66d9ef">do</span>
</span></span><span style="display:flex;"><span>    sleep <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>    ATTEMPTS_LEFT_TO_REACH_DATABASE<span style="color:#f92672">=</span><span style="color:#66d9ef">$((</span>ATTEMPTS_LEFT_TO_REACH_DATABASE <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span><span style="color:#66d9ef">))</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">done</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">if</span> <span style="color:#f92672">[</span> $ATTEMPTS_LEFT_TO_REACH_DATABASE -eq <span style="color:#ae81ff">0</span> <span style="color:#f92672">]</span>; <span style="color:#66d9ef">then</span>
</span></span><span style="display:flex;"><span>    echo <span style="color:#e6db74">&#34;</span>$DATABASE_ERROR<span style="color:#e6db74">&#34;</span>
</span></span><span style="display:flex;"><span>    exit <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">fi</span>
</span></span></code></pre></div><p>If the database doesn&rsquo;t respond within 60 seconds, the container exits with an error and Kubernetes restarts it. Once the database is ready, the migration runs:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-sh" data-lang="sh"><span style="display:flex;"><span><span style="color:#66d9ef">if</span> <span style="color:#f92672">[</span> <span style="color:#e6db74">&#34;</span><span style="color:#66d9ef">$(</span> find ./migrations -iname <span style="color:#e6db74">&#39;*.php&#39;</span> -print -quit <span style="color:#66d9ef">)</span><span style="color:#e6db74">&#34;</span> <span style="color:#f92672">]</span>; <span style="color:#66d9ef">then</span>
</span></span><span style="display:flex;"><span>    php bin/console doctrine:migrations:migrate --no-interaction --all-or-nothing
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">fi</span>
</span></span></code></pre></div><p>Two changes from the original command: <code>--all-or-nothing</code> ensures that if any migration in a batch fails, the entire batch rolls back. And the <code>find</code> guard skips the command entirely if there are no migration files — useful for services that don&rsquo;t use Doctrine migrations at all.</p>
<p>This is genuinely better. The database is present. The migration runs in the real environment. The <code>--all-or-nothing</code> flag adds atomicity that the build-time version never had.</p>
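<p>The wait loop is not specific to databases. A minimal sketch of the same bounded retry as a reusable function, assuming a hypothetical <code>wait_for</code> helper and <code>WAIT_INTERVAL</code> variable that do not appear in the original entrypoint:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-sh" data-lang="sh">#!/bin/sh
# Hypothetical helper, not part of the original entrypoint.
# Usage: wait_for ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS probes have failed.
wait_for() {
    attempts_left=$1
    shift
    until [ "$attempts_left" -eq 0 ] || "$@"; do
        # WAIT_INTERVAL is an assumed knob; it defaults to one second.
        sleep "${WAIT_INTERVAL:-1}"
        attempts_left=$((attempts_left - 1))
    done
    # Exit status is non-zero when every attempt failed.
    [ "$attempts_left" -ne 0 ]
}
</code></pre></div><p>With this helper the database check would read <code>wait_for 60 php bin/console dbal:run-sql -q "SELECT 1"</code>, followed by an <code>exit 1</code> when it returns non-zero. Unlike the original loop, the sketch does not capture the probe output for a final error message.</p>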
<h2 id="what-it-doesnt-solve">What it doesn&rsquo;t solve</h2>
<p>Two pods redeploying simultaneously both run the entrypoint. Both reach the database. Both find pending migrations. Both call <code>doctrine:migrations:migrate</code>.</p>
<p>Doctrine Migrations keeps a record rather than a lock: a <code>doctrine_migration_versions</code> table lists which migrations have run, and the command checks it before applying anything. When pods start one after another this is fine: the second pod finds the table up to date and exits cleanly. The real failure modes are more specific: a migration slow enough that a second runner reads the table before the first has recorded its version, and starts the same migration; or a pod that crashes mid-migration before recording the version in the table, leaving the schema in an applied-but-unregistered state that the next pod will try to apply again.</p>
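<p>The crash case is easier to see in a toy model. The sketch below is an illustration only, with hypothetical names throughout: <code>RECORDED</code> stands in for the versions table and <code>APPLIED</code> for the real schema. It ignores transactions entirely.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-sh" data-lang="sh">#!/bin/sh
# Toy model only: RECORDED stands in for the doctrine_migration_versions
# table and APPLIED for the real schema. All names here are hypothetical.
RECORDED=""
APPLIED=""

run_migration() {
    version=$1
    crash_before_record=$2

    # Step 1: skip versions that are already recorded.
    case " $RECORDED " in
        *" $version "*) return 0 ;;
    esac

    # Step 2: apply the schema change.
    APPLIED="$APPLIED $version"

    # A crash here leaves the change applied but unrecorded.
    if [ "$crash_before_record" = "yes" ]; then
        return 1
    fi

    # Step 3: record the version.
    RECORDED="$RECORDED $version"
}
</code></pre></div><p>Calling <code>run_migration V1 yes</code> and then <code>run_migration V1 no</code> leaves <code>V1</code> in <code>APPLIED</code> twice: the second runner has no way to know the first one got as far as step 2.</p>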
<p>The team&rsquo;s position is explicit: a brief deployment downtime is acceptable. Application versions aren&rsquo;t necessarily forward-compatible with older schema versions, so running N and N+1 simultaneously against the same database isn&rsquo;t safe anyway. The deployment strategy is Recreate: all old pods are terminated before any new pods start. The migration runs on first startup, no overlap between versions. It works.</p>
<p>But &ldquo;it works&rdquo; and &ldquo;it&rsquo;s the right architecture&rdquo; are different answers.</p>
<h2 id="what-would-be-different">What would be different</h2>
<p><a href="https://12factor.net/admin-processes" target="_blank" rel="noopener noreferrer">Factor XII</a>
 says admin processes should run as &ldquo;one-off processes&rdquo;: a process that runs once, for a specific purpose, against the production environment. The entrypoint is not one-off: it runs every time a container starts, including restarts, scaling events, and pods being rescheduled onto another node.</p>
<p>Three alternatives exist, each with a different answer to the question of ownership:</p>
<p><strong>A Kubernetes init container</strong> runs before the main container starts, in the same pod. It could run the migration, exit, and let the main container start only after it succeeds. The migration is isolated from the application runtime. The downside: the init container is another image to build and maintain, and it runs on every pod start, so a thirteen-service platform starting simultaneously still has a potential race.</p>
<p><strong>A Kubernetes Job</strong> runs once, on demand or triggered by a deployment pipeline. It can be made to run before any pods are updated — serial, isolated, with a clear success or failure signal. The race condition goes away. The complexity moves to the deployment process: the Job must complete before the Deployment rollout begins, and the CI pipeline must coordinate both.</p>
<p><strong>A Helm hook</strong> is the same concept expressed declaratively in the Helm chart. A <code>pre-upgrade</code> hook runs the migration before the application pods are updated. It&rsquo;s the most idiomatic Kubernetes answer. It also means the Helm chart is now responsible for running migrations — a decision that belongs to whoever owns the chart.</p>
<p>That last sentence is why the entrypoint hasn&rsquo;t changed. Moving migrations out of the application means deciding that the deployment infrastructure — not the application itself — is responsible for the schema. It&rsquo;s a governance question as much as a technical one, and governance questions take longer to resolve than code changes.</p>
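<p>For comparison, the Job variant could look like this as a pipeline step. It is a sketch under assumptions, not the team&rsquo;s setup: the <code>migrate_then_deploy</code> function, the manifest arguments, the Job name, the five-minute timeout and the <code>KUBECTL</code> override are all hypothetical.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-sh" data-lang="sh">#!/bin/sh
# Hypothetical pipeline step. Manifest paths, the Job name and the
# KUBECTL override are illustrative assumptions.
KUBECTL=${KUBECTL:-kubectl}

migrate_then_deploy() {
    job_manifest=$1
    job_name=$2
    deployment_manifest=$3

    # Run the migration once, as a one-off Job.
    $KUBECTL apply -f "$job_manifest" || return 1

    # Block until the Job completes. A Job that fails never reaches this
    # condition, so the wait times out and the rollout below is skipped.
    $KUBECTL wait --for=condition=complete --timeout=300s "job/$job_name" || return 1

    # Only now update the application pods.
    $KUBECTL apply -f "$deployment_manifest"
}
</code></pre></div><p>The ordering is the whole point: schema first, pods second, with one place where the deployment can stop. Job pod templates are immutable, so a real pipeline would also need a fresh Job name per release or a cleanup step, which the sketch leaves out.</p>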
<h2 id="the-honest-end">The honest end</h2>
<p>The migration block in the entrypoint is two lines. Literally: the <code>if [ &quot;$( find ./migrations... )&quot; ]</code> guard, and the <code>php bin/console doctrine:migrations:migrate</code> that follows. Eleven other factors have clean resolutions. The cache moved to Redis. The logs go to stdout. The filesystem is an S3 bucket. The CI assembles production images from the same commit it tests. The secrets don&rsquo;t travel in image layers.</p>
<p>Factor XII has an answer. It&rsquo;s just not the final one.</p>
<p>The migrations run at startup, with a real database, with atomicity, with a bounded retry window. That&rsquo;s better than running at build time against nothing. Whether they eventually move to a Job or a Helm hook is a conversation about who owns the schema — a question that a <code>kubectl apply</code> can&rsquo;t answer.</p>
]]></content:encoded></item><item><title>PostgreSQL full-text search through Doctrine, without a line of raw SQL</title><link>https://guillaumedelre.github.io/2025/02/10/postgresql-full-text-search-through-doctrine-without-a-line-of-raw-sql/</link><pubDate>Mon, 10 Feb 2025 00:00:00 +0000</pubDate><guid>https://guillaumedelre.github.io/2025/02/10/postgresql-full-text-search-through-doctrine-without-a-line-of-raw-sql/</guid><description>How we layered custom DBAL types and DQL wrappers on top of postgresql-for-doctrine to bring PostgreSQL full-text search to a Symfony API Platform project.</description><content:encoded><![CDATA[<p>The search box on the media library returned results in 800 milliseconds on staging. Production had forty times more rows. The query plan showed a sequential scan: no index involved, no way to fix it with a standard B-tree. The product team also wanted multi-word search: type &ldquo;interview president&rdquo;, get results containing both words. A <code>LIKE</code> query with wildcards has no clean way to express that without multiple independent conditions, each requiring its own scan.</p>
<p>PostgreSQL has had built-in full-text search for over fifteen years. The platform was already on PostgreSQL. The catch: the project uses Doctrine ORM, and Doctrine doesn&rsquo;t natively know what a <code>tsvector</code> is.</p>
<p>A community library, <a href="https://github.com/martin-georgiev/postgresql-for-doctrine" target="_blank" rel="noopener noreferrer">postgresql-for-doctrine</a>, covers part of that gap. It registers basic DQL functions like <code>TO_TSQUERY</code>, <code>TO_TSVECTOR</code>, and the <code>@@</code> match operator as separate atomic pieces. The foundation was there. Three things still had to be built on top.</p>
<h2 id="the-type-doctrine-has-never-seen">The type Doctrine has never seen</h2>
<p><a href="https://www.postgresql.org/docs/current/datatype-textsearch.html" target="_blank" rel="noopener noreferrer">PostgreSQL&rsquo;s full-text search</a> is built around two types: <code>tsvector</code> (a pre-processed list of normalized tokens) and <code>tsquery</code> (a search expression). You maintain a <code>tsvector</code> column, index it with GIN, and query with the <code>@@</code> match operator.</p>
<p>Doctrine&rsquo;s DBAL ships no <code>tsvector</code> type. Declaring <code>#[ORM\Column(type: 'tsvector')]</code> without registering it first throws an <code>UnknownColumnTypeException</code>. The fix is a custom DBAL type:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">TsVector</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">Type</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">final</span> <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">string</span> <span style="color:#a6e22e">DBAL_TYPE</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#39;tsvector&#39;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getSQLDeclaration</span>(<span style="color:#66d9ef">array</span> $column, <span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">self</span><span style="color:#f92672">::</span><span style="color:#a6e22e">DBAL_TYPE</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getName</span>()<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">self</span><span style="color:#f92672">::</span><span style="color:#a6e22e">DBAL_TYPE</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">convertToDatabaseValueSQL</span>(<span style="color:#a6e22e">string</span> $sqlExpr, <span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">sprintf</span>(<span style="color:#e6db74">&#34;to_tsvector(&#39;simple&#39;, %s)&#34;</span>, $sqlExpr);
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">convertToDatabaseValue</span>(<span style="color:#a6e22e">mixed</span> $value, <span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#a6e22e">mixed</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#a6e22e">is_array</span>($value) <span style="color:#f92672">&amp;&amp;</span> <span style="color:#a6e22e">isset</span>($value[<span style="color:#e6db74">&#39;data&#39;</span>])) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span> $value[<span style="color:#e6db74">&#39;data&#39;</span>];
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">is_string</span>($value) <span style="color:#f92672">?</span> $value <span style="color:#f92672">:</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getMappedDatabaseTypes</span>(<span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#66d9ef">array</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> [<span style="color:#a6e22e">self</span><span style="color:#f92672">::</span><span style="color:#a6e22e">DBAL_TYPE</span>];
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The interesting method is <code>convertToDatabaseValueSQL()</code>. Doctrine calls it to wrap the SQL placeholder before the value reaches the database. The written value automatically becomes <code>to_tsvector('simple', ?)</code> at the DBAL boundary with no extra step needed on the calling side.</p>
<p>Register the type in <code>doctrine.yaml</code>, then map the column on the entity:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">doctrine</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">dbal</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">types</span>:
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">tsvector</span>: <span style="color:#ae81ff">App\Doctrine\DBAL\Types\TsVector</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#75715e">#[ORM\Column(type: &#39;tsvector&#39;, nullable: true)]
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">protected</span> <span style="color:#f92672">?</span><span style="color:#a6e22e">string</span> $textSearch <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span></code></pre></div><p>PHP-side, the value is a plain string. The conversion to a proper <code>tsvector</code> happens invisibly at the DBAL layer.</p>
<p>We used the <code>'simple'</code> configuration, which lowercases tokens but applies no language-specific stemming or stop-word removal. The platform handles multiple languages, and French stemming rules would break Spanish. Matching on exact words is good enough here.</p>
<h2 id="keeping-the-column-current">Keeping the column current</h2>
<p>A <code>tsvector</code> column is derived data: it has to stay in sync with the source fields whenever the entity changes. A Doctrine event listener handles that:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#75715e">#[AsDoctrineListener(event: Events::prePersist)]
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">#[AsDoctrineListener(event: Events::preUpdate)]
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">MediaTsVectorSubscriber</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">prePersist</span>(<span style="color:#a6e22e">PrePersistEventArgs</span> $event)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#f92672">!</span>$event<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getObject</span>() <span style="color:#a6e22e">instanceof</span> <span style="color:#a6e22e">Media</span>) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span>;
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>        $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">updateTextSearch</span>($event<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getObject</span>());
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">preUpdate</span>(<span style="color:#a6e22e">PreUpdateEventArgs</span> $event)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#f92672">!</span>$event<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getObject</span>() <span style="color:#a6e22e">instanceof</span> <span style="color:#a6e22e">Media</span>) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span>;
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>        $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">updateTextSearch</span>($event<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getObject</span>());
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">private</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">updateTextSearch</span>(<span style="color:#a6e22e">Media</span> $entity)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        $entity<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">setTextSearch</span>(
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">sprintf</span>(<span style="color:#e6db74">&#39;%s %s&#39;</span>, $entity<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getTitle</span>(), $entity<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getCaption</span>())
</span></span><span style="display:flex;"><span>        );
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Before every persist and update, the listener concatenates the fields that should be searchable into <code>textSearch</code>. Doctrine flushes the combined string, the DBAL type wraps it in <code>to_tsvector('simple', ...)</code>, and PostgreSQL stores the tokenized form.</p>
<p>One subtlety: the PHP-side value is <code>&quot;title caption&quot;</code>, not the actual tsvector output. The database shows <code>'caption' 'title'</code> (sorted tokens), but the entity holds a plain string. That&rsquo;s expected: the conversion is a DBAL responsibility, not a PHP one. It can be confusing to debug until you remember where the boundary is.</p>
<h2 id="extending-dql-with-fts-operators">Extending DQL with FTS operators</h2>
<p>Doctrine&rsquo;s DQL covers common SQL operations, but anything PostgreSQL-specific is out of scope. That&rsquo;s where <code>postgresql-for-doctrine</code> starts: it registers <code>TO_TSQUERY</code>, <code>TO_TSVECTOR</code>, and <code>TSMATCH</code> as individual DQL functions. Writing a full-text query in DQL without it would mean dropping to native SQL entirely.</p>
<p>The library&rsquo;s functions are atomic, though. Each maps to one SQL call. Expressing a full match check in DQL looks like <code>TSMATCH(o.textSearch, TO_TSQUERY(:term))</code>. Readable enough, but the team wanted something more compact: a single DQL function that encodes both the match operator and the query type, including <code>websearch_to_tsquery</code>, which <code>postgresql-for-doctrine</code> didn&rsquo;t ship.</p>
<p>The solution is <a href="https://www.doctrine-project.org/projects/doctrine-orm/en/latest/cookbook/dql-user-defined-functions.html" target="_blank" rel="noopener noreferrer">custom DQL functions</a> via <code>FunctionNode</code>. You parse the DQL syntax, then emit SQL. All FTS functions share the same two-argument signature, so an abstract base class handles parsing:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">abstract</span> <span style="color:#66d9ef">class</span> <span style="color:#a6e22e">TsFunction</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">FunctionNode</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#a6e22e">PathExpression</span><span style="color:#f92672">|</span><span style="color:#a6e22e">Node</span><span style="color:#f92672">|</span><span style="color:#66d9ef">null</span> $ftsField <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#a6e22e">PathExpression</span><span style="color:#f92672">|</span><span style="color:#a6e22e">Node</span><span style="color:#f92672">|</span><span style="color:#66d9ef">null</span> $queryString <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">parse</span>(<span style="color:#a6e22e">Parser</span> $parser)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">match</span>(<span style="color:#a6e22e">TokenType</span><span style="color:#f92672">::</span><span style="color:#a6e22e">T_IDENTIFIER</span>);
</span></span><span style="display:flex;"><span>        $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">match</span>(<span style="color:#a6e22e">TokenType</span><span style="color:#f92672">::</span><span style="color:#a6e22e">T_OPEN_PARENTHESIS</span>);
</span></span><span style="display:flex;"><span>        $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">ftsField</span> <span style="color:#f92672">=</span> $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">StringPrimary</span>();
</span></span><span style="display:flex;"><span>        $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">match</span>(<span style="color:#a6e22e">TokenType</span><span style="color:#f92672">::</span><span style="color:#a6e22e">T_COMMA</span>);
</span></span><span style="display:flex;"><span>        $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">queryString</span> <span style="color:#f92672">=</span> $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">StringPrimary</span>();
</span></span><span style="display:flex;"><span>        $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">match</span>(<span style="color:#a6e22e">TokenType</span><span style="color:#f92672">::</span><span style="color:#a6e22e">T_CLOSE_PARENTHESIS</span>);
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Each concrete class implements <code>getSql()</code> to emit its PostgreSQL expression:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#75715e">// e.textSearch @@ websearch_to_tsquery(&#39;simple&#39;, :term)
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">TsWebsearchQueryFunction</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">TsFunction</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getSql</span>(<span style="color:#a6e22e">SqlWalker</span> $sqlWalker)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">ftsField</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">dispatch</span>($sqlWalker)
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">.</span><span style="color:#e6db74">&#34; @@ websearch_to_tsquery(&#39;simple&#39;, &#34;</span>
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">.</span>$this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">queryString</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">dispatch</span>($sqlWalker)<span style="color:#f92672">.</span><span style="color:#e6db74">&#39;)&#39;</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// ts_rank(e.textSearch, to_tsquery(:term)) for relevance ordering
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">TsRankFunction</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">TsFunction</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getSql</span>(<span style="color:#a6e22e">SqlWalker</span> $sqlWalker)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#e6db74">&#39;ts_rank(&#39;</span>
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">.</span>$this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">ftsField</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">dispatch</span>($sqlWalker)
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">.</span><span style="color:#e6db74">&#39;, to_tsquery(&#39;</span><span style="color:#f92672">.</span>$this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">queryString</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">dispatch</span>($sqlWalker)<span style="color:#f92672">.</span><span style="color:#e6db74">&#39;))&#39;</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The functions are then registered as DQL string functions in the Doctrine configuration:</p><div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">doctrine</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">orm</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">entity_managers</span>:
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">default</span>:
</span></span><span style="display:flex;"><span>                <span style="color:#f92672">dql</span>:
</span></span><span style="display:flex;"><span>                    <span style="color:#f92672">string_functions</span>:
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">tswebsearchquery</span>: <span style="color:#ae81ff">App\Doctrine\ORM\Query\AST\Functions\TsWebsearchQueryFunction</span>
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">tsrank</span>: <span style="color:#ae81ff">App\Doctrine\ORM\Query\AST\Functions\TsRankFunction</span>
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">tsquery</span>: <span style="color:#ae81ff">App\Doctrine\ORM\Query\AST\Functions\TsQueryFunction</span>
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">tsplainquery</span>: <span style="color:#ae81ff">App\Doctrine\ORM\Query\AST\Functions\TsPlainQueryFunction</span>
</span></span></code></pre></div><p><code>websearch_to_tsquery</code> is the right choice for user-facing search: spaces become AND, quoted strings become phrases, <code>-word</code> excludes a term. No need to teach users to type <code>interview &amp; president</code>. It was added in PostgreSQL 11. On older versions, <code>plainto_tsquery</code> is the closest equivalent.</p>
<h2 id="the-api-platform-filter-and-the-gin-index">The API Platform filter and the GIN index</h2>
<p>With the DQL functions registered, the API Platform filter is straightforward. A custom <code>AbstractFilter</code> calls the DQL function directly in the <code>QueryBuilder</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">TextSearchFilter</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">AbstractFilter</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">filterProperty</span>(
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">string</span> $property,
</span></span><span style="display:flex;"><span>        $value,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">QueryBuilder</span> $queryBuilder,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">QueryNameGeneratorInterface</span> $queryNameGenerator,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">string</span> $resourceClass,
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">?</span><span style="color:#a6e22e">Operation</span> $operation <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">array</span> $context <span style="color:#f92672">=</span> []
</span></span><span style="display:flex;"><span>    )<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#e6db74">&#39;textSearch&#39;</span> <span style="color:#f92672">!==</span> $property <span style="color:#f92672">||</span> <span style="color:#66d9ef">empty</span>($value)) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span>;
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        $queryBuilder
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">andWhere</span>(<span style="color:#e6db74">&#39;tswebsearchquery(o.textSearch, :value) = true&#39;</span>)
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">setParameter</span>(<span style="color:#e6db74">&#39;value&#39;</span>, $value);
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getDescription</span>(<span style="color:#a6e22e">string</span> $resourceClass)<span style="color:#f92672">:</span> <span style="color:#66d9ef">array</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> [];
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Apply it on the entity alongside the index declaration:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#75715e">#[ORM\Index(
</span></span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">columns</span><span style="color:#f92672">:</span> [<span style="color:#e6db74">&#39;text_search&#39;</span>],
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">name</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;media_text_search_idx_gin&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">options</span><span style="color:#f92672">:</span> [<span style="color:#e6db74">&#39;USING&#39;</span> <span style="color:#f92672">=&gt;</span> <span style="color:#e6db74">&#39;gin (text_search)&#39;</span>]
</span></span><span style="display:flex;"><span>)]
</span></span><span style="display:flex;"><span><span style="color:#75715e">#[ApiFilter(TextSearchFilter::class, properties: [&#39;textSearch&#39; =&gt; &#39;partial&#39;])]
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Media</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span>    <span style="color:#75715e">#[ORM\Column(type: &#39;tsvector&#39;, nullable: true)]
</span></span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#f92672">?</span><span style="color:#a6e22e">string</span> $textSearch <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The <code>USING gin</code> option is non-negotiable. A standard B-tree index on a <code>tsvector</code> column is useless: PostgreSQL can&rsquo;t use it for <code>@@</code> queries. GIN (Generalized Inverted Index) works differently: it indexes each token individually, so lookups by any token are <code>O(log n)</code> rather than <code>O(n)</code>. Without it, you&rsquo;ve built a fast-looking system that still does a full table scan.</p>
<p>A <code>GET /media?textSearch=interview+president</code> now hits the GIN index and returns in single-digit milliseconds, without scanning the whole table.</p>
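<p>To see why a token-level index helps, here is a toy inverted index in plain PHP. It is a conceptual sketch only: the function names and sample documents are invented for illustration, and PostgreSQL&rsquo;s actual GIN storage is considerably more elaborate.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">&lt;?php

declare(strict_types=1);

/**
 * Toy inverted index: token =&gt; ids of the documents containing it.
 *
 * @param array&lt;int, string&gt; $documents id =&gt; text (invented sample data)
 * @return array&lt;string, list&lt;int&gt;&gt;
 */
function buildInvertedIndex(array $documents): array
{
    $index = [];
    foreach ($documents as $id =&gt; $text) {
        // Lowercase and split on non-word characters, dropping empty tokens.
        $tokens = preg_split(&#39;/\W+/&#39;, strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        foreach (array_unique($tokens) as $token) {
            $index[$token][] = $id;
        }
    }

    return $index;
}

/** Ids of the documents containing every term (AND semantics). */
function searchAll(array $index, array $terms): array
{
    $result = null;
    foreach ($terms as $term) {
        // One direct lookup per term; the document texts are never read here.
        $ids = $index[strtolower($term)] ?? [];
        $result = null === $result ? $ids : array_values(array_intersect($result, $ids));
    }

    return $result ?? [];
}

$index = buildInvertedIndex([
    1 =&gt; &#39;Interview with the president&#39;,
    2 =&gt; &#39;President visits the factory&#39;,
    3 =&gt; &#39;Interview with a novelist&#39;,
]);

print_r(searchAll($index, [&#39;interview&#39;, &#39;president&#39;]));
</code></pre></div><p>Each search term costs one lookup in the index, and only the matching id lists are intersected. A B-tree on the whole <code>tsvector</code> value has no per-token entry to look up.</p>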
<h2 id="what-the-split-actually-looked-like">What the split actually looked like</h2>
<p>The library covered the low-level atomic functions. The custom code covered the gaps: a <code>tsvector</code> DBAL type the library didn&rsquo;t provide, convenience DQL wrappers that combined <code>@@</code> and <code>websearch_to_tsquery</code> into a single call, and the application-specific glue connecting it all to Doctrine&rsquo;s event system and API Platform. Nothing needed to drop to a native query.</p>
<p>The split is worth noting in general: <code>postgresql-for-doctrine</code> gives you the atomic PostgreSQL building blocks, but you still need to compose them into something the rest of the codebase can use without thinking about it. The <code>FunctionNode</code> pattern and the <code>convertToDatabaseValueSQL()</code> hook are the two extension points that make that composition clean. Both are worth knowing about, regardless of what library you start from.</p>
]]></content:encoded></item><item><title>Revision pruning with window functions and logarithms, when DQL wasn't enough</title><link>https://guillaumedelre.github.io/2020/09/27/revision-pruning-with-window-functions-and-logarithms-when-dql-wasnt-enough/</link><pubDate>Sun, 27 Sep 2020 00:00:00 +0000</pubDate><guid>https://guillaumedelre.github.io/2020/09/27/revision-pruning-with-window-functions-and-logarithms-when-dql-wasnt-enough/</guid><description>How a logarithmic score and ROW_NUMBER() OVER PARTITION BY solved runaway revision table growth after DQL hit its limits.</description><content:encoded><![CDATA[<p>Every content update on the platform creates a revision. That&rsquo;s by design: editors need a history they can roll back to, and the platform needs an audit trail. What nobody anticipated was the rate. Some articles go through forty saves in a single afternoon. A high-traffic piece accumulates hundreds of revisions over its lifetime. After a few months, the revision table had several million rows.</p>
<p>Deleting them naively wasn&rsquo;t an option. &ldquo;Keep the last 50&rdquo; loses all historical context for articles that haven&rsquo;t been touched in a year. &ldquo;Keep one per day&rdquo; loses all the detail for content that&rsquo;s actively being edited. What we needed was a distribution that matched how revisions are actually used: dense coverage for recent history, sparse coverage for old history.</p>
<p>That&rsquo;s a logarithmic distribution. And building it required raw SQL.</p>
<h2 id="why-simple-strategies-fail">Why simple strategies fail</h2>
<p>The appeal of a fixed window is obvious: keep the N most recent revisions and delete the rest. It&rsquo;s one line of SQL and zero math. The problem is that it treats a revision from yesterday and a revision from three years ago as equally valuable, which they aren&rsquo;t. An editor who opens an article from 2017 doesn&rsquo;t need its last 50 versions; they might need one per quarter. An article that shipped this morning might need every save from the past hour.</p>
<p>A time-based strategy (one revision per calendar day) has the opposite problem: it&rsquo;s too aggressive for active content. If an article gets 30 saves between 09:00 and 10:00, all of them except one disappear. That&rsquo;s not history, that&rsquo;s erasure.</p>
<p>Neither strategy can express &ldquo;keep more detail for recent content, less for old content.&rdquo; That relationship is logarithmic.</p>
<h2 id="the-scoring-idea">The scoring idea</h2>
<p>The algorithm assigns each revision a score based on its age, then keeps only one revision per score bucket. The score formula produces high, widely spaced values for recent revisions and small, clustered values for old ones.</p>
<p>The core expression, simplified, looks like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span>(
</span></span><span style="display:flex;"><span>  ln( <span style="color:#66d9ef">EXTRACT</span>(epoch <span style="color:#66d9ef">FROM</span> (now() <span style="color:#f92672">-</span> created_at)) )
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">/</span>
</span></span><span style="display:flex;"><span>  ( <span style="color:#66d9ef">EXTRACT</span>(epoch <span style="color:#66d9ef">FROM</span> (now() <span style="color:#f92672">-</span> created_at)) <span style="color:#f92672">/</span> <span style="color:#ae81ff">6000</span> )
</span></span><span style="display:flex;"><span>)
</span></span><span style="display:flex;"><span><span style="color:#f92672">*</span> ( <span style="color:#ae81ff">1</span> <span style="color:#f92672">/</span> (<span style="color:#66d9ef">EXTRACT</span>(epoch <span style="color:#66d9ef">FROM</span> (now() <span style="color:#f92672">-</span> created_at)) <span style="color:#f92672">/</span> <span style="color:#ae81ff">60</span> <span style="color:#f92672">/</span> <span style="color:#ae81ff">1440</span>) )
</span></span><span style="display:flex;"><span><span style="color:#f92672">*</span> <span style="color:#ae81ff">1000</span>
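</span></span></code></pre></div><p>Let <code>s</code> be the age in seconds. The first factor is <code>6000 * ln(s) / s</code> and the second is <code>86400 / s</code> (the inverse of the age in days), so the whole expression reduces to <code>ln(s) / s^2 * C</code> with <code>C = 6000 * 86400 * 1000</code>. The squared <code>s</code> in the denominator grows far faster than the logarithm in the numerator, so the score drops rapidly as <code>s</code> grows.</p>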
<p>Cast to an integer, the effect is this: with these constants, a revision saved 10 minutes ago scores about 9.2 million and one saved 11 minutes ago about 7.7 million. They&rsquo;re in different buckets. A revision from six months ago rounds to 0, and so does one from eight months ago. Same bucket. The window function then picks the most recent revision from each bucket and discards the rest.</p>
<p>The result is automatic: recent saves are all kept because each has a distinct score; old saves are thinned because many share the same score.</p>
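<p>As a sanity check, the expression can be evaluated outside the database. The sketch below ports it to plain PHP with the same constants; <code>revisionScore()</code> is a hypothetical helper name, and <code>round()</code> stands in for PostgreSQL&rsquo;s rounding integer cast.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">&lt;?php

declare(strict_types=1);

// Hypothetical helper: a PHP port of the SQL score expression above.
function revisionScore(float $ageInSeconds): int
{
    $logOverScaledAge = log($ageInSeconds) / ($ageInSeconds / 6000);
    $inverseAgeInDays = 1 / ($ageInSeconds / 60 / 1440);

    // round() imitates the ::numeric::integer cast, which rounds to nearest.
    return (int) round($logOverScaledAge * $inverseAgeInDays * 1000);
}

$tenMinutes = revisionScore(10 * 60);
$elevenMinutes = revisionScore(11 * 60);
$sixMonths = revisionScore(182 * 86400);
$eightMonths = revisionScore(243 * 86400);

// Two saves a minute apart get distinct scores (separate buckets).
var_dump($tenMinutes !== $elevenMinutes);
// Two saves months apart get the same score (one shared bucket).
var_dump($sixMonths === $eightMonths);
</code></pre></div><p>One caveat this port does not reproduce: for ages under roughly half a minute the value exceeds the range of PostgreSQL&rsquo;s 32-bit <code>integer</code>, and the logarithm is undefined at an age of zero, so the SQL version would likely raise an error for revisions that young.</p>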
<h2 id="the-dql-attempt-that-didnt-ship">The DQL attempt that didn&rsquo;t ship</h2>
<p>Window functions aren&rsquo;t part of DQL. Doctrine&rsquo;s query language has no syntax for <code>OVER</code>, <code>PARTITION BY</code>, or <code>ROW_NUMBER()</code>. Before going to raw SQL, the team tried to add them.</p>
<p>The <code>FunctionNode</code> approach works for single SQL functions, as we&rsquo;d already seen with FTS. A <code>RowNumber</code> node emitting <code>ROW_NUMBER()</code> is trivial:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">RowNumber</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">FunctionNode</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getSql</span>(<span style="color:#a6e22e">SqlWalker</span> $sqlWalker)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#e6db74">&#39;ROW_NUMBER()&#39;</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The harder part is <code>OVER(PARTITION BY ... ORDER BY ...)</code>. An <code>Over</code> function node was drafted, with a custom <code>PartitionByClause</code> AST node to handle the <code>PARTITION BY</code> clause:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Over</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">FunctionNode</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#f92672">?</span><span style="color:#a6e22e">PartitionByClause</span> $partitionByClause <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#f92672">?</span><span style="color:#a6e22e">OrderByClause</span> $orderByClause <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getSql</span>(<span style="color:#a6e22e">SqlWalker</span> $sqlWalker)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#e6db74">&#39;OVER(&#39;</span>
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">.</span>($this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">partitionByClause</span>
</span></span><span style="display:flex;"><span>                <span style="color:#f92672">?</span> $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">partitionByClause</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">dispatch</span>($sqlWalker)
</span></span><span style="display:flex;"><span>                <span style="color:#f92672">:</span> ($this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">orderByClause</span>
</span></span><span style="display:flex;"><span>                    <span style="color:#f92672">?</span> $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">orderByClause</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">dispatch</span>($sqlWalker)
</span></span><span style="display:flex;"><span>                    <span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;&#39;</span>))
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">.</span><span style="color:#e6db74">&#39;)&#39;</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>It was never finished. The classes shipped marked <code>@deprecated</code> and &ldquo;NOT TESTED YET&rdquo;. The issue is composability: DQL&rsquo;s <code>FunctionNode</code> works cleanly for functions that appear in WHERE clauses or SELECT expressions. A window function like <code>ROW_NUMBER() OVER (PARTITION BY ...)</code> is a different structure: it appears in a SELECT position, modifies the surrounding query semantics, and requires the parser to handle <code>PARTITION BY</code> as an extension to DQL&rsquo;s grammar. Making that robust enough to trust in production is a significant investment. Going to DBAL and writing the SQL directly took an afternoon.</p>
<h2 id="the-query-layer-by-layer">The query, layer by layer</h2>
<p>The final implementation is three nested queries:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-sql" data-lang="sql"><span style="display:flex;"><span><span style="color:#66d9ef">DELETE</span> <span style="color:#66d9ef">FROM</span> revision
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">WHERE</span> iri <span style="color:#f92672">=</span> <span style="color:#f92672">?</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">AND</span> id <span style="color:#66d9ef">NOT</span> <span style="color:#66d9ef">IN</span> (
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">SELECT</span> id <span style="color:#66d9ef">FROM</span> (
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">SELECT</span>
</span></span><span style="display:flex;"><span>            row_number() OVER (
</span></span><span style="display:flex;"><span>                PARTITION <span style="color:#66d9ef">BY</span> num, iri
</span></span><span style="display:flex;"><span>                <span style="color:#66d9ef">ORDER</span> <span style="color:#66d9ef">BY</span> num <span style="color:#66d9ef">DESC</span>, created_at <span style="color:#66d9ef">DESC</span>
</span></span><span style="display:flex;"><span>            ) <span style="color:#66d9ef">AS</span> lines,
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">*</span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">FROM</span> (
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">SELECT</span>
</span></span><span style="display:flex;"><span>                (
</span></span><span style="display:flex;"><span>                    ( ln( <span style="color:#66d9ef">EXTRACT</span>(epoch <span style="color:#66d9ef">FROM</span> (now() <span style="color:#f92672">-</span> created_at)) )
</span></span><span style="display:flex;"><span>                      <span style="color:#f92672">/</span> ( <span style="color:#66d9ef">EXTRACT</span>(epoch <span style="color:#66d9ef">FROM</span> (now() <span style="color:#f92672">-</span> created_at)) <span style="color:#f92672">/</span> <span style="color:#ae81ff">6000</span> ) )
</span></span><span style="display:flex;"><span>                    <span style="color:#f92672">*</span> ( <span style="color:#ae81ff">1</span> <span style="color:#f92672">/</span> (<span style="color:#66d9ef">EXTRACT</span>(epoch <span style="color:#66d9ef">FROM</span> (now() <span style="color:#f92672">-</span> created_at)) <span style="color:#f92672">/</span> <span style="color:#ae81ff">60</span> <span style="color:#f92672">/</span> <span style="color:#ae81ff">1440</span>) )
</span></span><span style="display:flex;"><span>                    <span style="color:#f92672">*</span> <span style="color:#ae81ff">1000</span>
</span></span><span style="display:flex;"><span>                )::numeric::integer <span style="color:#66d9ef">AS</span> num,
</span></span><span style="display:flex;"><span>                <span style="color:#f92672">*</span>
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">FROM</span> revision
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">WHERE</span> iri <span style="color:#f92672">=</span> <span style="color:#f92672">?</span>
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">ORDER</span> <span style="color:#66d9ef">BY</span> created_at <span style="color:#66d9ef">DESC</span>
</span></span><span style="display:flex;"><span>        ) <span style="color:#66d9ef">AS</span> lst
</span></span><span style="display:flex;"><span>    ) <span style="color:#66d9ef">AS</span> rst
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">WHERE</span> lines <span style="color:#f92672">=</span> <span style="color:#ae81ff">1</span>
</span></span><span style="display:flex;"><span>);
</span></span></code></pre></div><p><strong>Inner query:</strong> computes <code>num</code>, the integer score, for every revision of the given IRI. Rows are sorted by <code>created_at DESC</code> at this stage.</p>
<p><strong>Middle query:</strong> runs <code>ROW_NUMBER() OVER (PARTITION BY num, iri ORDER BY num DESC, created_at DESC)</code>. Within each score bucket (<code>num</code>), revisions are numbered starting from 1, newest first; the leading <code>num DESC</code> has no effect because <code>num</code> is constant within a partition. The most recent revision in each bucket gets <code>lines = 1</code>.</p>
<p><strong>Outer filter:</strong> keeps only the <code>lines = 1</code> rows, one revision per score bucket.</p>
<p><strong>DELETE:</strong> removes every revision for this IRI that isn&rsquo;t in the kept set. The IRI is bound twice, once for each <code>?</code> placeholder.</p>
<p>The <code>PARTITION BY num, iri</code> is redundant on the IRI (the whole query is already filtered to one IRI), but makes the intent explicit and keeps the logic correct if the query is ever reused in a broader context.</p>
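<p>The same layering can be imitated in memory for a single IRI, which makes the behaviour easy to check without a database. This is only a sketch: <code>keptRevisionIds()</code> and the id-to-age input shape are invented for illustration, and ages are passed in directly instead of being derived from <code>created_at</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">&lt;?php

declare(strict_types=1);

/**
 * In-memory imitation of the nested query, for one IRI.
 *
 * @param array&lt;int, int&gt; $ageById revision id =&gt; age in seconds (hypothetical shape)
 * @return list&lt;int&gt; ids of the revisions that survive the DELETE
 */
function keptRevisionIds(array $ageById): array
{
    $keptId = [];  // score bucket =&gt; id of the youngest revision seen so far
    $keptAge = []; // score bucket =&gt; age of that revision

    foreach ($ageById as $id =&gt; $age) {
        // Inner query: the integer score &#34;num&#34; computed from the age.
        $num = (int) round(log($age) / ($age / 6000) * (1 / ($age / 60 / 1440)) * 1000);

        // Middle query plus &#34;lines = 1&#34;: keep the youngest revision per bucket.
        if (!isset($keptAge[$num]) || $age &lt; $keptAge[$num]) {
            $keptId[$num] = $id;
            $keptAge[$num] = $age;
        }
    }

    return array_values($keptId);
}

$day = 86400;
$kept = keptRevisionIds([
    1 =&gt; 10 * 60,    // 10 minutes old
    2 =&gt; 11 * 60,    // 11 minutes old
    3 =&gt; 200 * $day, // the three old revisions share one bucket
    4 =&gt; 250 * $day,
    5 =&gt; 300 * $day,
]);
sort($kept);

print_r($kept);
</code></pre></div><p>Both recent revisions survive because their scores differ, while only the youngest of the three old ones is kept; the <code>DELETE</code> would remove the other two.</p>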
<p>The pruning method is fed by a companion query that identifies which IRIs have accumulated more revisions than a given threshold:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getIrisWithMoreRevisionThan</span>(<span style="color:#a6e22e">int</span> $maxRevisionsCount, <span style="color:#a6e22e">int</span> $limit <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>, <span style="color:#f92672">?</span><span style="color:#a6e22e">int</span> $retencyDay <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>)<span style="color:#f92672">:</span> <span style="color:#66d9ef">array</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    $queryBuilder <span style="color:#f92672">=</span> $this
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">createQueryBuilder</span>(<span style="color:#e6db74">&#39;revision&#39;</span>)
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">select</span>(<span style="color:#e6db74">&#39;revision.iri&#39;</span>)
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">groupBy</span>(<span style="color:#e6db74">&#39;revision.iri&#39;</span>)
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">having</span>(<span style="color:#e6db74">&#39;COUNT(1) &gt; :maxRevisions&#39;</span>)
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">orderBy</span>(<span style="color:#e6db74">&#39;COUNT(1)&#39;</span>, <span style="color:#a6e22e">Order</span><span style="color:#f92672">::</span><span style="color:#a6e22e">Descending</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">value</span>)
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">setParameter</span>(<span style="color:#e6db74">&#39;maxRevisions&#39;</span>, $maxRevisionsCount);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">array_column</span>($queryBuilder<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getQuery</span>()<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getResult</span>(), <span style="color:#e6db74">&#39;iri&#39;</span>);
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The two methods run together in a scheduled cleanup: find the IRIs over the threshold, prune each one.</p>
<h2 id="wiring-it-to-a-scheduled-command">Wiring it to a scheduled command</h2>
<p>The pruning query doesn&rsquo;t run in a request. It runs behind a Symfony command, called on a schedule.</p>
<p>The command takes a few options to control how aggressively it runs:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#75715e">#[AsCommand(&#39;app:purge:revision&#39;, &#39;Remove useless revisions&#39;)]
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">final</span> <span style="color:#66d9ef">class</span> <span style="color:#a6e22e">PurgeRevisionCommand</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">Command</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">configure</span>()<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        $this
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">addOption</span>(<span style="color:#e6db74">&#39;max-revisions&#39;</span>, <span style="color:#e6db74">&#39;m&#39;</span>, <span style="color:#a6e22e">InputOption</span><span style="color:#f92672">::</span><span style="color:#a6e22e">VALUE_REQUIRED</span>,
</span></span><span style="display:flex;"><span>                <span style="color:#e6db74">&#39;Revision threshold above which an IRI gets pruned&#39;</span>, <span style="color:#ae81ff">30</span>)
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">addOption</span>(<span style="color:#e6db74">&#39;limit&#39;</span>, <span style="color:#e6db74">&#39;l&#39;</span>, <span style="color:#a6e22e">InputOption</span><span style="color:#f92672">::</span><span style="color:#a6e22e">VALUE_REQUIRED</span>,
</span></span><span style="display:flex;"><span>                <span style="color:#e6db74">&#39;Max number of IRIs to process per run&#39;</span>)
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">addOption</span>(<span style="color:#e6db74">&#39;delay&#39;</span>, <span style="color:#e6db74">&#39;w&#39;</span>, <span style="color:#a6e22e">InputOption</span><span style="color:#f92672">::</span><span style="color:#a6e22e">VALUE_REQUIRED</span>,
</span></span><span style="display:flex;"><span>                <span style="color:#e6db74">&#39;Delay in seconds between each IRI&#39;</span>)
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">addOption</span>(<span style="color:#e6db74">&#39;retencyDay&#39;</span>, <span style="color:#e6db74">&#39;r&#39;</span>, <span style="color:#a6e22e">InputOption</span><span style="color:#f92672">::</span><span style="color:#a6e22e">VALUE_OPTIONAL</span>,
</span></span><span style="display:flex;"><span>                <span style="color:#e6db74">&#39;Only process IRIs whose last revision is older than N days&#39;</span>);
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">execute</span>(<span style="color:#a6e22e">InputInterface</span> $input, <span style="color:#a6e22e">OutputInterface</span> $output)<span style="color:#f92672">:</span> <span style="color:#a6e22e">int</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        $iris <span style="color:#f92672">=</span> $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">revisionRepository</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getIrisWithMoreRevisionThan</span>(
</span></span><span style="display:flex;"><span>            (<span style="color:#a6e22e">int</span>) $input<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getOption</span>(<span style="color:#e6db74">&#39;max-revisions&#39;</span>),
</span></span><span style="display:flex;"><span>            (<span style="color:#a6e22e">int</span>) $input<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getOption</span>(<span style="color:#e6db74">&#39;limit&#39;</span>),
</span></span><span style="display:flex;"><span>            (<span style="color:#a6e22e">int</span>) $input<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getOption</span>(<span style="color:#e6db74">&#39;retencyDay&#39;</span>),
</span></span><span style="display:flex;"><span>        );
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        $totalDeleted <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">foreach</span> ($iris <span style="color:#66d9ef">as</span> $iri) {
</span></span><span style="display:flex;"><span>            $totalDeleted <span style="color:#f92672">+=</span> $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">revisionRepository</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">deleteOldRevisionForIri</span>($iri);
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">usleep</span>((<span style="color:#a6e22e">int</span>) $input<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getOption</span>(<span style="color:#e6db74">&#39;delay&#39;</span>) <span style="color:#f92672">*</span> <span style="color:#ae81ff">1_000_000</span>);
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">Command</span><span style="color:#f92672">::</span><span style="color:#a6e22e">SUCCESS</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The <code>--delay</code> option is worth noting: on a busy database, hammering a hundred <code>DELETE</code> statements back-to-back can cause lock contention. A small sleep between iterations keeps the purge from competing with production traffic.</p>
<p>The command runs behind two crontab entries with different thresholds:</p>
<pre tabindex="0"><code># Hourly: prune IRIs holding more than 30 revisions, up to 100 IRIs per run
0 * * * * php bin/console app:purge:revision --max-revisions 30 --limit 100

# Nightly: for content untouched for a year, lower the threshold to 3
0 0 * * * php bin/console app:purge:revision --max-revisions 3 --limit 100 --retencyDay 365
</code></pre><p>The two-level strategy matters. The hourly job prunes any IRI that has accumulated more than 30 revisions, a reasonable ceiling for actively edited content. The nightly job targets only IRIs not updated in over a year and lowers the threshold to 3. An article that hasn&rsquo;t moved in twelve months doesn&rsquo;t need thirty versions in its history.</p>
<h2 id="what-it-looks-like-in-practice">What it looks like in practice</h2>
<p>An article saved 200 times will typically keep 20 to 30 revisions after pruning: most of the recent saves, a handful from last month, one or two from each quarter of the previous year. The exact count depends on the age distribution of the saves, not on an arbitrary cap.</p>
<p>An article last edited two years ago might end up with 5 or 6 revisions. Recent edits are all there; the old history is compressed but not gone.</p>
<p>It&rsquo;s not a perfect history. It&rsquo;s a useful one.</p>
<h2 id="the-line-between-dql-and-raw-sql">The line between DQL and raw SQL</h2>
<p>The window function attempt isn&rsquo;t a failure worth hiding. It&rsquo;s a useful data point: <code>FunctionNode</code> works well for scalar functions in WHERE and SELECT positions, but composing a full <code>ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...)</code> expression in DQL is harder than it looks. The grammar extension, the AST nodes, the SQL walker integration: it&rsquo;s a non-trivial amount of code for something that native SQL handles in three lines.</p>
<p>The practical boundary is roughly this: if a PostgreSQL feature maps to a function call with fixed arity, custom DQL works. If it requires new clause syntax (window frames, CTEs, lateral joins), native DBAL is usually the better trade-off.</p>
]]></content:encoded></item><item><title>Enforcing UTC in Doctrine without touching your entities</title><link>https://guillaumedelre.github.io/2017/02/19/enforcing-utc-in-doctrine-without-touching-your-entities/</link><pubDate>Sun, 19 Feb 2017 00:00:00 +0000</pubDate><guid>https://guillaumedelre.github.io/2017/02/19/enforcing-utc-in-doctrine-without-touching-your-entities/</guid><description>How to override Doctrine&amp;#39;s built-in types to enforce UTC everywhere, without touching a single entity.</description><content:encoded><![CDATA[<p>A timestamp coming back from the database one hour off. Not every time. Only when the dev server runs in <code>Europe/Paris</code> and CI runs in UTC. The kind of bug that disappears when you look for it and comes back in production on a Friday evening.</p>
<p>The problem isn&rsquo;t in the business logic. It&rsquo;s in what Doctrine quietly does with dates.</p>
<h2 id="what-doctrine-does-by-default">What Doctrine does by default</h2>
<p>When you declare a <code>datetime</code> field in a Doctrine entity, the conversion between PHP and the database goes through <code>DateTimeType</code>. That class calls <code>format()</code> on your <code>DateTime</code> object to write to the database, and <code>DateTime::createFromFormat()</code> to read it back. No mention of timezone anywhere.</p>
<p>If your PHP object is in <code>Europe/Paris</code>, Doctrine formats <code>2017-01-15 11:30:00</code> and writes it as-is. If the server reading that field is in UTC, it gets <code>2017-01-15 11:30:00</code> and interprets it as UTC. One hour has evaporated in the round trip, without a single error message.</p>
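<p>The drift is easy to reproduce outside Doctrine. A minimal sketch in plain PHP, assuming the common <code>Y-m-d H:i:s</code> column format; the variable names are illustrative and not part of any Doctrine API:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">// Writer side: the application works with Europe/Paris dates.
$written = new \DateTime('2017-01-15 11:30:00', new \DateTimeZone('Europe/Paris'));

// What a timezone-unaware type stores: the wall-clock string, with no offset.
$stored = $written-&gt;format('Y-m-d H:i:s');

// Reader side: another server parses the very same string as UTC.
$read = \DateTime::createFromFormat('Y-m-d H:i:s', $stored, new \DateTimeZone('UTC'));

// Same string, different instant: the two timestamps are 3600 seconds apart.
$driftInSeconds = $read-&gt;getTimestamp() - $written-&gt;getTimestamp();
</code></pre></div><p>Nothing throws and nothing is logged. The stored string is identical on both sides; only the instant it represents has moved.</p>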
<p><a href="https://www.doctrine-project.org/projects/doctrine-orm/en/latest/cookbook/working-with-datetime.html" target="_blank" rel="noopener noreferrer">The Doctrine docs cover this</a>, suggesting custom types as the fix. What they mention in passing is that you can give those custom types the same name as the built-in ones. That detail changes everything.</p>
<h2 id="replace-dont-add">Replace, don&rsquo;t add</h2>
<p>Most custom Doctrine type examples introduce a new name: <code>utc_datetime</code>, <code>app_date</code>, and so on. You then annotate every field with <code>type: 'utc_datetime'</code> in the entities. It works, but it&rsquo;s tedious and doesn&rsquo;t protect against a forgotten <code>type: 'datetime'</code>.</p>
<p>The other option: register the custom type under the name <code>datetime</code>. Doctrine replaces its own type with yours, everywhere, no exceptions. Every <code>datetime</code> field across all entities goes through your logic, without changing a single annotation.</p>
<p>That&rsquo;s what we just shipped across our PHP microservices platform. Here&rsquo;s what it looks like.</p>
<h2 id="the-shared-trait">The shared trait</h2>
<p>Both types (<code>date</code> and <code>datetime</code>) share the same conversion logic through a trait:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">trait</span> <span style="color:#a6e22e">UTCDate</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">private</span> <span style="color:#a6e22e">\DateTimeZone</span> $utc;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">convertToPHPValue</span>($value, <span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#f92672">?</span><span style="color:#a6e22e">\DateTime</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#66d9ef">null</span> <span style="color:#f92672">===</span> $value <span style="color:#f92672">||</span> $value <span style="color:#a6e22e">instanceof</span> <span style="color:#a6e22e">\DateTime</span>) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span> $value;
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        $format <span style="color:#f92672">=</span> $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getFormat</span>($platform);
</span></span><span style="display:flex;"><span>        $converted <span style="color:#f92672">=</span> <span style="color:#a6e22e">\DateTime</span><span style="color:#f92672">::</span><span style="color:#a6e22e">createFromFormat</span>($format, $value, $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getUtc</span>());
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#f92672">!</span>$converted) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">throw</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">\RuntimeException</span>(
</span></span><span style="display:flex;"><span>                <span style="color:#a6e22e">sprintf</span>(<span style="color:#e6db74">&#39;Could not convert database value &#34;%s&#34; to DateTime using format &#34;%s&#34;.&#39;</span>, $value, $format)
</span></span><span style="display:flex;"><span>            );
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">postConvert</span>($converted);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> $converted;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">abstract</span> <span style="color:#66d9ef">protected</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getFormat</span>(<span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">abstract</span> <span style="color:#66d9ef">protected</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">postConvert</span>(<span style="color:#a6e22e">\DateTime</span> $converted)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">private</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getUtc</span>()<span style="color:#f92672">:</span> <span style="color:#a6e22e">\DateTimeZone</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#66d9ef">empty</span>($this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">utc</span>)) {
</span></span><span style="display:flex;"><span>            $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">utc</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> <span style="color:#a6e22e">\DateTimeZone</span>(<span style="color:#e6db74">&#39;UTC&#39;</span>);
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">utc</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The key: <code>\DateTime::createFromFormat()</code> receives an explicit UTC timezone. The raw value from the database is interpreted as UTC, regardless of what the PHP server&rsquo;s timezone is set to.</p>
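<p>To see the effect of that third argument in isolation, here is a small sketch that simulates a dev machine set to <code>Europe/Paris</code>. The <code>Y-m-d H:i:s</code> format is an assumption standing in for whatever the platform returns:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">// Simulate a server whose default timezone is not UTC.
date_default_timezone_set('Europe/Paris');

// The raw value as it comes back from the database.
$raw = '2017-01-15 11:30:00';

// Without a timezone argument, PHP applies the server default.
$implicit = \DateTime::createFromFormat('Y-m-d H:i:s', $raw);

// With an explicit UTC timezone, the server setting no longer matters.
$explicit = \DateTime::createFromFormat('Y-m-d H:i:s', $raw, new \DateTimeZone('UTC'));

$implicitZone = $implicit-&gt;getTimezone()-&gt;getName(); // Europe/Paris
$explicitZone = $explicit-&gt;getTimezone()-&gt;getName(); // UTC
</code></pre></div>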
<h2 id="utcdatetimetype">UTCDateTimeType</h2>
<p>For <code>datetime</code> fields, the write path also needs to enforce UTC:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">UTCDateTimeType</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">DateTimeType</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">use</span> <span style="color:#a6e22e">UTCDate</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">#[\Override]
</span></span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">convertToPHPValue</span>($value, <span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#f92672">?</span><span style="color:#a6e22e">\DateTime</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#66d9ef">null</span> <span style="color:#f92672">===</span> $value <span style="color:#f92672">||</span> $value <span style="color:#a6e22e">instanceof</span> <span style="color:#a6e22e">\DateTimeInterface</span>) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">parent</span><span style="color:#f92672">::</span><span style="color:#a6e22e">convertToPHPValue</span>($value, $platform);
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">parent</span><span style="color:#f92672">::</span><span style="color:#a6e22e">convertToPHPValue</span>(<span style="color:#e6db74">&#34;</span><span style="color:#e6db74">$value</span><span style="color:#e6db74">+0000&#34;</span>, $platform);
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">#[\Override]
</span></span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">convertToDatabaseValue</span>($value, <span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#f92672">?</span><span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> ($value <span style="color:#a6e22e">instanceof</span> <span style="color:#a6e22e">\DateTime</span>) {
</span></span><span style="display:flex;"><span>            $value<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">setTimezone</span>($this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getUtc</span>());
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">parent</span><span style="color:#f92672">::</span><span style="color:#a6e22e">convertToDatabaseValue</span>($value, $platform);
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">#[\Override]
</span></span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getFormat</span>(<span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> $platform<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getDateTimeFormatString</span>();
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">postConvert</span>(<span style="color:#a6e22e">\DateTime</span> $converted)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span> {}
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>On read (<code>convertToPHPValue</code>), if the value is a raw string, we append <code>+0000</code> before delegating to the parent. The parent then uses that timezone suffix to create the PHP object correctly.</p>
<p>On write (<code>convertToDatabaseValue</code>), we force the <code>DateTime</code> to UTC before formatting. What goes into the database is always UTC.</p>
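<p>The write-side normalization can be sketched in plain PHP as follows. One deliberate difference from the type above: this version clones the value first, so the caller&rsquo;s object keeps its own timezone, whereas <code>setTimezone()</code> on the original <code>DateTime</code> converts it in place. The format string is again an assumption:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">$utc = new \DateTimeZone('UTC');
$value = new \DateTime('2017-01-15 11:30:00', new \DateTimeZone('Europe/Paris'));

// Clone before converting: a choice made for this sketch, not what the type does.
$normalized = (clone $value)-&gt;setTimezone($utc);

// The same instant, expressed as a UTC wall-clock string.
$stored = $normalized-&gt;format('Y-m-d H:i:s'); // 2017-01-15 10:30:00
</code></pre></div><p>Whether to clone is a trade-off: converting in place avoids an allocation per write, but it also means the entity&rsquo;s own <code>DateTime</code> switches to UTC the first time it is persisted.</p>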
<h2 id="utcdatetype">UTCDateType</h2>
<p>For <code>date</code> columns (no time component), same approach with one extra step:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">UTCDateType</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">DateType</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">use</span> <span style="color:#a6e22e">UTCDate</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">#[\Override]
</span></span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getFormat</span>(<span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> $platform<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getDateFormatString</span>();
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">postConvert</span>(<span style="color:#a6e22e">\DateTime</span> $converted)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        $converted<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">setTime</span>(<span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">0</span>, <span style="color:#ae81ff">0</span>);
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The <code>postConvert()</code> method resets the time to <code>00:00:00</code> after parsing. Without it, <code>createFromFormat()</code> fills the time fields that a date-only format leaves unparsed with the current time of day, so the same <code>date</code> value would come back with a different time on every read, which breaks comparisons and ordering.</p>
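<p>A short sketch of that reset in plain PHP, assuming the date format is <code>Y-m-d</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">$utc = new \DateTimeZone('UTC');

// 'Y-m-d' carries no time fields, so PHP fills them with the current time of day.
$parsed = \DateTime::createFromFormat('Y-m-d', '2017-01-15', $utc);

// Resetting the time makes every read of the same column value identical.
$parsed-&gt;setTime(0, 0, 0);

$normalized = $parsed-&gt;format('Y-m-d H:i:s'); // 2017-01-15 00:00:00
</code></pre></div>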
<h2 id="registering-in-symfony">Registering in Symfony</h2>
<p>The decisive part: declaring the types under their built-in names in <code>config/packages/doctrine.yaml</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">doctrine</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">dbal</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">types</span>:
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">date</span>:
</span></span><span style="display:flex;"><span>                <span style="color:#f92672">class</span>: <span style="color:#ae81ff">App\Doctrine\DBAL\Types\UTCDateType</span>
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">datetime</span>:
</span></span><span style="display:flex;"><span>                <span style="color:#f92672">class</span>: <span style="color:#ae81ff">App\Doctrine\DBAL\Types\UTCDateTimeType</span>
</span></span></code></pre></div><p>That&rsquo;s it. Doctrine swaps out its own implementations for yours. Existing entities don&rsquo;t change, migrations don&rsquo;t move, annotations stay <code>type: Types::DATETIME_MUTABLE</code>. The behavior changes globally, without friction.</p>
<h2 id="12-microservices-89-columns-one-config-block">12 microservices, 89 columns, one config block</h2>
<p>These two types are now running across 12 independent microservices, each with its own Doctrine config, covering 89 production columns. CI servers run in UTC, dev machines in <code>Europe/Paris</code>, data travels between them without shifting. It&rsquo;s not spectacular. It&rsquo;s just reliable.</p>
<p>The real lesson isn&rsquo;t technical: an unresolved timezone issue is a data integrity issue. Offsets accumulate silently, comparisons go wrong, exports become inaccurate. Two type registrations, two small classes, and a trait can prevent that permanently.</p>
]]></content:encoded></item></channel></rss>