<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Recherche-Plein-Texte on Guillaume Delré</title><link>https://guillaumedelre.github.io/fr/tags/recherche-plein-texte/</link><description>Recent content in Recherche-Plein-Texte on Guillaume Delré</description><generator>Hugo</generator><language>fr-FR</language><lastBuildDate>Mon, 10 Feb 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://guillaumedelre.github.io/fr/tags/recherche-plein-texte/index.xml" rel="self" type="application/rss+xml"/><item><title>PostgreSQL full-text search with Doctrine, without a single line of raw SQL</title><link>https://guillaumedelre.github.io/fr/2025/02/10/la-recherche-full-text-postgresql-avec-doctrine-sans-une-ligne-de-sql-brut/</link><pubDate>Mon, 10 Feb 2025 00:00:00 +0000</pubDate><guid>https://guillaumedelre.github.io/fr/2025/02/10/la-recherche-full-text-postgresql-avec-doctrine-sans-une-ligne-de-sql-brut/</guid><description>How we layered custom DBAL types and DQL wrappers on top of postgresql-for-doctrine to bring PostgreSQL full-text search into a Symfony API Platform project.</description><content:encoded><![CDATA[<p>The media library&rsquo;s search field returned results in 800 milliseconds on staging. Production had forty times more rows. The execution plan showed a sequential scan: no index was used, and a classic B-tree could not help, because it cannot serve a <code>LIKE</code> pattern that starts with a wildcard. The product team also wanted multi-word search: type &ldquo;interview président&rdquo; and get results containing both terms. A <code>LIKE</code> query with wildcards has no clean way to express that short of several independent conditions, each requiring its own scan.</p>
<p>PostgreSQL has shipped full-text search for more than fifteen years. The platform already ran on PostgreSQL. The catch: the project uses Doctrine ORM, and Doctrine has no native notion of a <code>tsvector</code>.</p>
<p>A community library, <a href="https://github.com/martin-georgiev/postgresql-for-doctrine" target="_blank" rel="noopener noreferrer">postgresql-for-doctrine</a>, fills part of that gap. It registers basic DQL functions such as <code>TO_TSQUERY</code> and <code>TO_TSVECTOR</code>, plus the <code>@@</code> match operator, as separate atomic pieces. The foundation was there. Three things remained to be built on top.</p>
<h2 id="le-type-que-doctrine-na-jamais-vu">The type Doctrine has never seen</h2>
<p><a href="https://www.postgresql.org/docs/current/datatype-textsearch.html" target="_blank" rel="noopener noreferrer">PostgreSQL full-text search</a> relies on two types: <code>tsvector</code> (a preprocessed list of normalized tokens) and <code>tsquery</code> (a search expression). You maintain a <code>tsvector</code> column, index it with GIN, and query it with the <code>@@</code> operator.</p>
<p>Doctrine DBAL ships no <code>tsvector</code> type. Declaring <code>#[ORM\Column(type: 'tsvector')]</code> without registering it first throws an <code>UnknownColumnTypeException</code>. The fix is a custom DBAL type:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">TsVector</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">Type</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">final</span> <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">string</span> <span style="color:#a6e22e">DBAL_TYPE</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">&#39;tsvector&#39;</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getSQLDeclaration</span>(<span style="color:#66d9ef">array</span> $column, <span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">self</span><span style="color:#f92672">::</span><span style="color:#a6e22e">DBAL_TYPE</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getName</span>()<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">self</span><span style="color:#f92672">::</span><span style="color:#a6e22e">DBAL_TYPE</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">convertToDatabaseValueSQL</span>(<span style="color:#a6e22e">string</span> $sqlExpr, <span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">sprintf</span>(<span style="color:#e6db74">&#34;to_tsvector(&#39;simple&#39;, %s)&#34;</span>, $sqlExpr);
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">convertToDatabaseValue</span>(<span style="color:#a6e22e">mixed</span> $value, <span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#a6e22e">mixed</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#a6e22e">is_array</span>($value) <span style="color:#f92672">&amp;&amp;</span> <span style="color:#a6e22e">isset</span>($value[<span style="color:#e6db74">&#39;data&#39;</span>])) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span> $value[<span style="color:#e6db74">&#39;data&#39;</span>];
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">is_string</span>($value) <span style="color:#f92672">?</span> $value <span style="color:#f92672">:</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getMappedDatabaseTypes</span>(<span style="color:#a6e22e">AbstractPlatform</span> $platform)<span style="color:#f92672">:</span> <span style="color:#66d9ef">array</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> [<span style="color:#a6e22e">self</span><span style="color:#f92672">::</span><span style="color:#a6e22e">DBAL_TYPE</span>];
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The interesting method is <code>convertToDatabaseValueSQL()</code>. Doctrine calls it to wrap the SQL placeholder before the value reaches the database. The written value automatically becomes <code>to_tsvector('simple', ?)</code> at the DBAL boundary, with no extra step on the caller&rsquo;s side.</p>
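<p>To make that division of labor concrete, here is a minimal standalone sketch that mirrors the two hooks outside Doctrine. The function names and the <code>media</code> table are assumptions made for the example; only the <code>to_tsvector('simple', ...)</code> expression and the value rule come from the type above.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">// Hypothetical stand-ins for the two hooks of the TsVector type;
// no Doctrine class is involved, this only mirrors their logic.
function tsvectorPlaceholderSql(string $sqlExpr): string
{
    // Same expression as convertToDatabaseValueSQL(): wrap the placeholder.
    return sprintf("to_tsvector('simple', %s)", $sqlExpr);
}

function tsvectorBoundValue(mixed $value): mixed
{
    // Same rule as convertToDatabaseValue(): accept ['data' =&gt; ...] or a plain string.
    if (is_array($value) &amp;&amp; isset($value['data'])) {
        return $value['data'];
    }

    return is_string($value) ? $value : null;
}

// Roughly what an INSERT for the column ends up looking like
// (table and column names are assumed for the example).
$sql = sprintf('INSERT INTO media (text_search) VALUES (%s)', tsvectorPlaceholderSql('?'));
$params = [tsvectorBoundValue('Interview du président')];

echo $sql, PHP_EOL;
</code></pre></div><p>The PHP string travels as an ordinary bound parameter; only the SQL text around the placeholder changes, which is why the entity never has to know about <code>to_tsvector</code>.</p>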
<p>Register the type in <code>doctrine.yaml</code>, then map the column on the entity:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">doctrine</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">dbal</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">types</span>:
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">tsvector</span>: <span style="color:#ae81ff">App\Doctrine\DBAL\Types\TsVector</span>
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#75715e">#[ORM\Column(type: &#39;tsvector&#39;, nullable: true)]
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">protected</span> <span style="color:#f92672">?</span><span style="color:#a6e22e">string</span> $textSearch <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span></code></pre></div><p>On the PHP side, the value is a plain string. The conversion to an actual <code>tsvector</code> happens invisibly at the DBAL level.</p>
<p>We used the <code>'simple'</code> text search configuration, which splits on whitespace and punctuation and lowercases tokens, without any language-specific stemming. The platform handles several languages, and French stemming rules would have broken Spanish. Exact-token matching was good enough for this use case.</p>
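<p>The practical consequence of skipping stemming is easy to miss, so here is a rough PHP approximation of what <code>'simple'</code> does to a string. The function name is hypothetical and the tokenizer is deliberately naive: PostgreSQL&rsquo;s real parser recognizes more token types (emails, URLs, numbers and so on).</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">// Approximation only: lowercase, split on anything that is not a letter
// or a digit, drop duplicates, sort. No stemming, no stop-word removal.
function approximateSimpleLexemes(string $text): array
{
    $tokens = preg_split('/[^\p{L}\p{N}]+/u', mb_strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $lexemes = array_values(array_unique($tokens));
    sort($lexemes, SORT_STRING);

    return $lexemes;
}

$lexemes = approximateSimpleLexemes('Interviews du Président, interviews exclusives');

// "interviews" is stored as-is, so a search for the singular does not match it.
$matchesSingular = in_array('interview', $lexemes, true);
$matchesPlural = in_array('interviews', $lexemes, true);
</code></pre></div><p>Here <code>$lexemes</code> holds <code>du</code>, <code>exclusives</code>, <code>interviews</code> and <code>président</code>: the stop word survives and the plural is not reduced to its stem. That is the trade-off accepted in exchange for language-neutral behavior.</p>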
<h2 id="garder-la-colonne-à-jour">Keeping the column up to date</h2>
<p>A <code>tsvector</code> column is derived data: it must stay in sync with the source fields every time the entity changes. A Doctrine event listener takes care of it:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#75715e">#[AsDoctrineListener(event: Events::prePersist)]
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">#[AsDoctrineListener(event: Events::preUpdate)]
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">MediaTsVectorSubscriber</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">prePersist</span>(<span style="color:#a6e22e">PrePersistEventArgs</span> $event)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#f92672">!</span>$event<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getObject</span>() <span style="color:#a6e22e">instanceof</span> <span style="color:#a6e22e">Media</span>) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span>;
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>        $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">updateTextSearch</span>($event<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getObject</span>());
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">preUpdate</span>(<span style="color:#a6e22e">PreUpdateEventArgs</span> $event)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#f92672">!</span>$event<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getObject</span>() <span style="color:#a6e22e">instanceof</span> <span style="color:#a6e22e">Media</span>) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span>;
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>        $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">updateTextSearch</span>($event<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getObject</span>());
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">private</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">updateTextSearch</span>(<span style="color:#a6e22e">Media</span> $entity)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        $entity<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">setTextSearch</span>(
</span></span><span style="display:flex;"><span>            <span style="color:#a6e22e">sprintf</span>(<span style="color:#e6db74">&#39;%s %s&#39;</span>, $entity<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getTitle</span>(), $entity<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">getCaption</span>())
</span></span><span style="display:flex;"><span>        );
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Before every persist and update, the listener concatenates the fields that should be searchable into <code>textSearch</code>. Doctrine flushes the combined string, the DBAL type wraps it in <code>to_tsvector('simple', ...)</code>, and PostgreSQL stores the tokenized form.</p>
<p>One subtlety: the PHP-side value is <code>&quot;title caption&quot;</code>, not the actual tsvector output. The database shows <code>'caption':2 'title':1</code> (lexemes sorted, each with its position), but the entity holds a raw string. That is expected: the conversion is a DBAL responsibility, not a PHP one. It can make debugging confusing until you remember where the boundary sits.</p>
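<p>If some source fields can be null or blank (an assumption, since the entity is not shown in full), the concatenation could be delegated to a small helper. The sketch below is hypothetical and not part of the original listener; it skips empty fields and returns <code>null</code> when nothing is left to index.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">// Hypothetical helper: joins the searchable fields, skipping null or blank
// ones. Returns null when there is nothing to index, so the column stays NULL.
function buildSearchText(?string ...$fields): ?string
{
    $parts = [];
    foreach ($fields as $field) {
        $field = trim((string) $field);
        if ('' !== $field) {
            $parts[] = $field;
        }
    }

    return [] === $parts ? null : implode(' ', $parts);
}

$text = buildSearchText('Interview du président', null, '  Palais de l\'Élysée ');
$nothing = buildSearchText(null, '   ');
</code></pre></div><p><code>to_tsvector</code> ignores extra whitespace anyway, so this is mostly about keeping the PHP-side value tidy and the column <code>NULL</code> for entities with no searchable text.</p>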
<h2 id="étendre-dql-avec-les-opérateurs-fts">Extending DQL with FTS operators</h2>
<p>Doctrine&rsquo;s DQL covers common SQL operations, but anything PostgreSQL-specific is out of scope. This is where <code>postgresql-for-doctrine</code> comes in: it registers <code>TO_TSQUERY</code>, <code>TO_TSVECTOR</code>, and <code>TSMATCH</code> as individual DQL functions. Writing a full-text query in DQL without it would mean dropping down to native SQL.</p>
<p>The library&rsquo;s functions are atomic, though. Each one maps to a single SQL call. Expressing a complete match check in DQL looks like <code>TSMATCH(o.textSearch, TO_TSQUERY(:term))</code>. Readable enough, but the team wanted something more compact: a single DQL function encoding both the match operator and the query type, including <code>websearch_to_tsquery</code>, which <code>postgresql-for-doctrine</code> did not provide.</p>
<p>The solution: <a href="https://www.doctrine-project.org/projects/doctrine-orm/en/latest/cookbook/dql-user-defined-functions.html" target="_blank" rel="noopener noreferrer">custom DQL functions</a> via <code>FunctionNode</code>. You parse the DQL syntax, then emit SQL. All the FTS functions share the same two-argument signature, so an abstract base class handles the parsing:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">abstract</span> <span style="color:#66d9ef">class</span> <span style="color:#a6e22e">TsFunction</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">FunctionNode</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#a6e22e">PathExpression</span><span style="color:#f92672">|</span><span style="color:#a6e22e">Node</span><span style="color:#f92672">|</span><span style="color:#66d9ef">null</span> $ftsField <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#a6e22e">PathExpression</span><span style="color:#f92672">|</span><span style="color:#a6e22e">Node</span><span style="color:#f92672">|</span><span style="color:#66d9ef">null</span> $queryString <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">parse</span>(<span style="color:#a6e22e">Parser</span> $parser)<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">match</span>(<span style="color:#a6e22e">TokenType</span><span style="color:#f92672">::</span><span style="color:#a6e22e">T_IDENTIFIER</span>);
</span></span><span style="display:flex;"><span>        $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">match</span>(<span style="color:#a6e22e">TokenType</span><span style="color:#f92672">::</span><span style="color:#a6e22e">T_OPEN_PARENTHESIS</span>);
</span></span><span style="display:flex;"><span>        $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">ftsField</span> <span style="color:#f92672">=</span> $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">StringPrimary</span>();
</span></span><span style="display:flex;"><span>        $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">match</span>(<span style="color:#a6e22e">TokenType</span><span style="color:#f92672">::</span><span style="color:#a6e22e">T_COMMA</span>);
</span></span><span style="display:flex;"><span>        $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">queryString</span> <span style="color:#f92672">=</span> $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">StringPrimary</span>();
</span></span><span style="display:flex;"><span>        $parser<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">match</span>(<span style="color:#a6e22e">TokenType</span><span style="color:#f92672">::</span><span style="color:#a6e22e">T_CLOSE_PARENTHESIS</span>);
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Each concrete class implements <code>getSql()</code> to emit its PostgreSQL expression:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#75715e">// e.textSearch @@ websearch_to_tsquery(&#39;simple&#39;, :term)
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">TsWebsearchQueryFunction</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">TsFunction</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getSql</span>(<span style="color:#a6e22e">SqlWalker</span> $sqlWalker)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> $this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">ftsField</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">dispatch</span>($sqlWalker)
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">.</span><span style="color:#e6db74">&#34; @@ websearch_to_tsquery(&#39;simple&#39;, &#34;</span>
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">.</span>$this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">queryString</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">dispatch</span>($sqlWalker)<span style="color:#f92672">.</span><span style="color:#e6db74">&#39;)&#39;</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// ts_rank(e.textSearch, to_tsquery(&#39;simple&#39;, :term)) for relevance ordering, using the same configuration as the stored tsvector
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">TsRankFunction</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">TsFunction</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getSql</span>(<span style="color:#a6e22e">SqlWalker</span> $sqlWalker)<span style="color:#f92672">:</span> <span style="color:#a6e22e">string</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> <span style="color:#e6db74">&#39;ts_rank(&#39;</span>
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">.</span>$this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">ftsField</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">dispatch</span>($sqlWalker)
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">.</span><span style="color:#e6db74">&#34;, to_tsquery(&#39;simple&#39;, &#34;</span><span style="color:#f92672">.</span>$this<span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">queryString</span><span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">dispatch</span>($sqlWalker)<span style="color:#f92672">.</span><span style="color:#e6db74">&#39;))&#39;</span>;
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">doctrine</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">orm</span>:
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">entity_managers</span>:
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">default</span>:
</span></span><span style="display:flex;"><span>                <span style="color:#f92672">dql</span>:
</span></span><span style="display:flex;"><span>                    <span style="color:#f92672">string_functions</span>:
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">tswebsearchquery</span>: <span style="color:#ae81ff">App\Doctrine\ORM\Query\AST\Functions\TsWebsearchQueryFunction</span>
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">tsrank</span>: <span style="color:#ae81ff">App\Doctrine\ORM\Query\AST\Functions\TsRankFunction</span>
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">tsquery</span>: <span style="color:#ae81ff">App\Doctrine\ORM\Query\AST\Functions\TsQueryFunction</span>
</span></span><span style="display:flex;"><span>                        <span style="color:#f92672">tsplainquery</span>: <span style="color:#ae81ff">App\Doctrine\ORM\Query\AST\Functions\TsPlainQueryFunction</span>
</span></span></code></pre></div><p><code>websearch_to_tsquery</code> is the right choice for user-facing search: spaces become AND, quoted strings become phrases, and <code>-word</code> excludes a term. No need to teach users to type <code>interview &amp; président</code>. It has been available since PostgreSQL 11. On earlier versions, <code>plainto_tsquery</code> is the closest equivalent.</p>
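<p>To give a feel for the syntax users get for free, the following PHP function is a simplified illustration of how <code>websearch_to_tsquery('simple', ...)</code> reads its input. It is a hypothetical sketch, not PostgreSQL&rsquo;s parser: it handles only bare words, quoted phrases and <code>-</code> exclusions, and ignores <code>or</code> and the finer tokenization rules.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">// Simplified illustration: words are ANDed, "quoted phrases" become
// adjacency (&lt;-&gt;) chains, and a leading "-" negates the word or phrase.
function sketchWebsearchQuery(string $input): string
{
    preg_match_all('/(-?)"([^"]+)"|(-?)([^\s"]+)/u', mb_strtolower($input), $matches, PREG_SET_ORDER);

    $clauses = [];
    foreach ($matches as $m) {
        $isPhrase = '' !== $m[2];
        $negated = '-' === ($isPhrase ? $m[1] : $m[3]);
        $words = $isPhrase
            ? preg_split('/\s+/u', $m[2], -1, PREG_SPLIT_NO_EMPTY)
            : [$m[4]];
        if ([] === $words) {
            continue;
        }

        $quoted = array_map(static fn (string $w): string =&gt; "'" . $w . "'", $words);
        $clause = implode(' &lt;-&gt; ', $quoted);
        if ($negated) {
            $clause = count($quoted) &gt; 1 ? '!(' . $clause . ')' : '!' . $clause;
        }
        $clauses[] = $clause;
    }

    return implode(' &amp; ', $clauses);
}

echo sketchWebsearchQuery('interview président'), PHP_EOL;     // 'interview' &amp; 'président'
echo sketchWebsearchQuery('"prime minister" -draft'), PHP_EOL; // 'prime' &lt;-&gt; 'minister' &amp; !'draft'
</code></pre></div><p>The point is not to reimplement the function but to show why raw user input is safe to pass through: whatever the user types ends up as a valid <code>tsquery</code>, whereas <code>to_tsquery</code> expects operator syntax.</p>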
<h2 id="le-filtre-api-platform-et-lindex-gin">The API Platform filter and the GIN index</h2>
<p>With the DQL functions registered, the API Platform filter is simple. A custom <code>AbstractFilter</code> calls the DQL function directly in the <code>QueryBuilder</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">TextSearchFilter</span> <span style="color:#66d9ef">extends</span> <span style="color:#a6e22e">AbstractFilter</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">filterProperty</span>(
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">string</span> $property,
</span></span><span style="display:flex;"><span>        $value,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">QueryBuilder</span> $queryBuilder,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">QueryNameGeneratorInterface</span> $queryNameGenerator,
</span></span><span style="display:flex;"><span>        <span style="color:#a6e22e">string</span> $resourceClass,
</span></span><span style="display:flex;"><span>        <span style="color:#f92672">?</span><span style="color:#a6e22e">Operation</span> $operation <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>,
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">array</span> $context <span style="color:#f92672">=</span> []
</span></span><span style="display:flex;"><span>    )<span style="color:#f92672">:</span> <span style="color:#a6e22e">void</span> {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">if</span> (<span style="color:#e6db74">&#39;textSearch&#39;</span> <span style="color:#f92672">!==</span> $property <span style="color:#f92672">||</span> <span style="color:#66d9ef">empty</span>($value)) {
</span></span><span style="display:flex;"><span>            <span style="color:#66d9ef">return</span>;
</span></span><span style="display:flex;"><span>        }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>        $queryBuilder
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">andWhere</span>(<span style="color:#e6db74">&#39;tswebsearchquery(o.textSearch, :value) = true&#39;</span>)
</span></span><span style="display:flex;"><span>            <span style="color:#f92672">-&gt;</span><span style="color:#a6e22e">setParameter</span>(<span style="color:#e6db74">&#39;value&#39;</span>, $value);
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">function</span> <span style="color:#a6e22e">getDescription</span>(<span style="color:#a6e22e">string</span> $resourceClass)<span style="color:#f92672">:</span> <span style="color:#66d9ef">array</span>
</span></span><span style="display:flex;"><span>    {
</span></span><span style="display:flex;"><span>        <span style="color:#66d9ef">return</span> [];
</span></span><span style="display:flex;"><span>    }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Apply it on the entity along with the index declaration:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php"><span style="display:flex;"><span><span style="color:#75715e">#[ORM\Index(
</span></span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">columns</span><span style="color:#f92672">:</span> [<span style="color:#e6db74">&#39;text_search&#39;</span>],
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">name</span><span style="color:#f92672">:</span> <span style="color:#e6db74">&#39;media_text_search_idx_gin&#39;</span>,
</span></span><span style="display:flex;"><span>    <span style="color:#a6e22e">options</span><span style="color:#f92672">:</span> [<span style="color:#e6db74">&#39;USING&#39;</span> <span style="color:#f92672">=&gt;</span> <span style="color:#e6db74">&#39;gin (text_search)&#39;</span>]
</span></span><span style="display:flex;"><span>)]
</span></span><span style="display:flex;"><span><span style="color:#75715e">#[ApiFilter(TextSearchFilter::class, properties: [&#39;textSearch&#39; =&gt; &#39;partial&#39;])]
</span></span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Media</span>
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span>    <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span>    <span style="color:#75715e">#[ORM\Column(type: &#39;tsvector&#39;, nullable: true)]
</span></span></span><span style="display:flex;"><span>    <span style="color:#66d9ef">protected</span> <span style="color:#f92672">?</span><span style="color:#a6e22e">string</span> $textSearch <span style="color:#f92672">=</span> <span style="color:#66d9ef">null</span>;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The <code>USING gin</code> option is non-negotiable. A standard B-tree index on a <code>tsvector</code> column is useless: PostgreSQL cannot use it for <code>@@</code> queries. GIN (Generalized Inverted Index) works differently: it indexes each token individually, so looking up any given token costs roughly <code>O(log n)</code> rather than <code>O(n)</code>. Without it, you have built a system that looks like it should be fast but still does a full table scan.</p>
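<p>The idea behind an inverted index fits in a few lines of PHP. This is a toy model with hypothetical function names, only meant to show why a token lookup avoids reading every row; it says nothing about how PostgreSQL actually lays GIN out on disk.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;-webkit-text-size-adjust:none;"><code class="language-php" data-lang="php">// Toy inverted index: each token maps to the ids of the rows containing it.
function buildInvertedIndex(array $rows): array
{
    $index = [];
    foreach ($rows as $id =&gt; $text) {
        $tokens = preg_split('/[^\p{L}\p{N}]+/u', mb_strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        foreach (array_unique($tokens) as $token) {
            $index[$token][] = $id;
        }
    }

    return $index;
}

// AND search: intersect the id lists of every term instead of scanning all rows.
function searchAll(array $index, array $terms): array
{
    $result = null;
    foreach ($terms as $term) {
        $ids = $index[mb_strtolower($term)] ?? [];
        $result = null === $result ? $ids : array_values(array_intersect($result, $ids));
    }

    return $result ?? [];
}

$index = buildInvertedIndex([
    1 =&gt; 'Interview du président',
    2 =&gt; 'Interview exclusive',
    3 =&gt; 'Le président en visite',
]);
$hits = searchAll($index, ['interview', 'président']); // only row 1 contains both terms
</code></pre></div><p>Each term is resolved with a direct lookup, and only the short id lists are compared. A B-tree over whole <code>tsvector</code> values has no equivalent of that per-token entry point.</p>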
<p>A <code>GET /media?textSearch=interview+president</code> now hits the GIN index and responds in a few milliseconds instead of scanning the whole table.</p>
<h2 id="ce-que-la-répartition-ressemblait-vraiment">What the split actually looked like</h2>
<p>The library covered the low-level atomic functions. The custom code filled the gaps: a <code>tsvector</code> DBAL type that the library did not provide, convenient DQL wrappers combining <code>@@</code> and <code>websearch_to_tsquery</code> in a single call, and the application glue tying it all into Doctrine&rsquo;s event system and API Platform. No native query was needed.</p>
<p>The split is worth noting in general: <code>postgresql-for-doctrine</code> provides the atomic PostgreSQL building blocks, but you still have to compose them into something the rest of the code can use without thinking about it. The <code>FunctionNode</code> pattern and the <code>convertToDatabaseValueSQL()</code> hook are the two extension points that make this composition clean. Both are worth knowing, whatever library you start from.</p>
]]></content:encoded></item></channel></rss>