@@ -2011,6 +2011,10 @@ GROUP BY region, product;
20112011 but we'd have needed two levels of nested sub-<command>SELECT</command>s. It's a bit
20122012 easier to follow this way.
20132013 </para>
2014+ </sect2>
2015+
2016+ <sect2 id="queries-with-recursive">
2017+ <title>Recursive Queries</title>
20142018
20152019 <para>
20162020 <indexterm>
@@ -2114,6 +2118,120 @@ GROUP BY sub_part
21142118</programlisting>
21152119 </para>
21162120
2121+ <sect3 id="queries-with-search">
2122+ <title>Search Order</title>
2123+
2124+ <para>
2125+ When computing a tree traversal using a recursive query, you might want to
2126+ order the results in either depth-first or breadth-first order. This can
2127+ be done by computing an ordering column alongside the other data columns
2128+ and using that to sort the results at the end. Note that this does not
2129+ actually control in which order the query evaluation visits the rows; that
2130+ is as always in SQL implementation-dependent. This approach merely
2131+ provides a convenient way to order the results afterwards.
2132+ </para>
2133+
2134+ <para>
2135+ To create a depth-first order, we compute for each result row an array of
2136+ rows that we have visited so far. For example, consider the following
2137+ query that searches a table <structname>tree</structname> using a
2138+ <structfield>link</structfield> field:
2139+
2140+ <programlisting>
2141+ WITH RECURSIVE search_tree(id, link, data) AS (
2142+ SELECT t.id, t.link, t.data
2143+ FROM tree t
2144+ UNION ALL
2145+ SELECT t.id, t.link, t.data
2146+ FROM tree t, search_tree st
2147+ WHERE t.id = st.link
2148+ )
2149+ SELECT * FROM search_tree;
2150+ </programlisting>
2151+
2152+ To add depth-first ordering information, you can write this:
2153+
2154+ <programlisting>
2155+ WITH RECURSIVE search_tree(id, link, data, <emphasis>path</emphasis>) AS (
2156+ SELECT t.id, t.link, t.data, <emphasis>ARRAY[t.id]</emphasis>
2157+ FROM tree t
2158+ UNION ALL
2159+ SELECT t.id, t.link, t.data, <emphasis>path || t.id</emphasis>
2160+ FROM tree t, search_tree st
2161+ WHERE t.id = st.link
2162+ )
2163+ SELECT * FROM search_tree <emphasis>ORDER BY path</emphasis>;
2164+ </programlisting>
2165+ </para>
2166+
2167+ <para>
2168+ In the general case where more than one field needs to be used to identify
2169+ a row, use an array of rows. For example, if we needed to track fields
2170+ <structfield>f1</structfield> and <structfield>f2</structfield>:
2171+
2172+ <programlisting>
2173+ WITH RECURSIVE search_tree(id, link, data, <emphasis>path</emphasis>) AS (
2174+ SELECT t.id, t.link, t.data, <emphasis>ARRAY[ROW(t.f1, t.f2)]</emphasis>
2175+ FROM tree t
2176+ UNION ALL
2177+ SELECT t.id, t.link, t.data, <emphasis>path || ROW(t.f1, t.f2)</emphasis>
2178+ FROM tree t, search_tree st
2179+ WHERE t.id = st.link
2180+ )
2181+ SELECT * FROM search_tree <emphasis>ORDER BY path</emphasis>;
2182+ </programlisting>
2183+ </para>
2184+
2185+ <note>
2186+ <para>
2187+ The queries shown in this and the following section involving
2188+ <literal>ROW</literal> constructors in the target list only support
2189+ <literal>UNION ALL</literal> (not plain <literal>UNION</literal>) in the
2190+ current implementation.
2191+ </para>
2192+ </note>
2193+
2194+ <tip>
2195+ <para>
2196+ Omit the <literal>ROW()</literal> syntax in the common case where only one
2197+ field needs to be tracked. This allows a simple array rather than a
2198+ composite-type array to be used, gaining efficiency.
2199+ </para>
2200+ </tip>
2201+
2202+ <para>
2203+ To create a breadth-first order, you can add a column that tracks the depth
2204+ of the search, for example:
2205+
2206+ <programlisting>
2207+ WITH RECURSIVE search_tree(id, link, data, <emphasis>depth</emphasis>) AS (
2208+ SELECT t.id, t.link, t.data, <emphasis>0</emphasis>
2209+ FROM tree t
2210+ UNION ALL
2211+ SELECT t.id, t.link, t.data, <emphasis>depth + 1</emphasis>
2212+ FROM tree t, search_tree st
2213+ WHERE t.id = st.link
2214+ )
2215+ SELECT * FROM search_tree <emphasis>ORDER BY depth</emphasis>;
2216+ </programlisting>
2217+
2218+ To get a stable sort, add data columns as secondary sorting columns.
2219+ </para>
2220+
2221+ <tip>
2222+ <para>
2223+ The recursive query evaluation algorithm produces its output in
2224+ breadth-first search order. However, this is an implementation detail and
2225+ it is perhaps unsound to rely on it. The order of the rows within each
2226+ level is certainly undefined, so some explicit ordering might be desired
2227+ in any case.
2228+ </para>
2229+ </tip>
2230+ </sect3>
2231+
2232+ <sect3 id="queries-with-cycle">
2233+ <title>Cycle Detection</title>
2234+
21172235 <para>
21182236 When working with recursive queries it is important to be sure that
21192237 the recursive part of the query will eventually return no tuples,
@@ -2123,13 +2241,13 @@ GROUP BY sub_part
21232241 cycle does not involve output rows that are completely duplicate: it may be
21242242 necessary to check just one or a few fields to see if the same point has
21252243 been reached before. The standard method for handling such situations is
2126- to compute an array of the already-visited values. For example, consider
2244+ to compute an array of the already-visited values. For example, consider again
21272245 the following query that searches a table <structname>graph</structname> using a
21282246 <structfield>link</structfield> field:
21292247
21302248<programlisting>
21312249WITH RECURSIVE search_graph(id, link, data, depth) AS (
2132- SELECT g.id, g.link, g.data, 1
2250+ SELECT g.id, g.link, g.data, 0
21332251 FROM graph g
21342252 UNION ALL
21352253 SELECT g.id, g.link, g.data, sg.depth + 1
@@ -2147,17 +2265,17 @@ SELECT * FROM search_graph;
21472265 <structfield>is_cycle</structfield> and <structfield>path</structfield> to the loop-prone query:
21482266
21492267<programlisting>
2150- WITH RECURSIVE search_graph(id, link, data, depth, is_cycle, path) AS (
2151- SELECT g.id, g.link, g.data, 1 ,
2152- false,
2153- ARRAY[g.id]
2268+ WITH RECURSIVE search_graph(id, link, data, depth, <emphasis> is_cycle, path</emphasis> ) AS (
2269+ SELECT g.id, g.link, g.data, 0 ,
2270+ <emphasis> false,
2271+ ARRAY[g.id]</emphasis>
21542272 FROM graph g
21552273 UNION ALL
21562274 SELECT g.id, g.link, g.data, sg.depth + 1,
2157- g.id = ANY(path),
2158- path || g.id
2275+ <emphasis> g.id = ANY(path),
2276+ path || g.id</emphasis>
21592277 FROM graph g, search_graph sg
2160- WHERE g.id = sg.link AND NOT is_cycle
2278+ WHERE g.id = sg.link <emphasis> AND NOT is_cycle</emphasis>
21612279)
21622280SELECT * FROM search_graph;
21632281</programlisting>
@@ -2172,17 +2290,17 @@ SELECT * FROM search_graph;
21722290 compare fields <structfield>f1</structfield> and <structfield>f2</structfield>:
21732291
21742292<programlisting>
2175- WITH RECURSIVE search_graph(id, link, data, depth, is_cycle, path) AS (
2176- SELECT g.id, g.link, g.data, 1 ,
2177- false,
2178- ARRAY[ROW(g.f1, g.f2)]
2293+ WITH RECURSIVE search_graph(id, link, data, depth, <emphasis> is_cycle, path</emphasis> ) AS (
2294+ SELECT g.id, g.link, g.data, 0 ,
2295+ <emphasis> false,
2296+ ARRAY[ROW(g.f1, g.f2)]</emphasis>
21792297 FROM graph g
21802298 UNION ALL
21812299 SELECT g.id, g.link, g.data, sg.depth + 1,
2182- ROW(g.f1, g.f2) = ANY(path),
2183- path || ROW(g.f1, g.f2)
2300+ <emphasis> ROW(g.f1, g.f2) = ANY(path),
2301+ path || ROW(g.f1, g.f2)</emphasis>
21842302 FROM graph g, search_graph sg
2185- WHERE g.id = sg.link AND NOT is_cycle
2303+ WHERE g.id = sg.link <emphasis> AND NOT is_cycle</emphasis>
21862304)
21872305SELECT * FROM search_graph;
21882306</programlisting>
@@ -2198,10 +2316,8 @@ SELECT * FROM search_graph;
21982316
21992317 <tip>
22002318 <para>
2201- The recursive query evaluation algorithm produces its output in
2202- breadth-first search order. You can display the results in depth-first
2203- search order by making the outer query <literal>ORDER BY</literal> a
2204- <quote>path</quote> column constructed in this way.
2319+ The cycle path column is computed in the same way as the depth-first
2320+ ordering column show in the previous section.
22052321 </para>
22062322 </tip>
22072323
@@ -2217,7 +2333,7 @@ WITH RECURSIVE t(n) AS (
22172333 UNION ALL
22182334 SELECT n+1 FROM t
22192335)
2220- SELECT n FROM t LIMIT 100;
2336+ SELECT n FROM t <emphasis> LIMIT 100</emphasis> ;
22212337</programlisting>
22222338
22232339 This works because <productname>PostgreSQL</productname>'s implementation
@@ -2229,6 +2345,11 @@ SELECT n FROM t LIMIT 100;
22292345 outer query will usually try to fetch all of the <literal>WITH</literal> query's
22302346 output anyway.
22312347 </para>
2348+ </sect3>
2349+ </sect2>
2350+
2351+ <sect2>
2352+ <title>Common Table Expression Materialization</title>
22322353
22332354 <para>
22342355 A useful property of <literal>WITH</literal> queries is that they are
0 commit comments