53.1. The Path of a Query #

Here we give a short overview of the stages a query has to pass to obtain a result.

53.1.1. For Simple Queries #

If your system is not distributed, a query is processed locally on one node. For example, when a table is local or global, or even if it is sharded but the optimizer has everything required locally on the node:

  1. A connection from an application program to the Postgres Pro Shardman server has to be established. The application program transmits a query to the server and waits to receive the results sent back by the server.

  2. The parser stage checks the query transmitted by the application program for correct syntax and creates a query tree.

  3. The rewrite system takes the query tree created by the parser stage and looks for any rules (stored in the system catalogs) to apply to the query tree. It performs the transformations given in the rule bodies.

    One application of the rewrite system is in the realization of views. Whenever a query against a view (i.e., a virtual table) is made, the rewrite system rewrites the user's query to a query that accesses the base tables given in the view definition instead.

  4. The planner/optimizer takes the (rewritten) query tree and creates a query plan that will be the input to the executor.

    It does so by first creating all possible paths leading to the same result. For example if there is an index on a relation to be scanned, there are two paths for the scan. One possibility is a simple sequential scan and the other possibility is to use the index. Next the cost for the execution of each path is estimated and the cheapest path is chosen. The cheapest path is expanded into a complete plan that the executor can use.

  5. The executor recursively steps through the plan tree and retrieves rows in the way represented by the plan. The executor makes use of the storage system while scanning relations, performs sorts and joins, evaluates qualifications and finally hands back the rows derived.

In the following sections we will cover each of the above listed items in more detail to give a better understanding of Postgres Pro Shardman's internal control and data structures.

53.1.2. For More Complex Queries #

For more complex queries see the following charts:

53.1.2.1. General Query Path #

If an SQL query is to be executed on multiple nodes, or if a node cannot be specified, the query is processed on the coordinator first, followed by remote nodes and, finally, compiled as one set on the coordinator.

Figure 53.1. General Query Path Chart


53.1.2.2. Query Path for Single Node #

If the optimizer can specify the node (e.g. the requested data has a sharding key), then the only difference from the previous section would be the queries and subqueries being sent to the specific node:

Figure 53.2. Query Path for Single Node Chart


53.1.2.3. Query Path with Fast Path #

With shardman.enable_fast_path is enabled (disabled by default), the queries the nodes can be defined for will follow a much shorter path. It only works when the data we need is located on a specific shard, while the query contains the sharding_key = X.

Silk also supports fast path for reading queries.

Figure 53.3. Query Path with Fast Path Chart


Mind some restrictions:

  • Postgres Pro Shardman should only start after postgres_fdw, thus in shared_preload_libraries it must be placed after it.

  • The UPDATE command must not change the sharding key.

  • Only works for sharded tables with one sharding key per table.

  • The following operations are prohibited: blocking operations FOR UPDATE, FOR SHARE, Common Table Expressions, JOIN between tables, INSERT INTO ... SELECT, window functions, MERGE requests, all operations with fast path enabled.

  • No functions (except for aggregation functions) are supported.

  • If the extra_float_digits configuration parameter is less than 0, fast pass cannot be used.

53.1.2.4. Query Path Locally on a Specific Node #

If the query only affects one node that can be directly connected to, the query path is similar to one in not distributed systems:

Figure 53.4. Query Path Locally on a Specific Node Chart