
The two queries below take different amounts of time to execute on PostgreSQL:

Query1:

\timing
select s0.value,s1.value,s2.value,s3.value,s4.value
from (
   select f0.subject as r0,f0.predicate as r1,f0.object as r2,f1.predicate as r3,f1.object as r4
   from schemaName.facts f0,schemaName.facts f1
   where f1.subject=f0.subject
) facts,schemaName.strings s0,schemaName.strings s1,schemaName.strings s2,schemaName.strings s3,schemaName.strings s4
where s0.id=facts.r0 and s1.id=facts.r1 and s2.id=facts.r2 and s3.id=facts.r3 and s4.id=facts.r4;

Query1 rewritten:

select s0.value,s1.value,s2.value,s3.value,s4.value
from schemaName.strings s0,schemaName.strings s1,schemaName.strings s2,schemaName.strings s3,schemaName.strings s4,schemaName.facts f0,schemaName.facts f1
where s0.id=f0.subject and s1.id=f0.predicate and s2.id=f0.object and s3.id=f1.predicate and s4.id=f1.object and f0.subject=f1.subject;

I don't understand why PostgreSQL takes different amounts of time to execute these two queries. Can someone please help me understand this?

Timings can vary from one run to the next. You need to look at the execution plan to see if the queries are being processed differently. Also, I would recommend that you learn and use ANSI standard join syntax, instead of implicit joins in the WHERE clause. Commented Dec 22, 2013 at 1:55

1 Answer


PostgreSQL comes with two very useful commands: EXPLAIN and EXPLAIN ANALYZE. The former prints the query plan with the planner's estimated costs and row counts, without running the query. The latter actually runs the query and reports the real execution times and row counts alongside the plan.
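As a rough sketch of how that could look for the two queries in the question (assuming the schemaName.facts and schemaName.strings tables exist as described, and that the rewritten query was meant to list s3 and s4 before the two facts aliases), you would prefix each query with the command. The output is omitted here because it depends entirely on your data and configuration:

```sql
-- Plan only: the query is NOT executed; costs and row counts are planner estimates.
EXPLAIN
select s0.value,s1.value,s2.value,s3.value,s4.value
from (
   select f0.subject as r0,f0.predicate as r1,f0.object as r2,f1.predicate as r3,f1.object as r4
   from schemaName.facts f0,schemaName.facts f1
   where f1.subject=f0.subject
) facts,schemaName.strings s0,schemaName.strings s1,schemaName.strings s2,schemaName.strings s3,schemaName.strings s4
where s0.id=facts.r0 and s1.id=facts.r1 and s2.id=facts.r2 and s3.id=facts.r3 and s4.id=facts.r4;

-- Plan plus execution: the query IS executed, and each plan node also shows
-- its actual time and actual row count next to the estimates.
EXPLAIN ANALYZE
select s0.value,s1.value,s2.value,s3.value,s4.value
from schemaName.strings s0,schemaName.strings s1,schemaName.strings s2,schemaName.strings s3,schemaName.strings s4,
     schemaName.facts f0,schemaName.facts f1
where s0.id=f0.subject and s1.id=f0.predicate and s2.id=f0.object
  and s3.id=f1.predicate and s4.id=f1.object and f0.subject=f1.subject;
```

Run the same command on both queries and compare the plans side by side. If the join order or the join methods differ, that is the likely source of the timing difference.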

PostgreSQL uses a whole mess of criteria and heuristics to decide how best to run a query: everything from sequential and random access costs (tunable in the configs) to statistical samplings of the data in the tables.

I've found that very often it will come up with the same query plan given two radically different-looking queries (assuming they give the same results), but I've also seen the query structure affect the plan. The best way to see what it is doing is to ask it to explain.

All of that said: the second run will usually be faster than the first, since much of the data is now cached. So, if you are really trying to compare runtimes, be sure to run each query at least four times, drop the first one, and average the rest.
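One way to see the caching effect directly, as a sketch rather than a full benchmarking method, is the BUFFERS option of EXPLAIN in a psql session. The count(*) query below is a hypothetical stand-in on the question's facts table; substitute whichever query you are measuring:

```sql
-- psql meta-command: print the elapsed time after every statement.
\timing on

-- Run 1: may hit a cold cache, so discard this timing.
EXPLAIN (ANALYZE, BUFFERS)
select count(*)
from schemaName.facts f0, schemaName.facts f1
where f1.subject = f0.subject;

-- Runs 2 to 4: repeat the identical statement and average these timings.
EXPLAIN (ANALYZE, BUFFERS)
select count(*)
from schemaName.facts f0, schemaName.facts f1
where f1.subject = f0.subject;
```

In the "Buffers" lines of the output, "shared hit" counts pages found in PostgreSQL's shared buffer cache, while "read" counts pages that had to be fetched from outside it (the OS cache or disk). A first run with many reads followed by runs with mostly hits is the caching effect described above. Keep in mind that EXPLAIN ANALYZE adds some measurement overhead of its own, so compare like with like.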
