
I'm working with an OpenStreetMap osm2pgsql database. One of its tables (planet_osm_line) has two indexed fields: osm_id (int, primary key) and way (PostGIS geometry).

I'd like to find which streets intersect a specific street, which I know by its osm_id. So I do:

SELECT name, * FROM planet_osm_line
WHERE highway IS NOT NULL
AND osm_id != 126021312
AND ST_Intersects(way, (SELECT way FROM planet_osm_line WHERE osm_id = 126021312 LIMIT 1))

And it takes about 10 seconds to run.

If I instead take that subquery out, run it separately, and paste its result in as a literal, the query looks roughly like this:

SELECT name, * FROM planet_osm_line
WHERE highway IS NOT NULL
AND osm_id != 126021312
AND ST_Intersects(way, '010200002031BF0D000D000000E17...')

And it takes about 0.47 seconds to run.

Running EXPLAIN on the first and the second query gives me a hint about the difference.

First:

Seq Scan on planet_osm_line  (cost=2.09..614596.67 rows=628706 width=1079)
  Filter: ((highway IS NOT NULL) AND (osm_id <> 126021312) AND st_intersects(way, $0))
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.43..2.09 rows=1 width=249)
          ->  Index Scan using planet_osm_line_pkey on planet_osm_line planet_osm_line_1  (cost=0.43..3.76 rows=2 width=249)
                Index Cond: (osm_id = 126021312)

Second:

Index Scan using planet_osm_line_index on planet_osm_line  (cost=0.41..4.25 rows=1 width=1079)
  Index Cond: (way && '010200002031BF0D000D000000E17...'::geometry)
  Filter: ((highway IS NOT NULL) AND (osm_id <> 126021312) AND _st_intersects(way, '010200002031BF0D000D000000E17...'::geometry))

Why does PostgreSQL do a seq scan for the first query and an index scan for the second? Is there a way to solve this problem without issuing two queries?

  • (SELECT way FROM planet_osm_line WHERE osm_id = 126021312 LIMIT 1)) is IMnsHO a terrible way to emulate an EXISTS(...) (or JOIN) Commented May 27, 2015 at 14:38
  • Did you run VACUUM ANALYZE? Commented May 27, 2015 at 14:38
  • wildplasser: yes, I'm emulating a JOIN there. rightfold: I hadn't, but I have now. It still does the seq scan on the first query. Commented May 27, 2015 at 15:29

2 Answers

Rewrite your query so that, instead of a sub-query inside ST_Intersects, you have a cross join in the FROM clause, which is then restricted by ST_Intersects in the WHERE clause. ST_Intersects also implicitly does a && (i.e., a bounding box check), which will hit the spatial index.

SELECT osm.name, osm.*
FROM planet_osm_line osm,
  (SELECT way FROM planet_osm_line WHERE osm_id = 126021312 LIMIT 1) line
WHERE osm.highway IS NOT NULL
AND osm.osm_id != 126021312
AND ST_Intersects(osm.way, line.way);
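
To check whether a rewrite like this actually reaches the spatial index, one option is to prefix it with EXPLAIN and look for planet_osm_line_index in the plan. A minimal sketch, reusing only the table, columns, and osm_id from the question (the exact plan shape will depend on your PostgreSQL and PostGIS versions):

```sql
-- Show the plan without running the query.
-- Table, columns, and the osm_id value are taken from the question.
EXPLAIN
SELECT osm.name, osm.*
FROM planet_osm_line osm,
  (SELECT way FROM planet_osm_line WHERE osm_id = 126021312 LIMIT 1) line
WHERE osm.highway IS NOT NULL
AND osm.osm_id != 126021312
AND ST_Intersects(osm.way, line.way);
```

An index condition of the form way && ... on planet_osm_line_index indicates that the bounding box check is served by the index. A Seq Scan with st_intersects in the Filter line, as in the first plan in the question, indicates that it is not.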

3 Comments

Yes, that works fast too. The EXPLAIN shows it has the same meaning as the other query I posted above (with explicit JOIN).
Your query doesn't work as is, could you fix it? I changed the ST_Intersects to make it work: ST_Intersects(osm.way, line.way).
Sure, fixed, sorry, didn't have any way to test.

This way seems to work fine (it partially answers my question):

SELECT l1.name, l1.*
FROM planet_osm_line AS l1
INNER JOIN planet_osm_line AS l2
ON ST_Intersects(l1.way, l2.way)
WHERE l1.highway IS NOT NULL
AND l1.osm_id != 126021312
AND l2.osm_id = 126021312

The EXPLAIN of it shows PostgreSQL seems to be doing what I wanted in the first place:

Nested Loop  (cost=6.80..577.98 rows=7451 width=1108)
  ->  Index Scan using planet_osm_line_pkey on planet_osm_line l2  (cost=0.43..3.76 rows=2 width=249)
      Index Cond: (osm_id = 126021312)
  ->  Bitmap Heap Scan on planet_osm_line l1  (cost=6.37..286.48 rows=63 width=1108)
        Recheck Cond: (way && l2.way)
        Filter: ((highway IS NOT NULL) AND (osm_id <> 126021312) AND _st_intersects(way, l2.way))
        ->  Bitmap Index Scan on planet_osm_line_index  (cost=0.00..6.36 rows=206 width=0)
            Index Cond: (way && l2.way)

I'm still curious about why the first query didn't behave like this one, though.

1 Comment

Because the planner doesn't see a geometry column there, but a table expression. The INNER JOIN syntax is equivalent to the CROSS JOIN when you have an intersects condition in the WHERE clause. I personally find the CROSS JOIN (i.e., the comma between two tables) more natural, as you can't really inner join on a geometric join.
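
One difference visible in the plans above is that the slow plan keeps st_intersects(way, $0) as a plain filter, while the fast plans show a separate way && ... index condition next to _st_intersects. A possible workaround that keeps the scalar subquery is to write the bounding box test explicitly, so the planner has an indexable && clause to work with. This is an untested sketch, and whether the planner then chooses planet_osm_line_index is an assumption that depends on the PostgreSQL and PostGIS versions in use:

```sql
-- Untested sketch: spell out the && bounding box check by hand.
-- Assumption: the planner can use the spatial index for "way && <subquery result>".
SELECT name, *
FROM planet_osm_line
WHERE highway IS NOT NULL
AND osm_id != 126021312
-- Indexable bounding box pre-filter against the reference street.
AND way && (SELECT way FROM planet_osm_line WHERE osm_id = 126021312 LIMIT 1)
-- Exact intersection test on the rows that pass the pre-filter.
AND ST_Intersects(way, (SELECT way FROM planet_osm_line WHERE osm_id = 126021312 LIMIT 1));
```

Running EXPLAIN on this form should show whether the && clause ends up as an Index Cond. The join rewrites in the answers avoid repeating the subquery and are probably the cleaner option.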
