The problem I found that I cannot make Postgres use GIN index when I use jsonb_* functions in my queries. The problem exists for both jsonb_ops and jsonb_path_ops operator classes.
Let's first make preparations. Create the table
CREATE TABLE applications
(
id TEXT PRIMARY KEY,
application JSONB
);
CREATE INDEX ON applications USING gin (application jsonb_path_ops);
Let's fill the table with some bulk of data (to make Postgres use GIN index)
INSERT INTO applications(id, application)
VALUES ('1', '{
"type_code": 1,
"persons": [
{
"type_code": 4,
"firstname": "John",
"lastname": "Doe"
}
]
}');
INSERT INTO applications (SELECT i, a.application FROM applications a, generate_series(2, 100000) i);
Then I try to select the data like the following:
EXPLAIN ANALYZE
SELECT * FROM applications
WHERE applications.application @? '$.persons[*] ? (@.type_code == 3)';
Note that GIN index was used
-- Bitmap Heap Scan on applications (cost=64.00..68.01 rows=1 width=130) (actual time=0.410..0.419 rows=0 loops=1)
-- " Recheck Cond: (application @? '$.""persons""[*]?(@.""type_code"" == 3)'::jsonpath)"
-- -> Bitmap Index Scan on applications_application_idx (cost=0.00..64.00 rows=1 width=0) (actual time=0.095..0.096 rows=0 loops=1)
-- " Index Cond: (application @? '$.""persons""[*]?(@.""type_code"" == 3)'::jsonpath)"
-- Planning Time: 1.493 ms
-- Execution Time: 0.861 ms
Now I try to select the data like this:
EXPLAIN ANALYZE
SELECT * FROM applications
WHERE jsonb_path_exists(
applications.application,
'$.persons[*] ? (@.type_code == 3)'
);
You can see that GIN index was not used in this case:
-- Aggregate (cost=3374.33..3374.34 rows=1 width=8) (actual time=114.048..114.055 rows=1 loops=1)
-- -> Seq Scan on applications (cost=0.00..3291.00 rows=33333 width=0) (actual time=0.388..109.580 rows=100000 loops=1)
-- " Filter: jsonb_path_exists(application, '$.""persons""[*]?(@.""type_code"" == 3)'::jsonpath, '{}'::jsonb, false)"
-- Planning Time: 1.514 ms
-- Execution Time: 114.674 ms
Is it possible to make Postgres use GIN index in the second query?
Using jsonb_* functions is preferred for me because I can use positional parameters to build query:
SELECT * FROM applications
WHERE jsonb_path_exists(
applications.application,
'$.persons[*] ? (@.type_code == $person_type_code)',
jsonb_build_object('person_type_code', $1)
);
@@and@?with JSONPath expressions, but even though there are equivalentjsonb_path_X()functions to what those operators do, the index will only kick in if you use the operator and not the function. There are cases like PostGIS where functions do in fact use the index but that's because they wrap one or add an operator-based condition that's using the index, then use the actual function to just re-check pre-filtered rows.