I've already posted a similar question, but I need to be a bit more precise.
This is the original post: Postgres jsonb search in array with greater operator (with jsonb_array_elements)
To sum up, here is the database declaration (simplified):
CREATE TABLE documents (
document_id int4 NOT NULL GENERATED BY DEFAULT AS IDENTITY,
data_block jsonb NULL,
type varchar(10)
);
And here are some example inserts:
INSERT INTO documents (document_id, data_block, type)
VALUES (878979,
'{"COMMONS": {"DATE": {"value": "2017-03-11"}},
"CARS": [
{"MODEL": {"value": "FERRARI F40"}},
{"MODEL": {"value": "PORSCHE CAYENNE"}},
{"MODEL": {"value": "FERRARI Testarossa"}}
]}', 'garage');
INSERT INTO documents (document_id, data_block, type)
VALUES (977656,
'{"INVOICE": {"TOTAL_AMOUNT": {"value": "100.00"}},
"PAYABLE_INVOICE_LINES": [
{"AMOUNT": {"value": 75.00}},
{"AMOUNT": {"value": 25.00}}
]}', 'invoices');
INSERT INTO documents (document_id, data_block, type)
VALUES (345,
'{"INVOICE": {"TOTAL_AMOUNT": {"value": "200.00"}},
"PAYABLE_INVOICE_LINES": [
{"AMOUNT": {"value": 125.00}},
{"AMOUNT": {"value": 75.00}}
]}', 'invoices');
In fact, I can store anything in my JSONB column, and now I want to search it with specific operators.
Example queries:
All documents with at least one line in PAYABLE_INVOICE_LINES with an amount greater than 100.00: data_block.PAYABLE_INVOICE_LINES.AMOUNT > 100.00
All documents with a line in CARS whose model starts with 'FERRARI': data_block.CARS.MODEL like 'FERRARI%'
All documents where the TOTAL_AMOUNT = 100.00
All documents where the COMMONS.DATE > "2018-04-30"
All documents where the CARS.MODEL in the list ('PORSCHE CAYENNE')
All documents where the data_block.PAYABLE_INVOICE_LINES.AMOUNT is between 100.00 and 150.00
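To make the list above concrete, here is how a few of these searches could be written in plain SQL. This is only a sketch: it assumes that CARS, when present, is always a JSON array, and that every "value" being cast really holds a number or an ISO date, since otherwise the cast raises an error.

```sql
-- Documents with at least one car whose model starts with 'FERRARI'
SELECT d.document_id
FROM documents d
WHERE EXISTS (
    SELECT 1
    FROM jsonb_array_elements(d.data_block -> 'CARS') AS car
    WHERE (car -> 'MODEL' ->> 'value') LIKE 'FERRARI%'
);

-- Documents where TOTAL_AMOUNT = 100.00 (the value is stored as a string, hence the cast)
SELECT d.document_id
FROM documents d
WHERE (d.data_block #>> '{INVOICE,TOTAL_AMOUNT,value}')::numeric = 100.00;

-- Documents where COMMONS.DATE > 2018-04-30
SELECT d.document_id
FROM documents d
WHERE (d.data_block #>> '{COMMONS,DATE,value}')::date > DATE '2018-04-30';
```

The scalar searches only need a path extraction, while every search inside an array needs one EXISTS subquery over jsonb_array_elements.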
In short, I want to be able to query on anything, and I want to limit the output to 50 sorted results.
My table contains millions of rows and I have some performance problems.
I've implemented a solution using EXISTS:
select *
from documents d
where exists (
    select 1
    from jsonb_array_elements(d.data_block -> 'PAYABLE_INVOICE_LINES') as pil
    where (pil->'AMOUNT'->>'value')::decimal >= 1000
)
limit 50;
It works, but it is not fast enough: some queries take more than 20 seconds.
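To see where the time goes, the same query can be run under EXPLAIN. A minimal sketch; note that the ANALYZE option really executes the statement, so it takes as long as the query itself:

```sql
-- Shows the chosen plan with actual timings and buffer usage
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM documents d
WHERE EXISTS (
    SELECT 1
    FROM jsonb_array_elements(d.data_block -> 'PAYABLE_INVOICE_LINES') AS pil
    WHERE (pil -> 'AMOUNT' ->> 'value')::decimal >= 1000
)
LIMIT 50;
```

If the plan shows a sequential scan on documents, every row's array is being expanded until 50 matches are found, which would be consistent with the timings above.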
I don't know the structure of the documents in advance. We can insert cars, invoices, or identity cards, and the customer can create their own structure. So I can't easily create indexes, and I can't create functions to prepare the results.
Queries are very slow when the search targets an array inside the JSON object, such as PAYABLE_INVOICE_LINES.
Do you have any ideas on how to improve performance?
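The one index type that does not depend on the document structure seems to be a GIN index over the whole column. A minimal sketch follows; the index name is made up for illustration, and as far as I understand it only helps containment and equality searches (the @> operator), not range comparisons such as > or BETWEEN, nor LIKE.

```sql
-- Hypothetical index name; jsonb_path_ops is a built-in GIN operator class for jsonb
CREATE INDEX documents_data_block_gin
    ON documents USING gin (data_block jsonb_path_ops);

-- Containment search that can use the index:
-- documents having a car whose model is exactly 'PORSCHE CAYENNE'
SELECT d.document_id
FROM documents d
WHERE d.data_block @> '{"CARS": [{"MODEL": {"value": "PORSCHE CAYENNE"}}]}';
```

This would cover the "in the list" style of search, but not the amount or date comparisons.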
Is data_block editable, or is it fixed? Meaning, once you save a row, does it never change, hardly ever change, or change a lot? That would be important to know in order to give you some tips.