1

I have some working code which pulls records from a forms table where '[email protected]' appears as a JSON value in one of the JSONB fields,column_a, column_b, column_c, or column_d.

SELECT * 
FROM forms f
WHERE exists (SELECT * 
              FROM jsonb_each_text(f.column_a) as e(ky,val)
              WHERE e.val = '[email protected]')   
UNION

SELECT * 
FROM forms f
WHERE exists (SELECT * 
              FROM jsonb_each_text(f.column_b) as e(ky,val)
              WHERE e.val = '[email protected]')
UNION

SELECT * 
FROM forms f
WHERE exists (SELECT * 
              FROM jsonb_each_text(f.column_c) as e(ky,val)
              WHERE e.val = '[email protected]')
UNION

SELECT * 
FROM forms f
WHERE exists (SELECT * 
              FROM jsonb_each_text(f.column_d) as e(ky,val)
              WHERE e.val = '[email protected]');

The JSON in the columns is similar to:

{ "xyz":"[email protected]", "def":"[email protected]", "lmn":"[email protected]" }

Although the code works, it looks highly repetitive/long-winded. Given that I can't change the JSON structure, is there are more concise way of building this query, and what indexes should I be building for those columns for the best performance?

1 Answer 1

2

Why not use or?

SELECT f.* 
FROM forms f
WHERE EXISTS (SELECT * 
              FROM jsonb_each_text(f.column_a) as e(ky,val)
              WHERE e.val = '[email protected]'
             ) OR
      EXISTS (SELECT * 
              FROM jsonb_each_text(f.column_b) as e(ky,val)
              WHERE e.val = '[email protected]'
             ) OR
      . . .

Presumably, this also eliminates the need for duplicate elimination, so the query should be faster as well.

EDIT:

If you want the sources, you can use correlated subqueries:

SELECT f.* 
FROM (SELECT f.*,
             EXISTS (SELECT * 
                     FROM jsonb_each_text(f.column_a) as e(ky,val)
                     WHERE e.val = '[email protected]'
                    ) as in_column_a,
             EXISTS (SELECT * 
                     FROM jsonb_each_text(f.column_b) as e(ky,val)
                     WHERE e.val = '[email protected]'
                    ) as in_column_b,
             . . .
      FROM forms f
     ) 
WHERE in_column_a OR in_column_b OR . . .

This is not quite as efficient, because it does not short-circuit the evaluation when it finds a match. On the other hand, it lists all columns that have a match.

Sign up to request clarification or add additional context in comments.

2 Comments

Yes, good point. I simplified the original which had a SELECT roletype='blah', * at the start of each sub-query - so that the source column can be differentiated. But the result seet should be quite small, so I should be iterate over them in the requesting code (NodeJS).
Nice. But I think I prefer your option 1 and do the munging post fetch. Thanks for the insight though :-)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.