In my database I have the following schema:

CREATE TABLE survey_results (
    id integer NOT NULL
);

CREATE TABLE slide_results (
    id integer NOT NULL,
    survey_result_id integer,
    tags character varying[] DEFAULT '{}'::character varying[],
    content character varying,
    created_at timestamp with time zone NOT NULL
);

INSERT INTO survey_results (id)
  VALUES (1);

INSERT INTO slide_results (id, survey_result_id, tags, content, created_at)
  VALUES (1, 1, '{food}', 'Food slide', now());

INSERT INTO slide_results (id, survey_result_id, tags, content, created_at)
  VALUES (2, 1, '{motivation}', 'Motivation slide', now());

Now I want an SQL query that returns the survey result id and the content of the slide results that have the specified tags. I wrote something like this:

select distinct on(sr.id)
  sr.id,
  slr.content AS food,
  slr2.content AS motivation
  from survey_results sr

  LEFT JOIN slide_results slr ON slr.survey_result_id = sr.id AND slr.id IN (
    SELECT id as id
    FROM slide_results
    WHERE 'food' = ANY(tags)
    ORDER BY created_at desc
  )

  LEFT JOIN slide_results slr2 ON slr2.survey_result_id = sr.id AND slr2.id IN (
    SELECT id as id
    FROM slide_results
    WHERE 'motivation' = ANY(tags)
    ORDER BY created_at desc
  )
  group by slr.content, slr2.content, sr.id

which returns:

| id  | food       | motivation       |
| --- | ---------- | ---------------- |
| 1   | Food slide | Motivation slide |

This query works fine, but I'm wondering if there is a better way of doing this?

EDIT:

I forgot to add a link to the db-fiddle:

https://www.db-fiddle.com/f/gP761psywgmovfdTT7DjP4/0

2 Answers

I would write the query like this:

SELECT DISTINCT ON (sr.id)
       sr.id,
       slr.content AS food,
       slr2.content AS motivation
FROM survey_results AS sr
   LEFT JOIN (SELECT survey_result_id, content, created_at
              FROM slide_results
              WHERE '{food}' <@ tags) AS slr
      ON slr.survey_result_id = sr.id
   LEFT JOIN (SELECT survey_result_id, content, created_at
              FROM slide_results
              WHERE '{motivation}' <@ tags) AS slr2
      ON slr2.survey_result_id = sr.id
ORDER BY sr.id, slr.created_at DESC, slr2.created_at DESC;

The ORDER BY has to be in the outer query to be effective.

Using <@ rather than =ANY allows you to use a GIN index on slide_results.tags.
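
For the index to be usable it has to exist first. A minimal sketch, assuming the schema from the question; the index name is made up for illustration, and on a table as small as the sample data the planner will most likely still prefer a sequential scan:

```sql
-- Hypothetical index name; any name works.
-- The default GIN operator class for arrays supports <@, @> and &&.
CREATE INDEX slide_results_tags_gin_idx
    ON slide_results USING gin (tags);
```

Running EXPLAIN on the query against a realistically sized table shows whether the index is actually chosen.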

Using a subselect in the FROM list avoids an unnecessary join and an inefficient IN subquery.

2 Comments

Thanks, your improvements look very good. The problem is that with this technique I will need to add around 10 LEFT JOINs, because I have 10 domains to search...
With this data model and this requirement, that's probably the best you will get.

I can't promise this is any better than what you have, but it seems slightly more scalable. Without seeing your full dataset and desired results, it's hard to know if this will backfire in any way:

select
  sl.survey_result_id,
  array_to_string (array_agg (distinct sl.content) filter
      (where 'food' = any (sl.tags)), ',') as food,
  array_to_string (array_agg (distinct sl.content) filter
      (where 'motivation' = any (sl.tags)), ',') as motivation
from
  survey_results s
  join slide_results sl on s.id = sl.survey_result_id
group by survey_result_id
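
With this shape, each extra tag costs one more aggregate expression instead of one more join. A sketch of how a third tag might be added, assuming the same schema; the tag name 'sleep' is invented for illustration and does not appear in the question's data:

```sql
select
  sl.survey_result_id,
  array_to_string (array_agg (distinct sl.content) filter
      (where 'food' = any (sl.tags)), ',') as food,
  array_to_string (array_agg (distinct sl.content) filter
      (where 'motivation' = any (sl.tags)), ',') as motivation,
  -- hypothetical third tag: one more expression, no additional join
  array_to_string (array_agg (distinct sl.content) filter
      (where 'sleep' = any (sl.tags)), ',') as sleep
from
  survey_results s
  join slide_results sl on s.id = sl.survey_result_id
group by sl.survey_result_id;
```

Note that this concatenates every matching slide for a tag rather than picking the most recent one, and the inner join drops survey results that have no slides at all, so the result can differ from the DISTINCT ON queries above.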
