0

Hello postgres experts,

I have an app where users can vote on a poll. My schema looks like this:

polls table:

id name
1 Favorite fruit

options table:

id poll_id content
1 1 apple
2 1 orange
3 1 grape
4 1 banana

participants table:

id poll_id name
1 1 John
2 1 Jane

votes table:

id poll_id participant_id option_id type
1 1 1 1 yes
2 1 1 3 yes
3 1 2 2 yes

I made the poor choice of deciding to not create rows for "no" votes in the votes table thinking it would "save space". I realize now that it was not such a great idea because in the future I would like to know whether the user explicitly voted "no" or if perhaps the option was added after they voted and thus did not have the option to choose it. So I need to run a query that will fill all the missing "no" votes in the votes table for existing participants. The final result should look like this:

votes table:

id poll_id participant_id option_id type
1 1 1 1 yes
2 1 1 3 yes
3 1 2 2 yes
4 1 1 2 no
5 1 1 4 no
6 1 2 1 no
7 1 2 3 no
8 1 2 4 no

I have a dbfiddle with all the data already in it:

https://dbfiddle.uk/?rdbms=postgres_14&fiddle=7d0f4c83095638cc6006b1d7876d0e01

Side question: Should I be concerned about the size of the votes table in this schema? I expect it to quickly blow up to millions of rows. Is a schema where options are stored as an array in the polls table and votes stored in the participants table a better idea?

Thank you for your help.

2
  • What indices do you have on the tables? Especially UNIQUE ones Commented May 11, 2022 at 22:44
  • "Should I be concerned about the size of the votes table in this schema?" - this should probably be better asked as a separate question, maybe at Database Administrators. Regardless: in general, no, you should no be concerned about millions of rows. Storing arrays won't save that much space, and make querying much more complicated. Whether it's a "better idea" depends a lot on what are you doing with these tables and what exactly you are optimising for. Commented May 11, 2022 at 22:48

1 Answer 1

1

You seem to be looking for a JOIN of participants with options, EXCEPT the rows that already are in votes. There are various ways to do that, but most straightforward:

INSERT INTO votes(poll_id, participant_id, option_id, type)
SELECT poll_id, participant_id, option_id, 'no'
FROM (
  SELECT o.poll_id, p.id, o.id
  FROM options o
  JOIN participants p ON o.poll_id = p.poll_id
EXCEPT
  SELECT poll_id, participant_id, option_id
  FROM votes
) AS missing;

Alternatively:

INSERT INTO votes(poll_id, participant_id, option_id, type)
SELECT o.poll_id, p.id, o.id, 'no'
FROM options o
JOIN participants p ON o.poll_id = p.poll_id
WHERE NOT EXISTS (
  SELECT *
  FROM votes
  WHERE poll_id = o.poll_id AND participant_id = p.id AND option_id = o.id
);

Or, assuming you already have UNIQUE index on votes, just

INSERT INTO votes(poll_id, participant_id, option_id, type)
SELECT o.poll_id, p.id, o.id, 'no'
FROM options o
ON CONFLICT ON CONSTRAINT votes_p_key
DO NOTHING;
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks so much! I think you summed up what I was trying to do really well. I managed to get all 3 working with some minor adjustments ("no -> 'no', adding column aliases).

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.