I am using PostgreSQL for its array functionality. Here's my schema:
CREATE TABLE questions (
id INTEGER PRIMARY KEY,
product_id INTEGER UNIQUE NOT NULL,
body VARCHAR(1000) NOT NULL,
date_written DATE NOT NULL DEFAULT current_date,
asker_name VARCHAR(60) NOT NULL,
asker_email VARCHAR(60) NOT NULL,
reported BOOLEAN DEFAULT FALSE,
helpful INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE answers (
id PRIMARY KEY NOT NULL,
question_id INTEGER NOT NULL,
body VARCHAR(1000) NOT NULL,
date_written DATE NOT NULL DEFAULT current_date,
answerer_name VARCHAR(60) NOT NULL,
answerer_email VARCHAR(60) NOT NULL,
reported BOOLEAN DEFAULT FALSE,
helpful INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE photos (
id INTEGER UNIQUE,
answer_id INTEGER NOT NULL,
photo VARCHAR(200)
);
I am trying to query my answers table to get a list of all the answers for a given question id and include an array of all photos that exist for that given answer_id. The results should be sorted in descending order of helpfulness. So far, I have a massive query that displays the results I'm looking for, but the execution time is 729.595 ms. I am trying to optimize to get the query's time down to 200 ms. I have the following indexes to try and optimize my query times:
indexname | indexdef
-----------------+---------------------------------------------------------------------------
answer_id | CREATE UNIQUE INDEX answer_id ON public.answers USING btree (id)
question_id | CREATE INDEX question_id ON public.answers USING btree (question_id)
idx_reported_id | CREATE INDEX idx_reported_id ON public.answers USING btree (reported, id)
answers_pkey | CREATE UNIQUE INDEX answers_pkey ON public.answers USING btree (id)
indexname | indexdef
----------------+----------------------------------------------------------------------------
id | CREATE UNIQUE INDEX id ON public.questions USING btree (id)
idx_q_reported | CREATE INDEX idx_q_reported ON public.questions USING btree (id, reported)
questions_pkey | CREATE UNIQUE INDEX questions_pkey ON public.questions USING btree (id)
indexname | indexdef
---------------+---------------------------------------------------------------------
photos_id_key | CREATE UNIQUE INDEX photos_id_key ON public.photos USING btree (id)
p_links | CREATE INDEX p_links ON public.photos USING btree (photo)
In my analysis, I noticed that the GroupAggregate is time-consuming: GroupAggregate (cost=126222.21..126222.71 rows=25 width=129) (actual time=729.497..729.506 rows=5 loops=1) Group Key: answers.id
Is there a way I can avoid the time-consuming GROUP BY? Am I missing something with the indexes? Here is the query itself:
SELECT answers.id,
question_id,
body,
date_written,
answerer_name,
answerer_email,
reported,
helpful,
ARRAY_AGG(photo) as photos
FROM answers
LEFT JOIN photos ON answers.id = photos.answer_id
WHERE reported IS
false AND answers.id IN (SELECT id
FROM answers
WHERE question_id = 20012)
GROUP BY answers.id
ORDER BY helpful DESC;
Thanks!