1

Given the following starting data:

CREATE TABLE t1 AS
  SELECT generate_series(1, 20) AS id,
    (SELECT array_agg(generate_series) FROM generate_series(1, 6)) as array_1;

CREATE TABLE t2 AS
  SELECT generate_series(5, 10) AS id,
    (SELECT array_agg(generate_series) FROM generate_series(7, 10)) as array_2;

CREATE TABLE t3 AS
  SELECT generate_series(8, 15) AS id,
    (SELECT array_agg(generate_series) FROM generate_series(11, 15)) as array_3;

I would like to do an outer join between several tables, each with a fixed-length array column that is uniform within a given table, but may differ from table to table (as in the examples above), concatenating the array columns in each table into one large array column. I was wondering if there is an efficient or straightforward way to maintain consistent indexing in the new combined column, replacing NULL column values (caused by the outer join) with an array of NULL values so that the final array column will have a uniform length. Unlike in the above example, in my actual use case, I won't know the length of each table's array column a priori, only that it will be of a uniform length throughout that table. In other words, instead of this query:

SELECT id, (array_1 || array_2 || array_3 ) AS combined_array FROM
t1 LEFT OUTER JOIN t2 USING(id) LEFT OUTER JOIN t3 USING (id);

Which produces:

id |            combined_array
----+---------------------------------------
 1 | {1,2,3,4,5,6}
 2 | {1,2,3,4,5,6}
 3 | {1,2,3,4,5,6}
 4 | {1,2,3,4,5,6}
 5 | {1,2,3,4,5,6,7,8,9,10}
 6 | {1,2,3,4,5,6,7,8,9,10}
 7 | {1,2,3,4,5,6,7,8,9,10}
 8 | {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
 9 | {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
10 | {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
11 | {1,2,3,4,5,6,11,12,13,14,15}
12 | {1,2,3,4,5,6,11,12,13,14,15}
13 | {1,2,3,4,5,6,11,12,13,14,15}
14 | {1,2,3,4,5,6,11,12,13,14,15}
15 | {1,2,3,4,5,6,11,12,13,14,15}
16 | {1,2,3,4,5,6}
17 | {1,2,3,4,5,6}
18 | {1,2,3,4,5,6}
19 | {1,2,3,4,5,6}
20 | {1,2,3,4,5,6}
(20 rows)

I would like the result to look like:

id |            combined_array
----+---------------------------------------
 1 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL}
 2 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL}
 3 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL}
 4 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL}
 5 | {1,2,3,4,5,6,7,8,9,10,NULL,NULL,NULL,NULL,NULL}
 6 | {1,2,3,4,5,6,7,8,9,10,NULL,NULL,NULL,NULL,NULL}
 7 | {1,2,3,4,5,6,7,8,9,10,NULL,NULL,NULL,NULL,NULL}
 8 | {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
 9 | {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
10 | {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
11 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,11,12,13,14,15}
12 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,11,12,13,14,15}
13 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,11,12,13,14,15}
14 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,11,12,13,14,15}
15 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,11,12,13,14,15}
16 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL}
17 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL}
18 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL}
19 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL}
20 | {1,2,3,4,5,6,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL}
(20 rows)

So that each row contains an array of length 15.

2
  • n-length? Can each row have a different length? Do you take biggest value then? If so what to do with shorter rows? Commented Jun 23, 2016 at 17:49
  • Sorry, my description is somewhat confusing. Each table will have an array column with a uniform length throughout that table, but I don't necessarily know the length of the array for a particular table a priori if that makes sense. So table one may have an array column of length 10 and table 2 might have an array column of length 3. In my use case, I can't hard code these values. Commented Jun 23, 2016 at 18:09

1 Answer 1

1

To answer my own question, here is the query that I came up with that appears to do the job. It doesn't seem particularly elegant or efficient to me so definitely still open to other answers.

SELECT id, (
  coalesce(array_1, array_fill(NULL::INT,
    ARRAY[(SELECT max(array_length(array_1, 1)) FROM t1)])) ||
  coalesce(array_2, array_fill(NULL::INT,
    ARRAY[(SELECT max(array_length(array_2, 1)) FROM t2)])) ||
  coalesce(array_3, array_fill(NULL::INT,
    ARRAY[(SELECT max(array_length(array_3, 1)) FROM t3)]))
) AS combined_array FROM
t1 LEFT OUTER JOIN t2 USING(id) LEFT OUTER JOIN t3 USING (id);
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.