57

I have a column of type integer array. How can I merge the arrays from all rows into a single integer array?

For example: If I execute query:

select column_name from table_name

I get result set as:

-[RECORD 1]----------
column_name | {1,2,3}
-[RECORD 2]----------
column_name | {4,5}

How can I get {1,2,3,4,5} as final result?

7 Answers

76

You could use unnest to open up the arrays and then array_agg to put them back together:

select array_agg(c)
from (
  select unnest(column_name)
  from table_name
) as dt(c);
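As the comments note, this doesn't guarantee element order. On PostgreSQL 9.4 and later, a sketch using WITH ORDINALITY would be (the `id` column used to order the rows is an assumption; substitute your own ordering key):

```sql
select array_agg(u.elem order by t.id, u.ord)
from table_name t
cross join lateral unnest(t.column_name) with ordinality as u(elem, ord);
```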

5 Comments

Likely much more efficient than mine, but won't necessarily retain element order; you'd have to use with ordinality for that.
@Craig: Which version of PostgreSQL has WITH ORDINALITY? Anyway, custom aggregates are kinda' cool
Added in PostgreSQL 9.4, so "coming soon". I'm too used to working with git master...
Also works well to get the unique elements of all the arrays, by simply adding a distinct before the unnest.
25

Define a trivial custom aggregate:

CREATE AGGREGATE array_cat_agg(anyarray) (
  SFUNC=array_cat,
  STYPE=anyarray
);

and use it:

WITH v(a) AS ( VALUES (ARRAY[1,2,3]), (ARRAY[4,5,6,7]))
SELECT array_cat_agg(a) FROM v;

If you want a particular order, put it within the aggregate call, i.e. array_cat_agg(a ORDER BY ...)

This is O(n²) for n rows (I originally guessed O(n log n); see the comments for the working), so it is unsuitable for long sets of rows. For better performance you'd need to write it in C, where you can use the more efficient (but horrible to use) C API for PostgreSQL arrays to avoid re-copying the array on each iteration.

6 Comments

FWIW this aggregate runs in quadratic time O(n^2) and so isn't suitable for large datasets. source: I used it on some large datasets in production and had to rip it out =)
@JohnBledsoe I'm surprised it's O(n^2), are you sure? It copies the whole array once per iteration, including all prior members, when it constructs a new one. Either way, it'll still be plenty slow for long inputs.
I've been out of CS school for a long time, so I'm not sure =) but yeah copying an N-length array N times is O(n^2) IIRC.
@JohnBledsoe The array starts at length 1. Each time you copy it, it grows by 1 element. Assuming each input array is of the same length (treated as 1 element for this purpose): 1 elements copied @ n=1 . 3 elements copied @ n=2 . 6 elements copied @ n=3. 10 elements copied @ n=4. It's a series sum n∑n . Which is (n·n)/2 or n²/2 .. so O(n^2). You're quite right. Pg doesn't have mutable arrays at the SQL level so you'd need to use a PL (say, Python with numpy or intarray) or use C to do it more efficiently.
Not sure about the mathematics here, but from my experience it was very slow too. Took forever (I gave up after 30 seconds) on a 300K rows table, while mu-is-too-short's solution took 215ms.
4

To merge arrays you can use the || operator.

To put the result into a flat list, use the unnest function.

example:

select unnest(ARRAY[1,2] || ARRAY[3,2] || ARRAY[4,5]) as number;

result:

 number
--------
      1
      2
      3
      2
      4
      5

Comments

3
string_to_array(string_agg(array_to_string(column_name ,','),','),',')

This may give you a clue for your situation; I've done it like this.
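Spelled out against the question's table, the idea is roughly this (a sketch: string_to_array returns text[], hence the cast, and the round-trip through text assumes no NULL elements):

```sql
select string_to_array(
         string_agg(array_to_string(column_name, ','), ','),
         ','
       )::int[]
from table_name;
```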

1 Comment

Please provide additional details in your answer. As it's currently written, it's hard to understand your solution.
2

You could use lateral subquery for that:

select array_agg(u.a)
from (values (array[1, 2, 3]), (array[4, 5])) t (a)
    join lateral unnest(t.a) u (a) on true;
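Applied to the question's table rather than the inline VALUES list, the same shape would be (table and column names taken from the question):

```sql
select array_agg(u.a)
from table_name t
    join lateral unnest(t.column_name) u (a) on true;
```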

Comments

0

To enrich Craig Ringer's answer: his CREATE AGGREGATE solution is good but reportedly slow. That's because it only has an SFUNC defined, which means it can only fold inputs into the accumulated array one row at a time.

This can be improved by defining the aggregate as a partial aggregate. In this mode it can collect sub-arrays, possibly in parallel, and then merge those sub-arrays into larger ones. This is done by adding a COMBINEFUNC definition, which in this case is also just array_cat. I also mark the aggregate as PARALLEL SAFE.

CREATE AGGREGATE public.array_agg_flat(anycompatiblearray) (
    SFUNC = array_cat,
    COMBINEFUNC = array_cat,
    STYPE = anycompatiblearray,
    PARALLEL = SAFE
);

Under a parallel plan, this can substantially reduce the number of array (re)allocations occurring behind the scenes.
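Usage looks the same as any other aggregate (table and column names taken from the question); whether a parallel plan is actually used is up to the planner:

```sql
SELECT public.array_agg_flat(column_name) FROM table_name;
```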

Comments

-2

One way you can do this is inside a function:

CREATE FUNCTION merge_arrays() RETURNS int[] AS $$
DECLARE
  this record;
  res  int[];
BEGIN
  FOR this IN
    SELECT column_name FROM table_name
  LOOP
    -- a bare function call is not a valid PL/pgSQL statement; assign the result
    res := array_cat(res, this.column_name);
  END LOOP;
  RETURN res;
END; $$ LANGUAGE plpgsql;

Then you can

SELECT merge_arrays();

to get the result you are looking for.

This of course hard-codes your table definition into the function, which may (or may not) be an issue. In addition, you may want to put a WHERE clause in the loop query to restrict the records whose arrays you want to append; you might use an additional function parameter to do this.
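A sketch of that parameterized variant (the filter column some_column and its type are assumptions for illustration):

```sql
CREATE FUNCTION merge_arrays(p_filter int) RETURNS int[] AS $$
DECLARE
  this record;
  res  int[];
BEGIN
  FOR this IN
    SELECT column_name FROM table_name
    WHERE some_column = p_filter  -- hypothetical filter column
  LOOP
    res := array_cat(res, this.column_name);
  END LOOP;
  RETURN res;
END; $$ LANGUAGE plpgsql;
```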

Keep in mind that you might get a really large array as your table increases in size and that may affect performance. Do you really need all sub-arrays from all records in one large array? Have a look at your application and see if you can do the merge at that level, rather than in a single query.

Comments
