
I have a table whose structure looks like the following:

k | i | p | v

Notice that the key (k) is not unique and there are no constraints on the table at all. Each key can have multiple attributes (i = 0, 1, 2, ...), which can be of different types (p) and have different values (v). The same attribute type may also appear more than once for a given key (p(i-1) = p(i)).

What I want to do is pick certain attribute types and their corresponding values and place them in the same row. For example I want to have:

k | attr_name1 | attr_name2

I have managed to write a query that does this, and it works for all keys (k) for which attr_name1 and attr_name2 appear in column p of the initial table:

SELECT DISTINCT ON (key) fn.k AS key, fn.v AS attr_name1, a.v AS attr_name2 
FROM Table fn 
  LEFT JOIN Table a ON fn.k = a.k 
              AND a.p = 'attr_name2' 
WHERE fn.p = 'attr_name1'

I would like, however, to also handle the case where a certain key has no attribute named attr_name1 and insert a NULL value into the corresponding column of the new table. I am not sure how to achieve that. I have no issue using multiple queries or intermediate tables, etc., but there are quite a lot of rows in the table and I need something that scales to millions of rows.

Any help would be appreciated.

Example:

k i p v
1 0 a 10
1 1 b 12
1 2 c 34
1 3 d 44
1 4 e 09
2 0 a 11
2 1 b 13
2 2 d 22
2 3 f 34

This would turn into (assuming I am only interested in attributes a, b and c):

k a  b  c
1 10 12 34
2 11 13 NULL
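
For anyone who wants to reproduce this, a minimal setup would look something like the following (the table name t is just a placeholder, and I am assuming the values can be stored as text):

CREATE TABLE t (k integer, i integer, p text, v text);

INSERT INTO t (k, i, p, v) VALUES
  (1, 0, 'a', '10'),
  (1, 1, 'b', '12'),
  (1, 2, 'c', '34'),
  (1, 3, 'd', '44'),
  (1, 4, 'e', '09'),
  (2, 0, 'a', '11'),
  (2, 1, 'b', '13'),
  (2, 2, 'd', '22'),
  (2, 3, 'f', '34');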
  • Can you please post sample data and the expected result? Commented Oct 17, 2021 at 14:49
  • @Stefanov.sm I added an example for a couple of rows. Commented Oct 17, 2021 at 14:56
  • In SQL the resulting column names must be set in stone at the beginning, and cannot be produced on the fly by the query. Apart from that, the rest of your query should be pretty easy to write. If you really need variable column names, you can use Dynamic SQL as step #1, to inspect the data and determine the column names, and then as step #2 the Dynamic SQL could assemble a tailored query for your specific needs. Commented Oct 17, 2021 at 15:01
  • @TheImpaler I know the column names, but I still cannot figure out how to write the query. Notice that there are other columns as well that I do not want to include in the final table. I updated the example. Commented Oct 17, 2021 at 16:29

1 Answer


I would use conditional aggregation. That is, an aggregate function around a CASE expression.

SELECT
  k,
  MAX(CASE WHEN p='a' THEN v END)   AS a,
  MAX(CASE WHEN p='b' THEN v END)   AS b,
  MAX(CASE WHEN p='c' THEN v END)   AS c
FROM
  your_table
GROUP BY
  k

This presumes that (k, p) is unique. If there are duplicates, this will simply pick the one v with the highest value for each (k, p).
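
If you are on PostgreSQL (the DISTINCT ON in your own query suggests you are), the same conditional aggregation can also be written with the FILTER clause, which some find easier to read. This is just an equivalent sketch; your_table and the attribute names are placeholders taken from the example:

SELECT
  k,
  MAX(v) FILTER (WHERE p = 'a')   AS a,
  MAX(v) FILTER (WHERE p = 'b')   AS b,
  MAX(v) FILTER (WHERE p = 'c')   AS c
FROM
  your_table
GROUP BY
  k

Either form scans the table once and groups by k, so it should scale reasonably to millions of rows in a single pass over the data.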

As a general rule this kind of pivoting makes the data harder to process in SQL. This is often done for display purposes because humans find this easier to read. However, from a software engineering perspective, such formatting should not be done in the data layer; be careful that by doing this you don't actually make your future life harder.


2 Comments

I am not sure I understand what you mean by 'this kind of pivoting makes the data harder to process in SQL'. The format of the initial table? What I am trying to do is get the data into that table so that I can then insert them into a new schema.
@Desperados - SQL is designed for expressing operations on normalised data. The starting format is normalised, the target format is partially de-normalised. As you say that you're inserting these results into another schema, that means the new schema is partially de-normalised. Operations on that new schema will tend to be harder as a result. A trivialised example: the averages of a, b & c? SELECT p, AVG(v) FROM old GROUP BY p vs SELECT AVG(a), AVG(b), AVG(c) FROM new. The more complex the need, the heavier the overhead/repetition will be, due to that partial de-normalisation.
