
I have a table whose structure looks like the following:

k | i | p | v

Notice that the key (k) is not unique and there are no constraints on the table at all. Each key can have multiple attributes (i = 0, 1, 2, ...), which can be of different types (p) and have different values (v). The same attribute type may also appear more than once for a given key (p(i-1) = p(i)).

What I want to do is pick certain attribute types and their corresponding values and place them in the same row. For example I want to have:

k | attr_name1 | attr_name2

I have managed to write a query that does this, and it works for all keys (k) for which attr_name1 and attr_name2 appear in column p of the initial table:

SELECT DISTINCT ON (key) fn.k AS key, fn.v AS attr_name1, a.v AS attr_name2 
FROM Table fn 
  LEFT JOIN Table a ON fn.k = a.k 
              AND a.p = 'attr_name2' 
WHERE fn.p = 'attr_name1'

I would like, however, to also handle the case where a certain key has no attribute named attr_name1 and insert a NULL value into the corresponding column of the new table. I am not sure how to achieve that. I have no issue using multiple queries or intermediate tables, etc., but there are quite a lot of rows in the table and I need something that scales to millions of rows.

Any help would be appreciated.

Example:

k i p v
1 0 a 10
1 1 b 12
1 2 c 34
1 3 d 44
1 4 e 09
2 0 a 11
2 1 b 13
2 2 d 22
2 3 f 34

This would turn into (assuming I am only interested in attributes a, b and c):

k a  b  c
1 10 12 34
2 11 13 NULL
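
For anyone who wants to reproduce this, a minimal setup would look something like the following (the table name t is just a placeholder, and I am assuming the values can be stored as text):

CREATE TABLE t (k integer, i integer, p text, v text);

INSERT INTO t (k, i, p, v) VALUES
  (1, 0, 'a', '10'),
  (1, 1, 'b', '12'),
  (1, 2, 'c', '34'),
  (1, 3, 'd', '44'),
  (1, 4, 'e', '09'),
  (2, 0, 'a', '11'),
  (2, 1, 'b', '13'),
  (2, 2, 'd', '22'),
  (2, 3, 'f', '34');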
  • Can you please post sample data and the expected result? Commented Oct 17, 2021 at 14:49
  • @Stefanov.sm I added an example for a couple of rows. Commented Oct 17, 2021 at 14:56
  • In SQL the resulting column names must be set in stone at the beginning, and cannot be produced on the fly by the query. Apart from that, the rest of your query should be pretty easy to write. If you really need variable column names, you can use Dynamic SQL as step #1, to inspect the data and determine the column names, and then as step #2 the Dynamic SQL could assemble a tailored query for your specific needs. Commented Oct 17, 2021 at 15:01
  • @TheImpaler I know the column names, but I still cannot figure out how to write the query. Notice that there are other columns as well that I do not want to include in the final table. I updated the example. Commented Oct 17, 2021 at 16:29

1 Answer


I would use conditional aggregation. That is, an aggregate function around a CASE expression.

SELECT
  k,
  MAX(CASE WHEN p='a' THEN v END)   AS a,
  MAX(CASE WHEN p='b' THEN v END)   AS b,
  MAX(CASE WHEN p='c' THEN v END)   AS c
FROM
  your_table
GROUP BY
  k

This presumes that (k, p) is unique. If there are duplicates, this will simply pick the one v with the highest value for each (k, p).
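
If you are on PostgreSQL (the DISTINCT ON in your own query suggests you are), the same conditional aggregation can also be written with the FILTER clause, which some find easier to read. This is just an equivalent sketch; your_table and the attribute names are placeholders taken from the example:

SELECT
  k,
  MAX(v) FILTER (WHERE p = 'a')   AS a,
  MAX(v) FILTER (WHERE p = 'b')   AS b,
  MAX(v) FILTER (WHERE p = 'c')   AS c
FROM
  your_table
GROUP BY
  k

Either form scans the table once and groups by k, so it should scale reasonably to millions of rows in a single pass over the data.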

As a general rule this kind of pivoting makes the data harder to process in SQL. This is often done for display purposes because humans find this easier to read. However, from a software engineering perspective, such formatting should not be done in the data layer; be careful that by doing this you don't actually make your future life harder.


2 Comments

I am not sure I understand what you mean by 'this kind of pivoting makes the data harder to process in SQL'. The format of the initial table? What I am trying to do is get the data into that table so that I can then insert them into a new schema.
@Desperados - SQL is designed for expressing operations on normalised data. The starting format is normalised, the target format is partially de-normalised. As you say that you're inserting these results into another schema, that means the new schema is partially de-normalised. Operations on that new schema will tend to be harder as a result. A trivialised example: the averages of a, b & c? SELECT p, AVG(v) FROM old GROUP BY p vs SELECT AVG(a), AVG(b), AVG(c) FROM new. The more complex the need, the heavier the overhead/repetition will be, due to that partial de-normalisation.
