1

This is how the data is formatted:

item_name | item_serial | sub_group | conc_stuff | other_data | more_data
----------+-------------+-----------+------------+------------+-----------
foo        bar-01-a      widget      b-3          towel        dent
foo        bar-02-a      widget      a-1          42           mouse
foo        bar-03-a      widget      p-1          babel        dolphin
foo3       bar-21-f      widget      f-1          42           marvin
foo3       bar-22-f      widget      x-1          poetry       vogon

I have gotten the query to perform the way I want it to, the problem is I need to return more data.

SELECT item_name, 
       array_to_string(array_agg(conc_stuff), ',') as stuff 
FROM dataset 
WHERE some_selector = 'X' 
GROUP BY item_name 
ORDER BY item_name;

I have tried what seems simple yet logical to me:

SELECT item_name, 
       item_serial, 
       sub_group, 
       array_to_string(array_agg(conc_stuff), ',') as stuff 
FROM dataset 
WHERE some_selector = 'X' 
GROUP BY item_name 
ORDER BY item_name;

I need to return something that looks like this:

item_name | item_serial | sub_group |   stuff
----------+-------------+-----------+-------------
foo        bar-01-a      widget      a-1,b-3,p-1
foo3       bar-21-f      widget      f-1,x-1,g-5
foo6       bar-81-z      widget      r-1,d-8,w-0

instead of just this:

item_name |   stuff
----------+--------------
foo        a-1,b-3,p-1
foo3       f-1,x-1,g-5
foo6       r-1,d-8,w-0

When I try to add additional fields to the query, I get :

ERROR: column "stuff.item_serial" must appear in the GROUP BY clause or be used in an aggregate function

But I dont want to GROUP BY item_serial, I just want it to be returned with the aggregate, right?

Do I need to run a subquery? Im sure this is simple. The if there are multiple methods, which is most efficient? Some of the text I will be concatenating are coordinates (LARGE string of text).

7
  • 1
    Is item_serial a unique column? If there is more than one item_serial for a given item_name value, which one should be displayed? Commented Sep 11, 2012 at 19:13
  • none of the columns are, all basic. there can be multiple item_serials that belong to different sub_group Commented Sep 11, 2012 at 19:14
  • So which item_serial (or sub_group) do you want to take for a specific item_name? Commented Sep 11, 2012 at 19:36
  • very sorry I just noticed that error... editing Commented Sep 11, 2012 at 19:57
  • 1
    But the accepted answer will not return your example output. Commented Sep 11, 2012 at 20:20

2 Answers 2

2

You need to pick one value for each name you have. You can't have each name only returned once but the item_serial value multiple times.

Picking one value out of those that are there for a group value is done through aggregate functions:

SELECT item_name,
       min(item_serial),
       min(sub_group),
       array_to_string(array_agg(conc_stuff),',') AS stuff
FROM dataset
WHERE sub_group = 'widget'
GROUP BY item_name
ORDER BY item_name;

This picks th4e "first" item_serial and sub_group for each name.

If you want the last value, use max instead.

But the important thing to understand is that you have to pick one value for the non-grouped columns. And you need to tell the DBMS exactly which one it should use by supplying an aggregate function which picks one value.

SQLFiddle example: http://www.sqlfiddle.com/#!1/58009/1

Sign up to request clarification or add additional context in comments.

1 Comment

brilliant! this solves the secondary issue! I checked slightly too soon! though @greg did supply a correct solution for the question I asked. I apologize for the confusion. and +10 for introducing me to sqlfiddle!
1

Postgres 8.4 is somewhat stricter than more recent versions of Pg regarding this. When using a GROUP BY, all fields not containing an aggregate functions need to be grouped.

From Pg9.1, if you GROUP BY primary key, all other non grouped fields can be omitted.

Hope it helps.

Edit:

SELECT 
  item_name, 
  item_serial, 
  sub_group, 
  array_to_string(array_agg(conc_stuff), ',') as stuff 
FROM 
  dataset 
WHERE 
  some_selector = 'X' 
GROUP BY 
  item_name,
  item_serial,
  sub_group 
ORDER BY 
  item_name;

2 Comments

so I can select the others if I group by even though I dont need them grouped? Now I see the example it makes sense. Ill try this method, thanks!
your answer solves the issue I started with, though after posting I realized that wasnt my issue. it was a faulty line of logic on my part.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.