1

I have a simple table of the form:

id gender a_feature (bool) b_feature (bool) ... xyz_feature (bool)

and I want to sum over all feature columns dependent on gender.

metric male female
a_feature 345 3423
b_feature 65 143
... ... ...
xyz_feature 133 5536

Is there a simple way to do this, e.g. using the information_schema.

I found only the solution below, but this is very ugly:

select
       'a_feature' as feature_name,
       count(case a_feature and gender = 'male') as male,
       count(case a_feature and gender = 'female') as female
from table
union
select
       b_feature as feature_name,
       count(case b_feature and gender = 'male') as male,
       count(case b_feature and gender = 'female') as female
from table
.
.
.
select
       xyz_feature as feature_name,
       count(case xyz_feature and gender = 'male') as male,
       count(case xyz_feature and gender = 'female') as female
from table
1
  • Tag your question with the database you are using. The code you have is not syntactically valid either. Commented Jun 23, 2021 at 12:29

2 Answers 2

2

You can unpivot and aggregate. One method is:

select name,
       sum(case when feature and gender = 'male' then 1 else 0 end) as num_male,
       sum(case when feature and gender = 'female' then 1 else 0 end) as num_female
from ((select 'a_feature' as name, a_feature as feature, gender
       from t
      ) union all
      (select 'b_feature' as name, b_feature, gender
       from t
      ) union all
      . . .
     ) f
group by name;

In Postgres, you would unpivot using a lateral join:

select name,
       sum(case when feature and gender = 'male' then 1 else 0 end) as num_male,
       sum(case when feature and gender = 'female' then 1 else 0 end) as num_female
from t cross join lateral
     (values ('a_feature', a_feature),
             ('b_feature', b_feature),
             . . .
     ) v(name, feature)
group by name;

You can generate the list for values() using information_schema.columns if you are reluctant to type it all in.

EDIT:

You can construct the values clause using something like this:

select string_agg('(''' || column_name || ''', column_name)', ', ') 
from information_schema.columns
where table_name = ?
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks Gordon, this is what i was looking for!
Dear Gordon, can you please draft an example how to use the information_schema.columns, to get a dynamic query?
0

When you use this in Postgres, what do you mean by t (I think t is for table) in "from t cross join lateral" and what do you mean by v in "v(name, feature)" ?

select name,

           sum(case when feature and gender = 'male' then 1 else 0 end) as num_male,
           sum(case when feature and gender = 'female' then 1 else 0 end) as num_female
    from t cross join lateral
         (values ('a_feature', a_feature),
                 ('b_feature', b_feature),
                 . . .
         ) v(name, feature)
    group by name;

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.