Postgres Return 0 for Missing Rows

Question

I have 3 tables that I need for reporting:

*date*
date_sk | full_date
1       | 2013-01-01
2       | 2013-02-01
3       | 2013-03-01

*person*
person_sk | person_id | person_name
1         | 10        | John
2         | 11        | Bob
3         | 12        | Jill

*person_portfolio*
person_portfolio_sk | date_sk | person_sk | res_value | report_month
1                   | 1       | 1         | 15        | 2013-01-01
2                   | 1       | 2         | 10        | 2013-01-01
3                   | 1       | 3         | 1         | 2013-01-01
4                   | 2       | 1         | 30        | 2013-02-01

(imagine the 'date' table filled with every date for the past 10 and next 10 years)

For comparison reporting over a date range, I have been struggling to work out how to return a row with a value of 0 for every person and month that has no entry in that range. Here is the query I have tried:

SELECT
 p.person_id,
 COALESCE(pp.res_value,0)::NUMERIC(16,2) AS res_value,
 pp.report_month
FROM person p
LEFT JOIN person_portfolio pp
ON p.person_sk = pp.person_sk
LEFT JOIN date d
ON d.date_sk = pp.date_sk
WHERE person_id IN ('10','11','12')
AND pp.report_month >= '2013-01-01' --From Date
AND pp.report_month <= '2013-05-01' -- To Date
AND d.day_number_of_month = 1
ORDER BY p.person_id DESC;

The output I want to return would end up being 15 rows total. 3 people x 5 months of data = 15 total rows. I left out the day_number_of_month column in the date table but it holds the number 1 for the first of each month, 2 for the second, etc (every day of every month is in this table). It should look like this:

person_id   | res_value | report_month
10          |   15      |   2013-01-01
10          |   30      |   2013-02-01
10          |   0       |   2013-03-01
10          |   0       |   2013-04-01
10          |   0       |   2013-05-01
11          |   10      |   2013-01-01
11          |   0       |   2013-02-01
11          |   0       |   2013-03-01
11          |   0       |   2013-04-01
11          |   0       |   2013-05-01
12          |   1       |   2013-01-01
12          |   0       |   2013-02-01
12          |   0       |   2013-03-01
12          |   0       |   2013-04-01
12          |   0       |   2013-05-01

but I am only getting these results:

person_id   | res_value | report_month
10          |   15      |  2013-01-01
10          |   30      |  2013-02-01
11          |   10      |  2013-01-01
12          |    1      |  2013-01-01

So basically... is there currently a feasible way that I could inject the 0 value rows into the results when there is no entry for the 'report_month' for a specific person(s)? I would appreciate any kind of help as I have been working on this for 2 weeks now trying to complete this report. Thanks!

Gordon Linoff · Accepted Answer · 2015-01-30 22:00:08Z

1

Your description of the output provides guidance on how to solve the problem. First generate the rows, using a cross join. Then bring in the rest of the data.

Given the structure of your query, I don't see the purpose of the date table. If I assume that there is at least one report for each reporting period, I can do:

SELECT p.person_id,
       COALESCE(pp.res_value,0)::NUMERIC(16,2) AS res_value,
       d.report_month
FROM (SELECT person_sk, person_id  -- person_sk is needed for the join below
      FROM person
      WHERE person_id IN ('10', '11', '12')
     ) p CROSS JOIN
     (SELECT DISTINCT pp.report_month
      FROM person_portfolio pp
      WHERE pp.report_month >= '2013-01-01' AND
            pp.report_month <= '2013-05-01'
     ) d LEFT JOIN
     person_portfolio pp
     ON p.person_sk = pp.person_sk AND
        d.report_month = pp.report_month
ORDER BY p.person_id DESC, d.report_month ASC;

However, that assumption does not hold for your data: the sample has no reports at all for March through May, so those months would never appear. Instead, you can generate the dates. I don't know whether generate_series() or the date table is the better choice in your environment. Either way, the fix is to replace the d subquery above with one that returns all the dates of interest.
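As a rough sketch of the generate_series() route, the months can be produced on the fly instead of being read from a table. This assumes report_month is a DATE that always holds the first day of the month. The aliases m, gs and month_start are illustrative names, not part of the original schema.

```sql
-- Sketch only: build one row per person per month, then attach any
-- matching portfolio row. Assumes report_month is a first-of-month DATE.
SELECT p.person_id,
       COALESCE(pp.res_value, 0)::NUMERIC(16,2) AS res_value,
       m.report_month
FROM person p
CROSS JOIN (
    -- one row per month in the requested range (both ends inclusive)
    SELECT gs.month_start::date AS report_month
    FROM generate_series('2013-01-01'::timestamp,
                         '2013-05-01'::timestamp,
                         INTERVAL '1 month') AS gs(month_start)
) m
LEFT JOIN person_portfolio pp
       ON pp.person_sk = p.person_sk
      AND pp.report_month = m.report_month
WHERE p.person_id IN ('10', '11', '12')
ORDER BY p.person_id, m.report_month;
```

The filter on person_id can stay in the WHERE clause because it applies to the person side of the join. Conditions on person_portfolio belong in the ON clause, otherwise the rows with no match are filtered out again.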

edited Jan 30, 2015 at 22:00

answered Jan 30, 2015 at 18:26


4 Comments

John B Over a year ago

The date table is in the query because it holds ALL dates (including individual days). I left that out to keep things a bit easier to understand, but I would eventually add 'AND d.day_number_of_month = 1' for this monthly reporting. Thanks a ton for the help. I will check it out!

John B Over a year ago

Gordon: I just applied your solution and it does give me all person_id with 0.00 but only for one month of the result set and not the entire range.

Gordon Linoff Over a year ago

Replace the d subquery with something like (SELECT dt.full_date AS report_month FROM date dt WHERE dt.day_number_of_month = 1), and keep your date-range filter on it.
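Folded into the answer's query, that replacement could look roughly like the sketch below. It assumes the column names given in the question (full_date and day_number_of_month on the date table) and that report_month always matches a first-of-month full_date. The alias dt is only illustrative.

```sql
-- Sketch only: the date table supplies the months, so a month shows up
-- even when nobody has a portfolio row for it.
SELECT p.person_id,
       COALESCE(pp.res_value, 0)::NUMERIC(16,2) AS res_value,
       d.report_month
FROM (SELECT person_sk, person_id
      FROM person
      WHERE person_id IN ('10', '11', '12')
     ) p
CROSS JOIN
     (SELECT dt.full_date AS report_month
      FROM date dt
      WHERE dt.day_number_of_month = 1        -- first of each month only
        AND dt.full_date >= '2013-01-01'      -- From Date
        AND dt.full_date <= '2013-05-01'      -- To Date
     ) d
LEFT JOIN person_portfolio pp
       ON pp.person_sk = p.person_sk
      AND pp.report_month = d.report_month
ORDER BY p.person_id, d.report_month;
```

With the sample data, the cross join yields 3 people x 5 months = 15 rows, and COALESCE turns the unmatched ones into 0.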

John B Over a year ago

Amazing. Thanks a ton. This is now working and returning exactly what I have been trying to return for the last couple weeks. I thought I was going to have to solve this at a PHP level but I am glad I didn't. Thanks Gordon!

Ditto · Answer · 2015-01-30 18:27:33Z

0

Look up "OUTER JOIN".

Untested, but you could try something like this: start with your date table, restrict it to the date range you want, then join it to your other tables. An OUTER JOIN says "even if you can't find a person with data on this date, keep the date; I want to see it".

SELECT
 p.person_id,
 COALESCE(pp.res_value,0)::NUMERIC(16,2) AS res_value,
 d.full_date AS report_month
FROM date d
   -- person has no date_sk, so pair every date with every person
   CROSS JOIN person p
   -- keep each date/person pair even when no portfolio row matches
   LEFT OUTER JOIN person_portfolio pp
   ON p.person_sk = pp.person_sk
   AND d.date_sk = pp.date_sk
WHERE p.person_id IN ('10','11','12')
AND d.full_date >= '2013-01-01' --From Date
AND d.full_date <= '2013-05-01' -- To Date
AND d.day_number_of_month = 1
ORDER BY p.person_id DESC;

answered Jan 30, 2015 at 18:27
