I used the pandas `groupby` method to group an example CSV by id and month, which gives a table like this:

| id | month | average tree growth (cm) |
|----|-------|-------------------------|
| 1 | 4 | 9 |
| 1 | 5 | 4 |
| 1 | 6 | 7 |
| 2 | 1 | 9 |
| 2 | 2 | 9 |
| 2 | 3 | 8 |
| 2 | 4 | 6 |

However, each id should have all 12 months, so I need to add the missing months for each id with the average tree growth set to a null value (NaN), like this (only the first rows are shown):

| id | month | average tree growth (cm) |
|----|-------|-------------------------|
| 1 | 1 | nan |
| 1 | 2 | nan |
| 1 | 3 | nan |
| 1 | 4 | 9 |
| 1 | 5 | 4 |
| 1 | 6 | 7 |
| 1 | 7 | nan |
| 1 | 8 | nan |
| 1 | 9 | nan |
| 1 | 10 | nan |
| 1 | 11 | nan |
| 1 | 12 | nan |
| 2 | 1 | 9 |

This is for plotting with Bokeh. How do I add the missing months for each id and set the average tree growth to NaN in Python? Is there an easier way than brute-force looping over every id and checking which months are missing? Any hints would be appreciated!
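
In case it helps frame the question, here is a rough sketch of the kind of non-looping approach I have in mind: build every (id, month) pair with `pd.MultiIndex.from_product` and reindex against it. It assumes the grouped result is a flat DataFrame with the three columns shown above and that each (id, month) pair appears at most once. The names `df`, `full_index`, and `filled` are made up for illustration, and I am not sure this is the idiomatic way to do it.

```python
import pandas as pd

# Example data copied from the first table in the question
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2, 2],
    "month": [4, 5, 6, 1, 2, 3, 4],
    "average tree growth (cm)": [9, 4, 7, 9, 9, 8, 6],
})

# Every (id, month) pair that should exist: each id crossed with months 1..12
full_index = pd.MultiIndex.from_product(
    [df["id"].unique(), range(1, 13)], names=["id", "month"]
)

# Reindexing adds the missing pairs; their growth values become NaN
filled = (
    df.set_index(["id", "month"])
      .reindex(full_index)
      .reset_index()
)
print(filled)
```

One side effect of this sketch is that the growth column becomes floating point, because NaN cannot be stored in an integer column.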