Current data:

  |ID | DT     | STATE | V|
  |1  | 201901 | PA    | 1|
  |1  | 201902 | PA    | 6|
  |2  | 201902 | PA    | 3|
  |1  | 201902 | CA    | 3|
  |2  | 201901 | CA    | 1|

I want to create rows for all combinations of ID, DT and STATE, with V set to 0 where it's not available, like this:

  |ID | DT     | STATE | V|
  |1  | 201901 | PA    | 1|
  |1  | 201902 | PA    | 6|
  |1  | 201901 | CA    | 0|
  |1  | 201902 | CA    | 3|
  |2  | 201901 | PA    | 0|
  |2  | 201902 | PA    | 3|
  |2  | 201901 | CA    | 1|
  |2  | 201902 | CA    | 0|

Thanks

2 Answers

You can build a MultiIndex from the product of the unique values of each key column, then reindex against it:

import pandas as pd

# every (ID, DT, STATE) combination; names= keeps the column labels after reset_index
idx = pd.MultiIndex.from_product([df.ID.unique(), df.DT.unique(), df.STATE.unique()], names=['ID', 'DT', 'STATE'])
df = df.set_index(['ID', 'DT', 'STATE']).reindex(idx, fill_value=0).reset_index()
df
   ID      DT STATE  V
0   1  201901    PA  1
1   1  201901    CA  0
2   1  201902    PA  6
3   1  201902    CA  3
4   2  201901    PA  0
5   2  201901    CA  1
6   2  201902    PA  3
7   2  201902    CA  0
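For reuse, the same idea can be wrapped in a small helper. This is only a sketch: the function name `complete_combinations` and its parameters are hypothetical (not from the answer), and it assumes each key combination appears at most once in the input.

```python
import pandas as pd


def complete_combinations(df, keys, fill_value=0):
    """Hypothetical helper: add a row for every missing combination of `keys`.

    Assumes each key combination occurs at most once in `df`.
    """
    # Cartesian product of the unique values found in each key column
    idx = pd.MultiIndex.from_product([df[k].unique() for k in keys], names=keys)
    # Missing combinations get `fill_value` in every non-key column
    return df.set_index(keys).reindex(idx, fill_value=fill_value).reset_index()


# The sample data from the question
df = pd.DataFrame({
    "ID": [1, 1, 2, 1, 2],
    "DT": [201901, 201902, 201902, 201902, 201901],
    "STATE": ["PA", "PA", "PA", "CA", "CA"],
    "V": [1, 6, 3, 3, 1],
})

full = complete_combinations(df, ["ID", "DT", "STATE"])
print(full)
```

With the question's data this should give 2 x 2 x 2 = 8 rows, with V filled in as 0 for the three combinations that were missing.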

1 Comment

Nice~ +1 Very similar answer, and slightly more concise. It's a good idea to use set_index rather than groupby unless there are multiple rows for the same ['ID','DT','STATE'], in which case that info might need to be summarized by a .mean(), .sum(), etc.; otherwise there would be duplicates. Is that correct, @YOBEN_S?
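To illustrate the point about duplicates, here is a small sketch with made-up data (the frame `dup` is an assumption for illustration, not from the question) in which one key combination appears twice. It checks for duplicate keys first, then collapses them with groupby before reindexing; `.sum()` is just one possible aggregation.

```python
import pandas as pd

# Made-up data: (1, 201901, 'PA') appears twice
dup = pd.DataFrame({
    "ID": [1, 1, 2],
    "DT": [201901, 201901, 201902],
    "STATE": ["PA", "PA", "CA"],
    "V": [1, 4, 3],
})
keys = ["ID", "DT", "STATE"]

# True when at least one key combination is repeated
has_duplicates = dup.duplicated(subset=keys).any()

# Collapse repeated keys to a single value per combination (here: the sum)
collapsed = dup.groupby(keys)["V"].sum()

# Now the index is unique, so reindexing against the full product is safe
idx = pd.MultiIndex.from_product([dup[k].unique() for k in keys], names=keys)
result = collapsed.reindex(idx, fill_value=0).reset_index()
print(has_duplicates)
print(result)
```

When `has_duplicates` is False, the set_index version is enough; when it is True, some aggregation step like the groupby above is needed before the reindex.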

Group by the first three columns, reindex against the full product of their unique values, then sort with .sort_values as desired.

input:

   ID      DT STATE  V
0   1  201901    PA  1
1   1  201902    PA  6
2   2  201902    PA  3
3   1  201902    CA  3
4   2  201901    CA  1

code:

import pandas as pd

# unique values of each key column, in order of first appearance
i = [df['ID'].unique(), df['DT'].unique(), df['STATE'].unique()]
df = (df.groupby(['ID', 'DT', 'STATE']).sum()  # collapses any duplicate key rows
        .reindex(index=pd.MultiIndex.from_product(i, names=['ID', 'DT', 'STATE']), fill_value=0)
        .reset_index()
        .sort_values(['ID', 'STATE', 'DT'], ascending=[True, False, True]))
df

output:

   ID      DT STATE  V
0   1  201901    PA  1
2   1  201902    PA  6
1   1  201901    CA  0
3   1  201902    CA  3
4   2  201901    PA  0
6   2  201902    PA  3
5   2  201901    CA  1
7   2  201902    CA  0
