I have the following data in my dataframe:
uniquecode1  year  month  Name  Sale
1029         2020  5      ABC   10
1029         2020  6      ABC   20
1029         2020  10     ABC   30
1029         2020  11     ABC   35
1029         2020  12     ABC   38
1050         2020  4      DEF   39
1050         2020  5      DEF   40
1050         2020  6      DEF   31
1050         2020  7      DEF   45
1050         2020  8      DEF   55
1079         2020  4      GHI   65
1079         2021  2      GHI   75
10810        2021  1      XYZ   85
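For anyone who wants to reproduce this, the table above can be built roughly as follows. The variable names `rows` and `df` are only illustrative, and the integer dtypes are an assumption about how the data is stored.

```python
import pandas as pd

# Rows copied from the table above: (uniquecode1, year, month, Name, Sale)
rows = [
    (1029, 2020, 5, "ABC", 10),
    (1029, 2020, 6, "ABC", 20),
    (1029, 2020, 10, "ABC", 30),
    (1029, 2020, 11, "ABC", 35),
    (1029, 2020, 12, "ABC", 38),
    (1050, 2020, 4, "DEF", 39),
    (1050, 2020, 5, "DEF", 40),
    (1050, 2020, 6, "DEF", 31),
    (1050, 2020, 7, "DEF", 45),
    (1050, 2020, 8, "DEF", 55),
    (1079, 2020, 4, "GHI", 65),
    (1079, 2021, 2, "GHI", 75),
    (10810, 2021, 1, "XYZ", 85),
]

df = pd.DataFrame(rows, columns=["uniquecode1", "year", "month", "Name", "Sale"])
print(df)
```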
Suppose the current month is Mar'21. The upper bound of the month range is then the previous month, i.e. Feb'21.
The data is divided into groups by uniquecode1. Within each group, some values are missing from the 'month' column:
- For 1029, we have missing month values 7,8,9 for 2020 and 1,2 for 2021
- For 1050, we have missing month values 9,10,11,12 for 2020 and 1,2 for 2021
- For 1079, we have missing month values 5,6,7,8,9,10,11,12 for 2020 and 1 for 2021
- For 10810, we have missing month values 4,5,6,7,8,9,10,11,12 for 2020 and 2 for 2021
I am new to pandas. I am trying to build logic that fills in the missing months described above. When a missing month and year are inserted into the data, 'uniquecode1' and 'Name' should be copied from the corresponding group, and 'Sale' should be 0 or NaN.
Can somebody help me write code for this in pandas? Let me know what other details you might require.
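A minimal sketch of one possible approach is to reindex each group over a full monthly range. It assumes each group should start at its own first observed month unless a fixed start is passed, and end at Feb'21. It also assumes there is at most one row per group and month. The names `fill_missing_months`, `last_period`, `first_period` and `sample` are hypothetical, and the demo uses only the 1079 and 10810 rows to keep it short.

```python
import pandas as pd


def fill_missing_months(df, last_period="2021-02", first_period=None):
    """Insert one row per missing month in each uniquecode1 group.

    Assumption: a group runs from its first observed month (or from
    `first_period`, if given) up to and including `last_period`.
    """
    out = df.copy()
    # Combine year and month into a single monthly Period column.
    dates = pd.to_datetime(out[["year", "month"]].assign(day=1))
    out["period"] = dates.dt.to_period("M")
    end = pd.Period(last_period, freq="M")

    pieces = []
    for (code, name), grp in out.groupby(["uniquecode1", "Name"], sort=False):
        start = pd.Period(first_period, freq="M") if first_period else grp["period"].min()
        full = pd.period_range(start, end, freq="M")
        # Reindexing adds the missing months; their Sale becomes NaN.
        # Rows later than `last_period` would be dropped by this step.
        filled = (
            grp.set_index("period")[["Sale"]]
            .reindex(full)
            .rename_axis("period")
            .reset_index()
        )
        filled["uniquecode1"] = code
        filled["Name"] = name
        filled["year"] = filled["period"].dt.year
        filled["month"] = filled["period"].dt.month
        pieces.append(filled)

    result = pd.concat(pieces, ignore_index=True)
    return result[["uniquecode1", "year", "month", "Name", "Sale"]]


# Small demo with two of the groups from the question.
sample = pd.DataFrame(
    {
        "uniquecode1": [1079, 1079, 10810],
        "year": [2020, 2021, 2021],
        "month": [4, 2, 1],
        "Name": ["GHI", "GHI", "XYZ"],
        "Sale": [65, 75, 85],
    }
)
filled = fill_missing_months(sample)
print(filled)
```

With the default per-group start, 10810 only gains Feb'21. If every group should instead start from a fixed month, something like `first_period="2020-04"` could be passed. To get 0 rather than NaN in 'Sale', `filled["Sale"].fillna(0)` can be applied afterwards.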