I have a DataFrame with multiple date columns, as follows:
import pandas as pd

df = pd.DataFrame({
'Date1': ['2017-12-14', '2017-12-14', '2017-12-14', '2017-12-15', '2017-12-14', '2017-12-14', '2017-12-14'],
'Date2': ['2018-1-17', "NaT","NaT","NaT","NaT","NaT","NaT"],
'Date3': ['2018-2-15',"NaT","NaT",'2018-4-1','NaT','NaT','2018-4-1'],
'Date4': ['2018-3-11','2018-4-1','2018-4-1',"NaT",'2018-4-1','2018-4-2',"NaT"]})
df
| Date1 | Date2 | Date3 | Date4 |
|---|---|---|---|
| 2017-12-14 | 2018-1-17 | 2018-2-15 | 2018-3-11 |
| 2017-12-14 | NaT | NaT | 2018-4-1 |
| 2017-12-14 | NaT | NaT | 2018-4-1 |
| 2017-12-15 | NaT | 2018-4-1 | NaT |
| 2017-12-14 | NaT | NaT | 2018-4-1 |
| 2017-12-14 | NaT | NaT | 2018-4-2 |
| 2017-12-14 | NaT | 2018-4-1 | NaT |
As you can see, there are many empty date values, which I need to fill with the date from the next column that has one.
Expected Output:
| Date1 | Date2 | Date3 | Date4 |
|---|---|---|---|
| 2017-12-14 | 2018-1-17 | 2018-2-15 | 2018-3-11 |
| 2017-12-14 | 2018-4-1 | 2018-4-1 | 2018-4-1 |
| 2017-12-14 | 2018-4-1 | 2018-4-1 | 2018-4-1 |
| 2017-12-15 | 2018-4-1 | 2018-4-1 | NaT |
| 2017-12-14 | 2018-4-1 | 2018-4-1 | 2018-4-1 |
| 2017-12-14 | 2018-4-2 | 2018-4-2 | 2018-4-2 |
| 2017-12-14 | 2018-4-1 | 2018-4-1 | NaT |
Please note: the last column can remain NaT.
I have tried the bfill method, in vain:
df.bfill(axis=1)
"NaT" is not really null here; it is a string. See if you can change it to an actual missing value first, maybe something like this: df.replace("NaT", pd.NaT).bfill(axis=1)
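To spell that suggestion out, here is a minimal sketch using the sample data from the question. The variable names `cleaned` and `filled` are only illustrative, and the values are left as strings rather than converted to datetimes, so treat it as one possible way to do it rather than the definitive fix.

```python
import pandas as pd

# Sample data from the question; "NaT" here is a plain string, not a missing value
df = pd.DataFrame({
    'Date1': ['2017-12-14', '2017-12-14', '2017-12-14', '2017-12-15',
              '2017-12-14', '2017-12-14', '2017-12-14'],
    'Date2': ['2018-1-17', 'NaT', 'NaT', 'NaT', 'NaT', 'NaT', 'NaT'],
    'Date3': ['2018-2-15', 'NaT', 'NaT', '2018-4-1', 'NaT', 'NaT', '2018-4-1'],
    'Date4': ['2018-3-11', '2018-4-1', '2018-4-1', 'NaT', '2018-4-1', '2018-4-2', 'NaT'],
})

# Step 1: turn the literal string "NaT" into a real missing value
cleaned = df.replace("NaT", pd.NaT)

# Step 2: back-fill along each row, so a gap takes the next value to its right
filled = cleaned.bfill(axis=1)

print(filled)
```

Because the fill only looks to the right, a missing value in the last column has nothing to pull from and stays missing, which matches the expected output in the question.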