I have a dataframe with two columns: seconds passed and a value. The seconds column sometimes skips a second (missing data). I would like to fill in the missing seconds and interpolate the missing values.
What I have tried so far: I took the first and last measurement of the dataframe and used NumPy's arange to build an array of every second from start to finish. I converted this into a dataframe with the same columns as the first, then tried to join or merge the two.
The original df looks like this:
seconds value
0 1 5.560000
1 3 5.590000
2 4 5.620000
3 5 5.646667
4 7 5.653333
5 9 5.760000
I then create another dataframe, df2:
seconds value
0 1 NaN
1 2 NaN
2 3 NaN
3 4 NaN
4 5 NaN
5 6 NaN
6 7 NaN
7 8 NaN
8 9 NaN
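For reference, here is a minimal sketch of how df2 could be built as described above. The exact construction is an assumption (the original code is not shown); only the column names and values come from the tables in this post.

```python
import numpy as np
import pandas as pd

# Original data, copied from the table above.
df = pd.DataFrame({
    "seconds": [1, 3, 4, 5, 7, 9],
    "value": [5.56, 5.59, 5.62, 5.646667, 5.653333, 5.76],
})

# Every second from the first to the last measurement (inclusive).
all_seconds = np.arange(df["seconds"].iloc[0], df["seconds"].iloc[-1] + 1)

# Same columns as df, with the values left empty (NaN).
df2 = pd.DataFrame({"seconds": all_seconds, "value": np.nan})
print(df2)
```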
Then I tried merging them together, both ways around, like so:
df = df.merge(df2, how='left')
What I expect the output to be is
seconds value
0 1 5.560000
1 2 NaN
2 3 5.590000
3 4 5.620000
4 5 5.646667
5 6 NaN
6 7 5.653333
7 8 NaN
8 9 5.760000
but the actual output is either df or df2, unmerged. Is there a way to achieve the expected result, and am I on the right track or could this be done much more easily?
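Use an outer merge on the seconds column only: df.merge(df2[['seconds']], on='seconds', how='outer'). Without on='seconds', merge joins on every column the two frames share (here both seconds and value), so the NaN rows of df2 never match the rows of df and nothing is combined. The pandas documentation describes the outer merge as: "use union of keys from both frames, similar to a SQL full outer join; sort keys lexicographically".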
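A minimal sketch of the whole approach, assuming the column names from the question and that linear interpolation is what is wanted. The explicit sort_values call is a precaution, in case the merged rows do not come back ordered by seconds.

```python
import numpy as np
import pandas as pd

# Original data, copied from the question.
df = pd.DataFrame({
    "seconds": [1, 3, 4, 5, 7, 9],
    "value": [5.56, 5.59, 5.62, 5.646667, 5.653333, 5.76],
})

# Helper frame holding only the complete range of seconds (no value column).
full = pd.DataFrame({
    "seconds": np.arange(df["seconds"].min(), df["seconds"].max() + 1)
})

# Outer merge on the key column only; missing seconds get NaN in "value".
merged = (
    df.merge(full, on="seconds", how="outer")
      .sort_values("seconds")
      .reset_index(drop=True)
)

# Fill the gaps by linear interpolation between neighbouring rows.
merged["value"] = merged["value"].interpolate()
print(merged)
```

Because the helper frame carries no value column, there is nothing to collide with df's values. A left merge starting from the full range, full.merge(df, on='seconds', how='left'), should give the same rows.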