I have a DataFrame that looks like this:
import pandas as pd

df = pd.DataFrame.from_dict({'id': [1, 2, 1, 1, 2, 3],
                             'reward': [0.1, 0.25, 0.15, 0.05, 0.4, 0.45],
                             'time': ['10:00:00', '12:00:00', '10:00:05', '10:00:07', '12:00:03', '15:00:00']})
What I want to get is:
out = pd.DataFrame.from_dict({'id': [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3],
'reward': [0.1, 0, 0, 0, 0, 0.15, 0.0, 0.05, 0.25, 0.0, 0.0, 0.4, 0.45],
'time': ['10:00:00', '10:00:01', '10:00:02', '10:00:03', '10:00:04', '10:00:05', '10:00:06', '10:00:07',
'12:00:00', '12:00:01', '12:00:02', '12:00:03', '15:00:00']} )
In short, for each id, I want to insert a row with reward 0 for every second that is missing between that id's first and last time. How do I do this? I wrote something with a loop, but it will be prohibitively slow for my use case, which has several million rows.
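
For reference, here is a minimal sketch of one loop-free direction that might work, based on `groupby(...).resample(...)`. It rests on a few assumptions that are not stated above: the `time` strings are always `HH:MM:SS` within a single day, the grid is exactly one second, and if an id ever has two rows at the same second their rewards can be summed. The names `tmp` and `filled` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 1, 1, 2, 3],
                   'reward': [0.1, 0.25, 0.15, 0.05, 0.4, 0.45],
                   'time': ['10:00:00', '12:00:00', '10:00:05',
                            '10:00:07', '12:00:03', '15:00:00']})

# Parse the strings as datetimes (the date part defaults to 1900-01-01
# and is discarded again below) and use them as the index for resampling.
tmp = df.assign(time=pd.to_datetime(df['time'], format='%H:%M:%S')).set_index('time')

# Within each id, build a 1-second grid from the first to the last time.
# sum() returns 0 for seconds that have no row (assumption: duplicate
# (id, time) rows, if any, should have their rewards added together).
filled = (tmp.groupby('id')['reward']
             .resample('1s')
             .sum()
             .reset_index())

# Convert back to the original string format and column order.
filled['time'] = filled['time'].dt.strftime('%H:%M:%S')
filled = filled[['id', 'reward', 'time']]
print(filled)
```

On the sample data this should give one row per second per id, with the original rewards kept and 0.0 everywhere else. I have not measured how it scales to several million rows; it avoids a Python loop over rows, but the resampling is still done per group, so the cost probably depends on how many distinct ids there are.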