I am trying to find a faster solution than the one in this link. The issue is that when my data grows large (e.g. 1M rows), that approach becomes considerably slow, especially when the expansion is second-by-second instead of minute-by-minute as in the original post.
So I am trying to find a more efficient way of doing it using NumPy's np.arange, but I am running into an error:
# First: with pd.to_datetime
x = pd.DataFrame({"ID": np.repeat(df.ID.values, df.time_delta.values),
                  "time": np.arange(pd.to_datetime(df.FROM.values), pd.to_datetime(df.TO.values), np.timedelta64(1, 's'))})
# Second: without pd.to_datetime
x = pd.DataFrame({"ID": np.repeat(df.ID.values, df.time_delta.values),
                  "time": np.arange(df.FROM.values, df.TO.values, np.timedelta64(1, 's'))})
The idea is to repeat each ID once for every second between the FROM and TO columns (that count is stored in time_delta), with one timestamp per repeated row. But both versions raise the same error: ValueError: Could not convert object to NumPy timedelta.
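To make the intended result concrete, here is a minimal sketch on a made-up two-row frame. The sample data and the helper name `expand_seconds` are assumptions for illustration, not part of my real data. As far as I understand, np.arange expects scalar start and stop values rather than whole arrays, so the sketch avoids it for the timestamps and instead builds per-row second offsets with np.repeat and a cumulative sum.

```python
import numpy as np
import pandas as pd


def expand_seconds(df):
    """Hypothetical helper: one output row per second in [FROM, TO) for each ID."""
    counts = df["time_delta"].to_numpy()
    # Position of the first output row that belongs to each input row.
    starts = np.cumsum(counts) - counts
    # Offsets 0, 1, ..., n-1 restarting within each input row's block.
    offsets = np.arange(counts.sum()) - np.repeat(starts, counts)
    # Repeat each FROM timestamp, then add the per-row second offsets.
    base = df["FROM"].repeat(counts).reset_index(drop=True)
    return pd.DataFrame({
        "ID": np.repeat(df["ID"].to_numpy(), counts),
        "time": base + pd.to_timedelta(offsets, unit="s"),
    })


# Made-up sample data with the same column names and UTC-aware dtypes.
df = pd.DataFrame({
    "ID": ["a", "b"],
    "FROM": pd.to_datetime(["2021-01-01 00:00:00", "2021-01-01 00:01:00"], utc=True),
    "TO": pd.to_datetime(["2021-01-01 00:00:03", "2021-01-01 00:01:02"], utc=True),
})
df["time_delta"] = (df["TO"] - df["FROM"]).dt.total_seconds().astype("int64")

out = expand_seconds(df)
print(out)
```

The sketch treats TO as exclusive, so a row with time_delta of 3 yields three timestamps starting at FROM. Whether this is actually faster than the linked approach on 1M rows is something I have not measured.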
Here are the dtypes of my df:
ID object
FROM datetime64[ns, UTC]
TO datetime64[ns, UTC]
time_delta int64
dtype: object
Can anyone tell me what I am doing wrong?
Thank you in advance.
df.head() or df.head().to_dict()?