I'm having trouble replacing strings in a pandas column correctly. I'd like to stay within pandas, but I'm not sure whether this can be done using pandas alone.
This is how my dataframe looks:
(ID: 10) 247333605 0.0
(ID: 20) 36738870 0.0
(ID: 40) 4668036427 0.0
(ID: 50) 1918647972 0.0
(ID: 60) 4323165902 44125.0
(ID: 80) 145512255 0.0
Assigned (ID: 30) 42050340 0.0
Assigned (ID: 40) 130880371376 0.0
Assigning (ID: 30) 1095844753 0.0
Cancelled (ID: 40) 937280 0.0
Cancelled (ID: 80) 16857720813 0.0
Planned (ID: 20) 9060392597 0.0
Planning (ID: 10) 108484297031 0.0
Processed (ID: 70) 133289880880 0.0
Revoked (ID: 50) 2411903072 0.0
Writing (ID: 50) 146408550024 0.0
Written (ID: 60) 139458227923 1018230.0
Each bare (ID: x) row should be matched to the labelled status with the same ID, e.g. Assigned (ID: x), Cancelled (ID: x), etc.
I'm using lines like this one:
input_data['last_status'] = input_data.last_status.str.replace('(ID: 10)', 'Planning (ID: 10)')
My output is:
(Assigned (ID: 40)) 0.0
(Cancelled (ID: 80)) 0.0
(Planned (ID: 20)) 0.0
(Planning (ID: 10)) 0.0
(Writing (ID: 50)) 0.0
(Written (ID: 60)) 44125.0
Assigned (Assigned (ID: 40)) 0.0
Assigned (ID: 30) 0.0
Assigning (ID: 30) 0.0
Cancelled (Assigned (ID: 40)) 0.0
Cancelled (Cancelled (ID: 80)) 0.0
Planned (Planned (ID: 20)) 0.0
Planning (Planning (ID: 10)) 0.0
Processed (ID: 70) 0.0
Revoked (Writing (ID: 50)) 0.0
Writing (Writing (ID: 50)) 0.0
Written (Written (ID: 60)) 1018230.0
As you can see, every occurrence of (ID: x) got replaced, including those in rows that already had a label, and the result still doesn't match the correct term.
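A minimal sketch of what seems to be happening, on toy data (only the column name last_status comes from my real code; the two sample values are made up for illustration). When the pattern is treated as a regular expression, which was the default for str.replace before pandas 2.0, the parentheses in '(ID: 10)' form a group rather than matching literal parentheses, and the replacement is applied to any substring, not just to cells that consist solely of the bare ID:

```python
import pandas as pd

# Toy data; the values are illustrative, not my real dataset.
s = pd.Series(["(ID: 10)", "Planning (ID: 10)"], name="last_status")

# As a regex, "(ID: 10)" is a group that matches only the text "ID: 10",
# so the literal parentheses in the data stay around the replacement.
as_regex = s.str.replace("(ID: 10)", "Planning (ID: 10)", regex=True)
print(as_regex.tolist())
# ['(Planning (ID: 10))', 'Planning (Planning (ID: 10))']

# Literal matching handles the parentheses, but still rewrites substrings
# inside cells that already carry a label.
as_literal = s.str.replace("(ID: 10)", "Planning (ID: 10)", regex=False)
print(as_literal.tolist())
# ['Planning (ID: 10)', 'Planning Planning (ID: 10)']

# Escaping the parentheses and anchoring with ^ and $ restricts the
# replacement to cells that are exactly the bare ID.
anchored = s.str.replace(r"^\(ID: 10\)$", "Planning (ID: 10)", regex=True)
print(anchored.tolist())
# ['Planning (ID: 10)', 'Planning (ID: 10)']
```

The first result reproduces the doubled parentheses from my output above, so the regex interpretation looks like the likely cause.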
My ideal dataframe would look like this:
Assigned (ID: 30) 42050340 0.0
Assigned (ID: 40) 130880371376 0.0
Assigning (ID: 30) 1095844753 0.0
Cancelled (ID: 40) 937280 0.0
Cancelled (ID: 80) 16857720813 0.0
Planned (ID: 20) 9060392597 0.0
Planning (ID: 10) 108484297031 0.0
Processed (ID: 70) 133289880880 0.0
Revoked (ID: 50) 2411903072 0.0
Writing (ID: 50) 146408550024 0.0
Written (ID: 60) 139458227923 1018230.0
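For what it's worth, here is a rough sketch of the kind of relabelling I have in mind, assuming there is exactly one label I want per bare ID. The mapping dict label_for_bare_id and the sample rows are hypothetical; only the column name last_status is from my code. Series.replace with a dict compares whole cell values by default, so rows that already have a label should be left alone:

```python
import pandas as pd

# Hypothetical sample rows, not my real dataset.
input_data = pd.DataFrame({
    "last_status": [
        "(ID: 10)",
        "(ID: 60)",
        "Planning (ID: 10)",
        "Written (ID: 60)",
        "Processed (ID: 70)",
    ],
})

# Hypothetical mapping: which label each bare ID should receive.
label_for_bare_id = {
    "(ID: 10)": "Planning (ID: 10)",
    "(ID: 60)": "Written (ID: 60)",
}

# Whole-value replacement: only cells equal to a dict key are changed.
input_data["last_status"] = input_data["last_status"].replace(label_for_bare_id)
print(input_data["last_status"].tolist())
# ['Planning (ID: 10)', 'Written (ID: 60)', 'Planning (ID: 10)',
#  'Written (ID: 60)', 'Processed (ID: 70)']
```

This does not settle what to do when one ID has several labels (ID 40 appears as both Assigned and Cancelled), and it leaves duplicate status rows that would still need to be combined or dropped.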
I'm bound to pandas because the dataset is huge; I have a different implementation, but it takes days to run. Is there a way to do this directly in pandas?
I've never asked a question on Stack Overflow before, so I hope this is clear.