I'm currently trying to iterate through a dataframe/csv and compare the dates of the rows with the same ID. If the dates are different or are a certain time-frame apart I want to create a '1' in another column (not shown) to mark that ID and row/s.
I'm looking to save the DATE values as variables and compare them against other DATE variables with the same ID. If the dates are set amount of time apart I'll create a 1 in another column on the same row.
| ID | DATE |
|---|---|
| 1 | 11/11/2011 |
| 1 | 11/11/2011 |
| 2 | 5/05/2011 |
| 2 | 20/06/2011 |
| 3 | 2/04/2011 |
| 3 | 10/08/2011 |
| 4 | 8/12/2011 |
| 4 | 1/02/2012 |
| 4 | 12/03/2012 |
For this post, I'm mainly looking to save the multiple values as variables or a list. I'm hoping to figure out the rest once this roadblock has been removed.
Here's what I got currently, but I don't think it'll be much help. Currently it iterates through and converts the date strings to dates. Which is what I want to happen AFTER getting a list of all the dates with the same ID value.
import pandas as pd
from datetime import *
filename = 'TestData.csv'
df = pd.read_csv(filename)
print (df.iloc[0,1])
x = 0
for i in df.iloc:
FixDate = df.iloc[x, 1]
d1, m1, y1 = FixDate.split('/')
d1 = int(d1)
m1 = int(m1)
y1 = int(y1)
finaldate = date(y1, m1, d1)
print(finaldate)
x = x + 1
Any help is appreciated, thank you!
1/0or something else?But I'd imagine I'd add that at the start before going through the dates.Not understand.test, if need sometnigh else new column is not necessary.