I am working on a Python/pandas data science project for fun. The data has a Date column whose values look like 2016-07-16, and the column's dtype is object. What I want to do is go through each date and pull data from across that row. Some rows may share the same date because two separate attacks occurred on that day. (I am looking at terrorism data.) What I have so far is the following:
dates = []
start = 0
# Collect the Date value (column position 1) from the first 300 rows
while start < 300:
    date = data.iat[start, 1]
    dates.append(date)
    start += 1
This gets me ALMOST what I want. The problem is that start begins at 0, but I cannot simply run to 365 because, as I said, each date may have multiple attacks, so one year may have something like 400 rows. Is there a way to end the data collection at 2016-12-31 or 2017-01-01, for example? More generally, is there a quick way to determine the number of attacks per year, year after year? Thank you for any help!
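For illustration, one possible way to get per-year counts is sketched below. It assumes the Date strings parse cleanly with pd.to_datetime, and it uses a small made-up DataFrame in place of the real dataset (the rows and the Deaths values are hypothetical). This is a sketch under those assumptions, not necessarily the best fit for the actual data.

```python
import pandas as pd

# Hypothetical sample standing in for the real dataset
data = pd.DataFrame({
    "Date": ["2016-07-16", "2016-07-16", "2016-12-31", "2017-01-01"],
    "Deaths": [3, 0, 1, 2],
})

# Convert the object-dtype strings into real datetimes
data["Date"] = pd.to_datetime(data["Date"])

# Count rows (attacks) per calendar year, ordered by year
attacks_per_year = data["Date"].dt.year.value_counts().sort_index()
print(attacks_per_year)
```

Because the count is taken over rows rather than calendar days, several attacks on the same date are each counted, so no fixed limit such as 300 or 365 is needed.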
I should add that I also tried something like:
newDate = pd.to_datetime(startdate) + pd.DateOffset(days=1)
or
pd.to_datetime(data['Date']) + timedelta(days=1)
The idea was to add one day at a time until I reached the end of the year, but that did not give me what I wanted, and there can be more than one entry per day anyway.
To explain further, the data could look something like this:
Date        Deaths  Country
2002-01-01  2       India
2002-01-02  0       Pakistan
2001-01-02  1       France
The data has about 20,000 rows, and I need a way to stop at the end of each year. That is my main issue: I cannot simply go to 365, because there may be multiple terrorist attacks on the same date around the world.
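As a rough sketch of "stopping at the end of each year", the snippet below rebuilds the three sample rows above and selects or groups rows by calendar year instead of by row position. The summary dictionary and its keys are made up for illustration, and the approach assumes every Date value parses with pd.to_datetime.

```python
import pandas as pd

# The three sample rows from the question
data = pd.DataFrame({
    "Date": ["2002-01-01", "2002-01-02", "2001-01-02"],
    "Deaths": [2, 0, 1],
    "Country": ["India", "Pakistan", "France"],
})
data["Date"] = pd.to_datetime(data["Date"])

# Every row dated within 2002, however many rows that turns out to be
rows_2002 = data[data["Date"].dt.year == 2002]
dates_2002 = rows_2002["Date"].tolist()

# Visit each year in turn; each group holds exactly that year's rows
summary = {}
for year, group in data.groupby(data["Date"].dt.year):
    summary[year] = {
        "attacks": len(group),
        "deaths": int(group["Deaths"].sum()),
    }
print(summary)
```

Grouping on the year means the rows do not need to be sorted by date first, and a year with 400 attacks is handled the same way as a year with 40.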