I'm trying to loop through a table that contains covid-19 data. My table has 4 columns: month, day, location, and cases. The values of each column in the table are stored in their own list, so every list has the same length (i.e., there is a month list, a day list, a location list, and a cases list). There are 12 months, with up to 31 days in a month. Cases are recorded for many locations around the world. I would like to figure out which day of the year had the most total combined global cases. I'm not sure how to structure my loops appropriately. An oversimplified sample version of the table represented by the lists is shown below.

In this small example, the result would be month 1, day 3 with 709 cases (257 + 452).

Month  Day  Location  Cases
1      1    CAN       124
1      1    USA       563
1      2    CAN       242
1      2    USA       156
1      3    CAN       257
1      3    USA       452
.      .    ...       ...
12     31   ...       ...

4 Answers

1

I assume that you've put all the data in the same data frame, df.

import pandas
df = pandas.DataFrame()
df['Month'] = name_of_your_month_list
df['Day'] = name_of_your_daylist
df['Location'] = name_of_your_location_list
df['Cases'] = name_of_your_cases_list

df.Cases.max() gives you the largest number of cases. I assume that there is only one year in the dataset. So df[df.Cases==df.Cases.max()].index gives you the index you are looking for.

For the day, just filter:

df.loc[df.Cases==df.Cases.max(), 'Day']

For the month:

df.loc[df.Cases==df.Cases.max(), 'Month']

For the number of cases:

df.loc[df.Cases==df.Cases.max(), 'Cases']

For the country:

df.loc[df.Cases==df.Cases.max(), 'Location']

Reading the comment, it is not clear whether you are looking for the largest case count for a single location or for a whole day. If it is for the whole day, you first have to group by date and sum the cases, as in groupby(['Month', 'Day']).Cases.sum(), and then take the maximum of those sums.
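A minimal sketch of that per-day variant, assuming the sample data from the question and the column names used above. The names daily_totals, month, day, and total are only illustrative:

```python
import pandas as pd

# Sample rows from the question; the real lists would be much longer.
df = pd.DataFrame({
    'Month': [1, 1, 1, 1, 1, 1],
    'Day': [1, 1, 2, 2, 3, 3],
    'Location': ['CAN', 'USA', 'CAN', 'USA', 'CAN', 'USA'],
    'Cases': [124, 563, 242, 156, 257, 452],
})

# Sum the cases of all locations for every (Month, Day) pair.
daily_totals = df.groupby(['Month', 'Day'])['Cases'].sum()

# idxmax returns the (Month, Day) index label of the largest total.
month, day = daily_totals.idxmax()
total = daily_totals.max()

print(f'Maximum cases of {total} occurred on {month}/{day}.')
```

Grouping on both Month and Day matters here: grouping on Day alone would add together the same day number from different months.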


0

You group your dataframe by month and day, then iterate through the groups to find the one in which the sum of cases across all locations is largest, as shown below:

import pandas as pd
df = pd.DataFrame({'Month':[1,1,1,1,1,1], 'Day':[1,1,2,2,3,3],
                   'Location':['CAN', 'USA', 'CAN', 'USA','CAN', 'USA'],
                   'Cases':[124,563,242,156,257,452]})

grouped = df.groupby(['Month', 'Day'])
max_sum = 0
max_day = None
for idx, group in grouped:
    if group['Cases'].sum() > max_sum:
        max_sum = group['Cases'].sum()
        max_day = group

# Every row in the group shares the same date, so read it from the first row.
month = max_day['Month'].iloc[0]
day = max_day['Day'].iloc[0]
print(f'Maximum cases of {max_sum} occurred on {month}/{day}.')

#prints: Maximum cases of 709 occurred on 1/3.

If you don't want to use Pandas, this is how you do it:

months = [1,1,1,1,1,1]
days = [1,1,2,2,3,3]
locations = ['CAN', 'USA', 'CAN', 'USA','CAN', 'USA']
cases = [124,563,242,156,257,452]
dic = {}

# Add up the cases of every location under its month/day key.
# This also works for unsorted rows and for days with a single location.
for i in range(len(days)):
    key = f'{months[i]}/{days[i]}'
    dic[key] = dic.get(key, 0) + cases[i]

# The key with the largest total is the worst day.
worst_day = max(dic, key=dic.get)
max_cases = dic[worst_day]

print(f'Maximum cases of {max_cases} occurred on {worst_day}.')

#Prints: Maximum cases of 709 occurred on 1/3.
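As a variation, the same list-only idea can be wrapped in a function that keys the totals on a (month, day) tuple, so the date comes back as numbers rather than as a formatted string. This is only a sketch using the standard library; the function name worst_day and its return shape are assumptions made for illustration:

```python
from collections import Counter

def worst_day(months, days, cases):
    """Return ((month, day), total) for the date with the highest combined cases."""
    totals = Counter()
    # zip walks the parallel lists together, one table row at a time.
    for month, day, count in zip(months, days, cases):
        totals[(month, day)] += count
    # most_common(1) gives a one-element list holding the largest total.
    return totals.most_common(1)[0]

months = [1, 1, 1, 1, 1, 1]
days = [1, 1, 2, 2, 3, 3]
cases = [124, 563, 242, 156, 257, 452]

(month, day), total = worst_day(months, days, cases)
print(f'Maximum cases of {total} occurred on {month}/{day}.')
```

Because the totals are keyed on the date itself, the rows do not need to be sorted and the locations list is not needed at all.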

2 Comments

Thank you for your solution. How would you do it without importing pandas? Only using lists.
@ls14 See my amended answer.
0

You can find the maximum value in your cases list first, then use its index to look up the matching values in the other three lists. For example, with caseList = [1,2,3,52,1,0]

the maximum is 52 and its index is 3. In your case you can read monthList[3], dayList[3], and locationList[3] respectively, which gives you the month, day, and country of the entry with the most cases.

Check whether this helps in your scenario.

1 Comment

The problem at hand though is that we must find the total global cases for each day, so not just the max value in the cases table. We must find the combined cases for each day of each month.
0

You may use this strategy to get the required result.

daylist,monthlist,location,Cases = [1, 2, 3, 4], [1,1,1,1],['CAN','USA','CAN','USA'],[124,563,242,999]    
maxCases = Cases.index(max(Cases))
print("Max Case:",Cases[maxCases])
print("Location:",location[maxCases])
print("Month:",monthlist[maxCases])
print("Day:",daylist[maxCases])

1 Comment

But it needs to find the highest number of combined cases (for each location, for each day, for each month), not just the max value in the cases list. So in my example, the date with the most cases would be month 1, day 3 with 709 total cases.
