I have a few dataframes that all follow the same format, like the two below.

This is df1:

    country ticker 
0   US      MSFT
1   US      AAPL
2   GERMANY NSU.DE
3   SG      D05.SI
4   AUS     WOW.AX

This is df2:

    country ticker 
0   HK      0700.HK
1   HK      1337.HK
2   SWISS   NESN.SW
3   SG      OV8.SI

The dataframes are saved in csv files with multiple sheets, and I can loop over them easily.

I want to create a dataframe, dictionary, or set of variables that counts the total number of times each country appears across all of the dataframes, like this:

    country count
0   US      2
1   GERMANY 1
2   SG      2
3   AUS     1
4   SWISS   1
5   HK      2

How can I do that? It doesn't have to be a dataframe.

  • a = dict(df['country'].value_counts(dropna=False)) should give you a dict object like {"US": 2, ...}. You can loop across your sheets and update your dict object to get all counts. Commented May 6, 2020 at 20:09
  • Doesn't work, it just gives me the number of times the last instance of something appears. Commented May 6, 2020 at 20:16
  • Use pd.concat to concatenate the dataframes, then use value_counts. Commented May 6, 2020 at 20:20
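
A possible reason the dict approach from the first comment reports only the last sheet's numbers is that dict.update (or re-assigning the dict inside the loop) overwrites earlier counts instead of adding to them. The sketch below assumes that is what happened and accumulates with collections.Counter instead. The frames df1 and df2 are hypothetical stand-ins rebuilt from the sample data in the question, and the names totals and country_counts are made up for illustration.

```python
from collections import Counter

import pandas as pd

# Hypothetical stand-ins for the frames read from each sheet.
df1 = pd.DataFrame({
    "country": ["US", "US", "GERMANY", "SG", "AUS"],
    "ticker": ["MSFT", "AAPL", "NSU.DE", "D05.SI", "WOW.AX"],
})
df2 = pd.DataFrame({
    "country": ["HK", "HK", "SWISS", "SG"],
    "ticker": ["0700.HK", "1337.HK", "NESN.SW", "OV8.SI"],
})

totals = Counter()
for df in (df1, df2):
    # Counter.update adds to the existing counts;
    # dict.update would replace them with the latest frame's values.
    totals.update(df["country"].value_counts(dropna=False).to_dict())

# Plain dict of country -> total occurrences across all frames.
country_counts = dict(totals)
print(country_counts)
```

The same loop should work for any number of frames, since each pass only adds that frame's counts to the running total.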

1 Answer

You can group each dataframe by country to get per-frame counts, then merge the results and sum the counts:

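# Count rows per country in each frame
# (csvDf1..csvDf3 are the frames already loaded from each sheet).
# size() counts rows, so it does not depend on which other columns exist.
df1 = csvDf1.groupby('country').size().reset_index(name='count1')
df2 = csvDf2.groupby('country').size().reset_index(name='count2')
df3 = csvDf3.groupby('country').size().reset_index(name='count3')

# Outer-merge so a country missing from one frame is kept (its count becomes NaN)
combinedDf = df1.merge(df2, how='outer', on='country')
combinedDf = combinedDf.merge(df3, how='outer', on='country')

# Sum the per-frame counts for each country; sum() skips NaN by default
combinedDf['total'] = combinedDf.iloc[:, 1:].sum(axis=1).astype(int)
combinedDf = combinedDf[['country', 'total']]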
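
As an alternative to merging, the pd.concat suggestion from the comments can be sketched as follows: stack all frames into one and count the country column once. This is a minimal sketch, not the answer's method. The frames are hypothetical stand-ins built from the question's sample data, and the names combined and counts are illustrative.

```python
import pandas as pd

# Hypothetical stand-ins for the frames read from each sheet.
df1 = pd.DataFrame({
    "country": ["US", "US", "GERMANY", "SG", "AUS"],
    "ticker": ["MSFT", "AAPL", "NSU.DE", "D05.SI", "WOW.AX"],
})
df2 = pd.DataFrame({
    "country": ["HK", "HK", "SWISS", "SG"],
    "ticker": ["0700.HK", "1337.HK", "NESN.SW", "OV8.SI"],
})

# Stack the frames vertically, then count each country once over the result.
combined = pd.concat([df1, df2], ignore_index=True)
counts = (
    combined["country"]
    .value_counts()
    .rename_axis("country")      # name the index so it becomes the 'country' column
    .reset_index(name="count")   # turn the counts Series into a two-column frame
)
print(counts)
```

If the sheets actually live in an Excel workbook rather than a csv file, pd.read_excel(path, sheet_name=None) returns a dict of frames, and its values can be passed straight to pd.concat.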