3

I am using pivot tables, trying to write code to display the number of consumer accounts for each customer. I have the following so far:

import pandas as pd
df1=pd.DataFrame({'custID':[1,1,2,2,2,3,3,4,4],
              'accountID':[1,2,1,2,3,1,2,1,2],
              'tenure_mo':[2,3,4,4,5,6,6,6,7],
             'account_type':['BusiNESS','CONSUMER',
                            'consumer',
                            'BUSINESS',
                            'BuSIness',
                            'CONSUmer',
                            'consumer',
                            'CONSUMER',
                            'BUSINESS']},columns=['custID','accountID','tenure_mo','account_type'])
print(df1)
df2=pd.DataFrame({'custID':[1,2,3,4],
             'cust_age':[20,35,50,85]},columns=['custID','cust_age'])

This is my desired output:

custID num_cons_accounts
     1                 1
     2                 1
     3                 2
     4                 1

How can I modify/expand my code to produce this output?

    Using groupby: df.groupby("custID").count() Commented Feb 1, 2021 at 21:27

2 Answers

5

According to your description the following code should work:

import pandas as pd

df1=pd.DataFrame({'custID':[1,1,2,2,2,3,3,4,4],
              'accountID':[1,2,1,2,3,1,2,1,2],
              'tenure_mo':[2,3,4,4,5,6,6,6,7],
             'account_type':['BusiNESS','CONSUMER',
                            'consumer',
                            'BUSINESS',
                            'BuSIness',
                            'CONSUmer',
                            'consumer',
                            'CONSUMER',
                            'BUSINESS']},columns=['custID','accountID','tenure_mo','account_type'])

df1 = df1[df1['account_type'].str.lower() == "consumer"]

print(df1.groupby("custID").count())

This keeps only the rows whose account type, once lowercased, equals "consumer", and then counts the remaining rows for each custID.

The output:

        accountID  tenure_mo  account_type
custID                                    
1               1          1             1
2               1          1             1
3               2          2             2
4               1          1             1

A side note: count() reports a count for every remaining column, so if you want only the one column, select it and drop the rest :)
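For the single-column result the question asks for, one possible sketch is to use `size()` instead of `count()` and name the result with `reset_index`. The variable names `consumer` and `result` are only illustrative, and the frame is trimmed to the two columns that matter here:

```python
import pandas as pd

# Same custID / account_type data as in the question (other columns omitted).
df1 = pd.DataFrame({
    'custID': [1, 1, 2, 2, 2, 3, 3, 4, 4],
    'account_type': ['BusiNESS', 'CONSUMER', 'consumer', 'BUSINESS', 'BuSIness',
                     'CONSUmer', 'consumer', 'CONSUMER', 'BUSINESS'],
})

# Keep only consumer rows, ignoring case.
consumer = df1[df1['account_type'].str.lower() == 'consumer']

# size() counts rows per group and returns one Series;
# reset_index(name=...) turns it into a two-column DataFrame.
result = consumer.groupby('custID').size().reset_index(name='num_cons_accounts')
print(result.to_string(index=False))
```

Because the rows are filtered first, a customer with no consumer accounts would be missing from `result` rather than shown with a count of 0.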


0

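Use a set to find the distinct count of account types per customer: lowercase account_type into a new column, account_type2, then group by custID and take the size of the set with apply and a lambda function. Note that this counts distinct account types, not the number of consumer accounts.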

 import pandas as pd

 df1=pd.DataFrame({'custID':[1,1,2,2,2,3,3,4,4],
          'accountID':[1,2,1,2,3,1,2,1,2],
          'tenure_mo':[2,3,4,4,5,6,6,6,7],
         'account_type':['BusiNESS','CONSUMER','consumer','BUSINESS','BuSIness','CONSUmer',
                        'consumer', 'CONSUMER','BUSINESS']},columns=['custID','accountID','tenure_mo','account_type'])

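 # Normalise case so 'CONSUMER' and 'consumer' count as the same type.
 df1['account_type2']=df1['account_type'].str.lower()
 
 # For each customer, count the distinct values in account_type2.
 grouped=df1.groupby('custID')['account_type2'].apply(lambda s: len(set(s)))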
 print(grouped)

output:

 custID distinct count
 1    2
 2    2
 3    1
 4    2
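Since the question mentions pivot tables, a cross-tabulation is one way to see both the per-type counts and the distinct-type count side by side. This is a swapped-in technique (`pd.crosstab`), not what the answer above uses, and the names `counts` and `distinct_types` are only illustrative:

```python
import pandas as pd

# Same custID / account_type data as in the question (other columns omitted).
df1 = pd.DataFrame({
    'custID': [1, 1, 2, 2, 2, 3, 3, 4, 4],
    'account_type': ['BusiNESS', 'CONSUMER', 'consumer', 'BUSINESS', 'BuSIness',
                     'CONSUmer', 'consumer', 'CONSUMER', 'BUSINESS'],
})

# Rows are customers, columns are lowercased account types, cells are row counts.
counts = pd.crosstab(df1['custID'], df1['account_type'].str.lower())
print(counts)

# The number of non-zero columns per customer is the distinct-type count.
distinct_types = (counts > 0).sum(axis=1)
print(distinct_types)
```

The `consumer` column of `counts` is the number of consumer accounts per customer, and customers with none would appear with 0 instead of being dropped.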
