1

I have a dataframe like the following:

boss_id    employee_id      designation        
 -1           100              CEO
100           39               Manager 
100          4567              Manager
100          9843              Manager
39            47               entry level
39            45               entry level
4567          8                entry level
9843          9                entry level 

Here, boss_id gives the boss of each employee, and designation is that employee's role. I want to find how many people each person manages in total, directly or indirectly.

For instance, since the CEO is at the top, they should be managing all 7 other people in this dataframe. Managers manage only the entry-level employees; for example, employee 39, a manager, manages 2 people. Finally, entry-level employees don't manage anyone, so their count should be 0.

I want a dataframe like this:

boss_id    employee_id      designation              count
 -1           100              CEO                     7
100           39               Manager                 2
100          4567              Manager                 1
100          9843              Manager                 1
39            47               entry level             0
39            45               entry level             0
4567          8                entry level             0
9843          9                entry level             0

I can't get my head around this and any help would be much appreciated! Thanks in advance.

1
  • I can't give you the exact DataFrame expression, but the logic should be something like count(employee_id) where boss_id = selectedItem.employee_id. Commented Mar 6, 2017 at 6:37

2 Answers

1

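You can count recursively: each direct report contributes itself plus everyone it manages. An employee with no reports naturally gets 0, so no special case for the lowest designation is needed.

    def findCount(employee_id):
        # employee_ids of everyone who reports directly to this employee
        # (.to_numpy() replaces .as_matrix(), which was removed in pandas 1.0)
        eIds = df.loc[df['boss_id'] == employee_id, 'employee_id'].to_numpy()
        cnt = 0
        for eid in eIds:
            # Count the direct report plus everyone below them
            cnt += findCount(eid) + 1
        return cnt

    # Compute the total for every row and store it in a new column
    df['count'] = df['employee_id'].apply(findCount)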
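
The recursion above re-filters the dataframe on every call. As a rough alternative sketch (not the answerer's code), the same totals can be computed from a boss-to-reports map with memoization. The function name `count_subordinates` and the plain lists standing in for the `boss_id` and `employee_id` columns are assumptions made for illustration, and the sketch assumes the hierarchy contains no cycles.

```python
from collections import defaultdict


def count_subordinates(boss_ids, employee_ids):
    """Return {employee_id: number of direct and indirect reports}.

    Hypothetical helper for illustration; assumes the hierarchy has no cycles.
    """
    # Map each boss to the employees who report to them directly.
    children = defaultdict(list)
    for boss, emp in zip(boss_ids, employee_ids):
        children[boss].append(emp)

    totals = {}

    def total(emp):
        # Memoize so that each subtree is counted only once.
        if emp not in totals:
            totals[emp] = sum(1 + total(child) for child in children.get(emp, []))
        return totals[emp]

    return {emp: total(emp) for emp in employee_ids}


# Data from the question, as plain lists standing in for the two columns.
boss_ids = [-1, 100, 100, 100, 39, 39, 4567, 9843]
employee_ids = [100, 39, 4567, 9843, 47, 45, 8, 9]

counts = count_subordinates(boss_ids, employee_ids)
print(counts)
```

With a dataframe, the lists could be taken from `df['boss_id']` and `df['employee_id']`, and the result attached with `df['count'] = df['employee_id'].map(counts)`.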

0

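Group the dataframe by boss_id, then go through the groups and take the size of each one. Note that this counts only each boss's direct reports.

    groups = df.groupby('boss_id')
    for boss_id, group in groups:
        count = len(group)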
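
As a small check of what per-boss grouping yields on the question's data, here is a plain-Python sketch that tallies rows per `boss_id` with `collections.Counter`. The lists standing in for the two columns and the name `direct_counts` are assumptions for illustration. It shows that grouping alone gives the CEO 3 (direct reports only), not the 7 the question asks for.

```python
from collections import Counter

# Data from the question, as plain lists standing in for the two columns.
boss_ids = [-1, 100, 100, 100, 39, 39, 4567, 9843]
employee_ids = [100, 39, 4567, 9843, 47, 45, 8, 9]

# Number of rows per boss_id, i.e. the size of each group.
direct = Counter(boss_ids)

# Direct-report count per employee; employees who are nobody's boss get 0.
direct_counts = {emp: direct.get(emp, 0) for emp in employee_ids}
print(direct_counts)
```

To reach the totals in the question, these direct counts would still need to be rolled up through the hierarchy, for example with a recursive traversal.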
