Create new columns in pandas dataframe using apply

Question

I am trying to create new columns in a pandas DataFrame based on the values of other columns, using apply. I get this error and I don't understand why:

File "C:\dev\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2448, in _setitem_array
    raise ValueError('Columns must be same length as key')
ValueError: Columns must be same length as key

Am I misunderstanding the apply function? Can you update/create multiple columns using a single apply call?

Here is my sample data:

import pandas as pd

x = pd.DataFrame({'VP': ['Brian', 'Sarah', 'Sarah', 'Brian', 'Sarah'],
                  'Director': ['Jim', 'Ian', 'Ian', 'Jim', 'Jerry'],
                  'Requester': ['Kelly', 'Dave', 'Jordan', 'Matt', 'Rob'],
                  'VP from Query': ['Jordan', 'Justin', 'Sarah', 'Brian', 'Sarah'],
                  'Director from Query': ['Other', 'Other', 'Ian', 'Jim', 'Jerry'],
                  'Requester from Query': ['Kelly', 'Dave', 'Jordan', 'Matt', 'Rob']
                  })
x = x[['VP', 'Director', 'Requester', 'VP from Query', 'Director from Query', 'Requester from Query']]


def set_suggested_hierarchy(row):
    if row['VP'] != row['VP from Query']:
        return row[['VP', 'Director']]
    else:
        return row[['VP from Query', 'Director from Query']]


x[['Suggested VP', 'Suggested Director']] = x.apply(lambda row: set_suggested_hierarchy(row), axis=1)

Thank you so much

Ian · Accepted Answer · 2018-10-01 14:45:17Z

1

I found the answer here: https://datascience.stackexchange.com/questions/29115/pandas-apply-return-must-have-equal-len-keys-and-value-when-setting-with-an-ite

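Basically, I needed to change the function (not the lambda) to return a new Series with the same index in both branches. The original returned slices of the row with different labels in each branch, so apply produced a result with more than two columns, which could not be assigned to two column names: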

def set_suggested_hierarchy(row):
    if row['VP'] != row['VP from Query']:
        return pd.Series([row['VP'], row['Director']])
    else:
        return pd.Series([row['VP from Query'], row['Director from Query']])
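
To see the difference, here is a minimal sketch that compares the shape of the apply result before and after the change. It uses a two-row subset of the question's data (one row per branch); the names original, fixed, wide and narrow are only for illustration.

```python
import pandas as pd

# Two rows from the question's sample data, one for each branch.
x = pd.DataFrame({
    'VP': ['Brian', 'Sarah'],
    'Director': ['Jim', 'Ian'],
    'VP from Query': ['Jordan', 'Sarah'],
    'Director from Query': ['Other', 'Ian'],
})


def original(row):
    # Each branch returns a slice with different index labels.
    if row['VP'] != row['VP from Query']:
        return row[['VP', 'Director']]
    return row[['VP from Query', 'Director from Query']]


def fixed(row):
    # Both branches return a Series with the same default index (0, 1).
    if row['VP'] != row['VP from Query']:
        return pd.Series([row['VP'], row['Director']])
    return pd.Series([row['VP from Query'], row['Director from Query']])


wide = x.apply(original, axis=1)   # columns are the union of all returned labels
narrow = x.apply(fixed, axis=1)    # exactly two columns

print(wide.shape)
print(narrow.shape)

# Two result columns line up with two new column names, so this assignment works.
x[['Suggested VP', 'Suggested Director']] = narrow
print(x)
```

With the original function the result should have four columns rather than two, which is what triggers "Columns must be same length as key".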


rahlf23 · Accepted Answer · 2018-10-01 14:45:25Z

1

One solution would be to return the entire row of the dataframe, since you are applying this function to the full dataframe:

def set_suggested_hierarchy(row):

    if row['VP'] != row['VP from Query']:
        row['Suggested VP'] = row['VP']
        row['Suggested Director'] = row['Director']
    else:
        row['Suggested VP'] = row['VP from Query']
        row['Suggested Director'] = row['Director from Query']

    return row

x = x.apply(lambda row: set_suggested_hierarchy(row), axis=1)
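
A variation on the same idea, as a sketch rather than a drop-in replacement: the function can be passed to apply directly without the lambda, and it can work on a copy of the row, since the pandas documentation discourages functions that mutate the object apply passes in. The sample frame below is a two-row subset of the question's data, and use_current is a name made up for this example.

```python
import pandas as pd

# Two rows from the question's sample data, one for each branch.
x = pd.DataFrame({
    'VP': ['Brian', 'Sarah'],
    'Director': ['Jim', 'Ian'],
    'VP from Query': ['Jordan', 'Sarah'],
    'Director from Query': ['Other', 'Ian'],
})


def set_suggested_hierarchy(row):
    # Copy first so the Series handed over by apply is left untouched.
    row = row.copy()
    use_current = row['VP'] != row['VP from Query']
    row['Suggested VP'] = row['VP'] if use_current else row['VP from Query']
    row['Suggested Director'] = (
        row['Director'] if use_current else row['Director from Query']
    )
    return row


# Every returned row has the same labels, so the result keeps the original
# columns and gains the two new ones.
x = x.apply(set_suggested_hierarchy, axis=1)
print(x)
```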


ALollz · Accepted Answer · 2018-10-01 14:50:51Z

0

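I think you should get rid of the apply(axis=1) altogether. In your function the suggested VP is always equal to VP (the else branch only runs when VP and 'VP from Query' are the same), so only the director depends on the comparison. Your logic can be implemented as: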

import numpy as np

x['Suggested VP'] = x.VP
x['Suggested Director'] = np.where(x.VP != x['VP from Query'], 
                                   x.Director, x['Director from Query'])
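
As a sanity check, here is a small sketch that compares the vectorized version with a row-wise version on the question's sample data. The row-wise side uses apply with result_type='expand' (a different way of getting two columns than the one in the question), and the names mismatch, vectorized and row_wise are made up for this example.

```python
import numpy as np
import pandas as pd

# The four columns from the question that the logic depends on.
x = pd.DataFrame({
    'VP': ['Brian', 'Sarah', 'Sarah', 'Brian', 'Sarah'],
    'Director': ['Jim', 'Ian', 'Ian', 'Jim', 'Jerry'],
    'VP from Query': ['Jordan', 'Justin', 'Sarah', 'Brian', 'Sarah'],
    'Director from Query': ['Other', 'Other', 'Ian', 'Jim', 'Jerry'],
})

# True where the current VP differs from the VP returned by the query.
mismatch = x['VP'] != x['VP from Query']

vectorized = pd.DataFrame({
    'Suggested VP': x['VP'],
    'Suggested Director': np.where(mismatch, x['Director'], x['Director from Query']),
})

# Row-wise equivalent: return a tuple and let apply expand it into columns.
row_wise = x.apply(
    lambda row: (row['VP'], row['Director'])
    if row['VP'] != row['VP from Query']
    else (row['VP from Query'], row['Director from Query']),
    axis=1,
    result_type='expand',
)
row_wise.columns = ['Suggested VP', 'Suggested Director']

# Compare the raw values so that dtype differences do not matter.
same = vectorized.values.tolist() == row_wise.values.tolist()
print(same)
```

On larger frames the vectorized form avoids calling a Python function once per row, which is usually the main reason to prefer it.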
