Pandas .apply() function with multiple args

Question

Question: Write a function that will take a row of a DataFrame and print out the song, artist, and whether or not the release date is < 1970.

Defining my function:

def release_info(row):
    """Checks if song is released before or after 1970."""
    if rocksfile.loc[row, 'Release_Year'] < 1970:
        print str(rocksfile.loc[row,'Song_Clean']) + " by " + str(rocksfile.loc[row,'Artist_Clean']) \
            + " was released before 1970."
    else:
        print str(rocksfile.loc[row,'Song_Clean']) + " by " + str(rocksfile.loc[row,'Artist_Clean']) \
            + " was released after 1970."

Using the .apply() function, apply the function you wrote to the first four rows of the DataFrame. You will need to tell the apply function to operate row by row. Setting the keyword argument as axis=1 indicates that the function should be applied to each row individually.

Using .apply:

rocksfile.apply(release_info, axis = 1, row=1)

Error Message:

TypeError                                 Traceback (most recent call last)
<ipython-input-61-fe0405b4d1e8> in <module>()
  1 #a = [1]
  2 
----> 3 rocksfile.apply(release_info, axis = 1, row=1)


TypeError: ("release_info() got multiple values for keyword argument 'row'", u'occurred at index 0')

release_info(1)

jezrael · Accepted Answer · 2017-09-07 11:47:26Z

2

pandas works with arrays (Series, DataFrames), so it is better to use vectorized pandas or NumPy functions than a row-by-row apply. Here the best fit is numpy.where:

import numpy as np

# condition: True where the song was released before 1970
m = rocksfile['Release_Year'] < 1970
# concatenate the two columns
a = rocksfile['Song_Clean'] + " by " + rocksfile['Artist_Clean']
# append a different ending for each case
b = a + " was released before 1970."
c = a + " was released after 1970."

# take b where the condition holds, c otherwise
rocksfile['new'] = np.where(m, b, c)
print(rocksfile)
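
To make the np.where selection concrete, here is a minimal sketch on made-up data. The DataFrame name df, the variable names, and the placeholder songs and artists are assumptions for illustration; only the column names come from the question. It is written for Python 3.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; not taken from the real dataset.
df = pd.DataFrame({
    "Song_Clean": ["Song A", "Song B"],
    "Artist_Clean": ["Artist A", "Artist B"],
    "Release_Year": [1965, 1982],
})

# Boolean Series with one entry per row.
is_early = df["Release_Year"] < 1970
prefix = df["Song_Clean"] + " by " + df["Artist_Clean"]

# np.where(condition, x, y) takes x where condition is True and y elsewhere.
df["new"] = np.where(
    is_early,
    prefix + " was released before 1970.",
    prefix + " was released after 1970.",
)

print(df["new"].tolist())
```

The order of the last two arguments matters: the second is used for rows where the condition is True, the third for all other rows.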


bruno desthuilliers · Accepted Answer · 2017-09-07 11:54:57Z

1

Here:

rocksfile.apply(release_info, axis = 1, row=1)

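row is not one of the arguments DataFrame.apply() expects, so it gets passed through as a keyword argument to release_info(), in addition to the first positional argument. With axis=1 that positional argument is the row itself (a Series), not its index label, so release_info() ends up being called like this:

release_info(row_series, row=1)

The parameter row therefore receives two values, which is exactly what the TypeError reports.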
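
One possible way to restructure the call is sketched below, assuming the function is rewritten to accept the row Series that apply() supplies. The sample data, the cutoff parameter, and the choice to return a string instead of printing are assumptions for illustration, not part of the original question. It is written for Python 3.

```python
import pandas as pd

# Made-up sample data; only the column names follow the question.
rocksfile = pd.DataFrame({
    "Song_Clean": ["Song A", "Song B", "Song C", "Song D", "Song E"],
    "Artist_Clean": ["Artist A", "Artist B", "Artist C", "Artist D", "Artist E"],
    "Release_Year": [1965, 1973, 1969, 1981, 1990],
})


def release_info(row, cutoff=1970):
    """Describe one row (a Series) relative to a cutoff year.

    `cutoff` is a hypothetical extra parameter, added to show how
    additional arguments reach the function.
    """
    when = "before" if row["Release_Year"] < cutoff else "after"
    return "{} by {} was released {} {}.".format(
        row["Song_Clean"], row["Artist_Clean"], when, cutoff
    )


# apply() passes each row as the first positional argument by itself;
# extra arguments go through args=(...) or as keyword arguments whose
# names do not clash with the first parameter.
first_four = rocksfile.head(4).apply(release_info, axis=1)
with_args = rocksfile.head(4).apply(release_info, axis=1, args=(1975,))
with_keyword = rocksfile.head(4).apply(release_info, axis=1, cutoff=1975)

for line in first_four:
    print(line)
```

Returning the string and printing afterwards keeps the function usable for building a new column as well; head(4) restricts the call to the first four rows, as the exercise asks.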


cs95 · Accepted Answer · 2017-09-07 11:50:53Z

0

You can use np.where and reduce this to 1 line.

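import numpy as np
import pandas as pd

# parentheses let the expression span several lines
s = (rocksfile['Song_Clean']
     + ' was released by '
     + rocksfile['Artist_Clean']
     + pd.Series(np.where(rocksfile['Release_Year'] < 1970, ' before', ' after'),
                 index=rocksfile.index)
     + ' 1970')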

rocksfile['new'] = s
