1

I'm trying to iterate over each row in a pandas dataframe named 'cd'. If a specific cell in a row, e.g. [row,empl_accept], contains a substring, I want to update the value of another cell, e.g. [row,empl_accept_a], in the same dataframe.

for row in range(0,len(cd.index),1):
    if 'Master' in cd.at[row,empl_accept]:
        cd.at[row,empl_accept_a] = '1'
    else:
        cd.at[row,empl_accept_a] = '0'

The code above does not work, and Jupyter Notebook displays this error:

TypeError                                 Traceback (most recent call last)
<ipython-input-70-21b1f73e320c> in <module>
      1 for row in range(0,len(cd.index),1):
----> 2     if 'Master' in cd.at[row,empl_accept]:
      3         cd.at[row,empl_accept_a] = '1'
      4     else:
      5         cd.at[row,empl_accept_a] = '0'

TypeError: argument of type 'float' is not iterable

I'm not really sure what the problem is, as the for loop contains no float variable.

2 Answers

2

Please do not use loops for this. You can do this in bulk with:

cd['empl_accept_a'] = cd['empl_accept'].str.contains('Master').astype(int).astype(str)

This will store '0' and '1' in the column. That being said, I am not convinced that storing these as strings is a good idea. You can store them as bools instead with:

cd['empl_accept_a'] = cd['empl_accept'].str.contains('Master')

For example:

>>> cd
    empl_accept  empl_accept_a
0        Master           True
1         Slave          False
2         Slave          False
3  Master Windu           True
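
If the column contains missing values (NaN), which the float in the question's traceback suggests, str.contains can return NaN for those rows depending on the pandas version and dtype, and NaN cannot be converted by a later .astype(int). A minimal sketch, assuming a small made-up frame in place of the real 'cd' (the extra column name empl_accept_str is hypothetical), that passes na=False so missing cells count as "no match":

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real 'cd' dataframe.
cd = pd.DataFrame({'empl_accept': ['Master', 'Slave', np.nan, 'Master Windu']})

# na=False treats missing cells as "does not contain" instead of propagating NaN.
is_master = cd['empl_accept'].str.contains('Master', na=False)

# Boolean flags, plus the '0'/'1' string variant in a separate (hypothetical) column.
cd['empl_accept_a'] = is_master
cd['empl_accept_str'] = is_master.astype(int).astype(str)

print(cd)
```

Whether a missing cell should count as '0' or be left empty is a judgment call about the data; na=False simply picks the first option.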

3 Comments

Thank you for your answer. I already tried the method you mentioned above (using str.contains), and it works flawlessly. I just want to ask: why should we not use a loop for this process?
@flamingheart: because pandas is built to process data in bulk. Its column-wise operations run in compiled code behind the curtain. If you use it to retrieve single elements one at a time, that performance boost is lost.
Thank you very much, that clarified everything. The reason I need the values '0' and '1' in those cells is that I have to export the dataframe to Excel, and the required format expects '0' and '1' there (I actually prefer your solution of storing bool values).
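
To illustrate the point about bulk processing made in these comments, here is a rough sketch on made-up data. It builds the '0'/'1' column once with a row-by-row loop and once with a single vectorized call, then prints how long each took. The frame contents and the timing approach are assumptions for illustration only; actual timings depend on the machine and the data size.

```python
import time

import numpy as np
import pandas as pd

# Hypothetical data: 20,000 rows, a quarter of them missing.
values = ['Master', 'Slave', np.nan, 'Master Windu'] * 5000
cd = pd.DataFrame({'empl_accept': values})

# Row-by-row loop, guarding against non-string cells such as NaN.
start = time.perf_counter()
looped = []
for row in range(len(cd.index)):
    cell = cd.at[row, 'empl_accept']
    looped.append('1' if isinstance(cell, str) and 'Master' in cell else '0')
loop_seconds = time.perf_counter() - start

# Vectorized: one call over the whole column, then map the bools to '0'/'1'.
start = time.perf_counter()
matches = cd['empl_accept'].str.contains('Master', na=False)
vectorized = matches.map({True: '1', False: '0'})
vec_seconds = time.perf_counter() - start

print(f'loop: {loop_seconds:.4f}s, vectorized: {vec_seconds:.4f}s')
```

Both approaches produce the same values, so the choice between them is about speed and readability rather than results.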
0

You need to check what value is stored at [row,empl_accept] in your dataframe. I'm sure there is a numeric value (a float, according to the error message) at that location. Print the value and you'll see the problem, if any.

print(cd.at[row,empl_accept])
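
The float in the error message is most likely NaN, which pandas uses for missing values and which is itself a float. A small sketch, using made-up data in place of the real 'cd', that reproduces the error and lists the rows whose value is not a string:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real 'cd' presumably has an empty cell somewhere.
cd = pd.DataFrame({'empl_accept': ['Master', np.nan, 'Slave']})

value = cd.at[1, 'empl_accept']
print(type(value))  # the missing cell comes back as a float (NaN)

# The 'in' test needs something it can search through; a float is not.
try:
    'Master' in value
except TypeError as exc:
    message = str(exc)
    print(message)

# Rows whose value is not a string, i.e. the ones that break the loop.
not_str = cd[~cd['empl_accept'].map(lambda v: isinstance(v, str))]
print(not_str)
```

Checking isinstance(v, str) rather than only testing for NaN also catches genuine numbers that may have slipped into the column.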

1 Comment

Thank you. I should clean the data before processing it, but the problem still exists even after I fix the dataframe.
