I have the following dataframe:

    col1   col2
0    a      7                    
1    b      3                  
2    c      1                  
3    d      6                  

I'm trying to add a new column to the dataframe, with the value equal to a variable x. This variable will depend on the values of col1 and col2. I have tried:

for row in df:
    row['col3'] = x

However I get the following error:

TypeError: 'tuple' object does not support item assignment
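
For context, here is a small sketch of what the two most common loops actually yield. It assumes the dataframe from the question; `labels`, `first_row` and `error_message` are names made up for illustration. Looping over the dataframe directly yields column labels, not rows, and a message like the one above is what comes from a loop that yields plain tuples, such as `itertuples(name=None)`.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["a", "b", "c", "d"], "col2": [7, 3, 1, 6]})

# Iterating over a DataFrame directly yields its column labels, not its rows.
labels = [item for item in df]

# itertuples(name=None) yields one plain tuple per row; tuples are immutable,
# so item assignment on them raises a TypeError.
first_row = next(df.itertuples(name=None))
try:
    first_row["col3"] = 1
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(labels)
print(error_message)
```

Either way, the loop variable is not a writable handle on the dataframe's row.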

I looked into iterrows(), but I'm not sure this is the right approach. According to the documentation:

"You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect."
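
A minimal sketch of what that warning means in practice, assuming the dataframe from the question (the variable names are illustrative only). Writing to the row object handed out by `iterrows()` leaves the dataframe untouched, whereas writing through the dataframe itself with `df.at` does persist.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["a", "b", "c", "d"], "col2": [7, 3, 1, 6]})

# Each `row` is a separate Series built for the loop; assigning to it
# does not add a column to df.
for idx, row in df.iterrows():
    row["col3"] = "x"
columns_after_iterrows = list(df.columns)

# Writing through the DataFrame persists: create the column first,
# then set one cell per row label.
df["col3"] = ""
for position, idx in enumerate(df.index, start=1):
    df.at[idx, "col3"] = str(position)

print(columns_after_iterrows)
print(df)
```

This per-row loop is shown only to illustrate the copy-versus-view point; a single column-wide assignment is usually simpler and faster.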

Edit - Additional Info:

What I'm trying to do is create a new dataframe with col3 being a string based on a pre-sorted order of the dataframe. For example, the following dataframe:

    col1   col2
0    a      7                    
1    b      3                  
2    c      1                  
3    d      6                  

Should become:

    col1   col2   col3
0    a      7      001              
1    b      3      002            
2    c      1      003            
3    d      6      004            

Where col3 is a string in the format '000' (i.e. with leading zeros where applicable so that the string always contains 3 characters). There will never be more than 999 rows in the dataframe.
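
As a small illustration of the '000' format described above, Python can produce the leading zeros directly, without branching on the size of the number. The helper name `pad3` is hypothetical.

```python
def pad3(n: int) -> str:
    """Return n as a string left-padded with zeros to 3 characters."""
    return f"{n:03d}"

# A few values across the 1..999 range mentioned in the question.
labels = [pad3(n) for n in (1, 10, 100, 999)]

# str.zfill does the same job when the value is already a string.
also_padded = str(7).zfill(3)

print(labels)
print(also_padded)
```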

This is the code I have so far:

x = 1

for row in df:

    if x < 10:
        formatting = str('00' + str(x))
    elif x < 100:
        formatting = str('0' + str(x))
    else:
        formatting = str(str(x))

    x += 1

    row['col3'] = x

However, this seems to change the col3 values for all rows in the dataframe, instead of just the row in the loop. For example, after 4 loops the result is:

    col1   col2   col3
0    a      7      004              
1    b      3      004            
2    c      1      004            
3    d      6      004            
  • In what way does x depend on col1 and col2? Commented Aug 30, 2019 at 4:13
  • You can apply a function to a dataframe using apply. pandas.pydata.org/pandas-docs/stable/reference/api/… Commented Aug 30, 2019 at 4:15
  • what is x here? Commented Aug 30, 2019 at 4:34
  • Can you check edited answer? Commented Aug 30, 2019 at 8:04

1 Answer

EDIT:

A better option here is to use Series.str.zfill on the index values converted to strings:

df['col3'] = (df.index + 1).astype('str').str.zfill(3)
print(df)
  col1  col2 col3
0    a     7  001
1    b     3  002
2    c     1  003
3    d     6  004

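If the index is not the default RangeIndex, create a helper Series that reuses the DataFrame's index. Without index=df.index the assignment aligns on the old labels, so the numbers would not follow the current row order:

df['col3'] = pd.Series(np.arange(1, len(df) + 1), index=df.index).astype('str').str.zfill(3)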
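
To illustrate the sorted case, here is a sketch that assumes the question's dataframe sorted by col2 (the sort key is an assumption, since the question does not say which column is used). Assigning a plain list fills the column by position, so the numbering follows the sorted order regardless of the index labels.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["a", "b", "c", "d"], "col2": [7, 3, 1, 6]})

# After sorting, the index labels are no longer 0..n-1 in order.
sorted_df = df.sort_values("col2")

# A list has no index, so pandas assigns it row by row in the current order.
sorted_df["col3"] = [f"{n:03d}" for n in range(1, len(sorted_df) + 1)]

print(sorted_df)
```

The original index labels are kept here; call `reset_index(drop=True)` afterwards if they are not needed.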

4 Comments

Thanks jezrael. I tried this and it successfully creates col3 with a string value equal to the index + 1. However, since the dataframe will be sorted, the index will not always start at zero, causing col3 to be based on the original index rather than the new order after sorting. Do I need to reset the index after sorting?
@Alan - If not default index, use second solution.
Or, more exactly, use df = df.reset_index(drop=True) and then the first solution.
Thank you :) I'd never previously used zfill - this will be a huge time saver.
