
I have a data frame where I want to add a new column with values based on the index.

This is my fake df:

import numpy as np
import pandas as pd

my = pd.DataFrame({'fruit': [
'Apple', 'Kiwi', 'Clementine', 'Kiwi', 'Banana', 'Clementine', 'Apple', 'Kiwi'],
'bites': [1, 2, 3, 1, 2, 3, 1, 2]})

I have found a similar question and tried the solution there but I get error messages. This is what I tried:

conds = [(my.index >= 0) & (my.index <= row_2),
         (my.index > row_2) & (my.index<=row_5),
         (my.index > row_5) & (my.index<=row_6),
         (my.index > row_6)]


names = ['Donna', 'Kelly', 'Andrea','Brenda']


my['names'] = np.select(conds, names)
  • What are row_2, row_5...? what's the error message that you got? Commented May 23, 2019 at 13:25
  • @QuangHoang I might have missed how I define the rows, the help I took was from this post. The error message is row_2 is not defined which makes me feel stupid since apparently that's the same question you're asking... Commented May 23, 2019 at 13:29

2 Answers


This works for me once the variables are replaced with numbers. I have also added two alternatives: pd.cut with the include_lowest=True parameter so that the value 0 is matched, and selection with DataFrame.loc:

conds = [(my.index >= 0) & (my.index <= 2),
         (my.index > 2) & (my.index<=5),
         (my.index > 5) & (my.index<=6),
         (my.index > 6)]


names = ['Donna', 'Kelly', 'Andrea','Brenda']


my['names'] = np.select(conds, names)
my['names1'] = pd.cut(my.index, [0,2,5,6,np.inf], labels=names, include_lowest=True)

# .loc slices are label-based and include both endpoints
my.loc[:2, 'names2'] = 'Donna'
my.loc[3:5, 'names2'] = 'Kelly'
my.loc[6, 'names2'] = 'Andrea'
my.loc[7:, 'names2'] = 'Brenda'

print (my)
        fruit  bites   names  names1  names2
0       Apple      1   Donna   Donna   Donna
1        Kiwi      2   Donna   Donna   Donna
2  Clementine      3   Donna   Donna   Donna
3        Kiwi      1   Kelly   Kelly   Kelly
4      Banana      2   Kelly   Kelly   Kelly
5  Clementine      3   Kelly   Kelly   Kelly
6       Apple      1  Andrea  Andrea  Andrea
7        Kiwi      2  Brenda  Brenda  Brenda
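
As a self-contained check, the sketch below rebuilds the example frame and compares the np.select and pd.cut versions. One detail is an assumption that is not part of the answer above: the default='unknown' argument. np.select falls back to the integer 0 when no condition matches, and recent NumPy releases may refuse to combine that integer with string choices, so a string fallback is passed here. The label 'unknown' is only a placeholder.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
my = pd.DataFrame({
    'fruit': ['Apple', 'Kiwi', 'Clementine', 'Kiwi',
              'Banana', 'Clementine', 'Apple', 'Kiwi'],
    'bites': [1, 2, 3, 1, 2, 3, 1, 2],
})

names = ['Donna', 'Kelly', 'Andrea', 'Brenda']

# One boolean mask per label, built from the integer row index.
conds = [(my.index >= 0) & (my.index <= 2),
         (my.index > 2) & (my.index <= 5),
         (my.index > 5) & (my.index <= 6),
         (my.index > 6)]

# Assumption: a string default keeps every choice the same type.
my['names'] = np.select(conds, names, default='unknown')

# Same bin edges expressed with pd.cut; include_lowest=True keeps index 0.
my['names1'] = pd.cut(my.index, [0, 2, 5, 6, np.inf],
                      labels=names, include_lowest=True)

print(my)
```

With a non-negative index the four conditions cover every row, so the fallback label should never appear in the result.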

1 Comment

Great! I feel kind of stupid, of course I should've figured that I had to use the row index number....

You can try pd.cut:

df['names'] = (pd.cut(df.index, 
                      [0, 2, 5, 6, np.inf], 
                      labels=names)
                 .fillna(names[0])
              )
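
The fillna(names[0]) step is there because pd.cut builds right-closed bins by default, so the first bin is (0, 2] and index 0 falls outside it. The sketch below illustrates this on a stand-in index of eight rows (idx is a hypothetical name, not from the answer) and shows that filling the gap gives the same labels as include_lowest=True.

```python
import numpy as np
import pandas as pd

names = ['Donna', 'Kelly', 'Andrea', 'Brenda']
bins = [0, 2, 5, 6, np.inf]

# Hypothetical stand-in for df.index: eight rows numbered 0..7.
idx = pd.RangeIndex(8)

# Default bins are right-closed, so 0 is not inside (0, 2].
raw = pd.cut(idx, bins, labels=names)
missing = pd.isna(raw)

# Option 1 (as in the answer): fill the unmatched row with the first label.
filled = raw.fillna(names[0])

# Option 2: ask pd.cut to include the lowest edge in the first bin.
lowest = pd.cut(idx, bins, labels=names, include_lowest=True)

print(missing)
print(list(filled) == list(lowest))
```

Note that fillna would also relabel any other unmatched row, such as a negative index value, as the first name, whereas include_lowest=True only affects the lowest bin edge.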

1 Comment

Thanks! a lot less to write, which to me is better. :)
