Python - .apply() function returning entire column of rows in one column

Question

First time posting; apologies for formatting errors. I have a data set that contains age ranges in separate columns, and I'm trying to create a new column based on a string evaluation of the AGE_OPERATOR_TXT column:

I've tried using .apply() functions, lambda, for loops with iterrows(), etc... but either I can't get anything to return, or the function returns a series with ALL of the rows:

def multum_age_ops(s):
    if s == "<":
        return data['AGE_LOW_NBR'] + " + " + data['AGE_UNIT_DISP']
    else:
        return 0

data['age_op_test'] = data['AGE_OPERATOR_TXT'].apply(multum_age_ops)

I would expect that the column returned would actually look something like:

age_ops_test
0 0
1 18 + years
2 1 + months
3 4 + months
4 4 + months

What I'm getting is:

age_ops_test
0                                                        0
1        0        18\n1        18\n2         1\n3      ...
2        0        18\n1        18\n2         1\n3      ...
3        0        18\n1        18\n2         1\n3      ...
4        0        18\n1        18\n2         1\n3      ...
5        0        18\n1        18\n2         1\n3      ...
6        0        18\n1        18\n2         1\n3      ...

Any help is appreciated.

Because you are returning a Series: data['AGE_LOW_NBR'] + " + " + data['AGE_UNIT_DISP'] is the entire column, not a single value.
– juanpa.arrivillaga, Commented Aug 7, 2019 at 20:30
If you really want to do this with apply, you should apply across the whole dataframe with axis=1. Perhaps consider just concatenating element-wise using the Series syntax?
– ifly6, Commented Aug 7, 2019 at 20:37
Thanks for the quick replies! Again, first question, so I should have mentioned that I did try data['age_op_test'] = data['AGE_OPERATOR_TXT'].apply(multum_age_ops, axis=1), but it returns an "unexpected argument" error.
– keith_o, Commented Aug 8, 2019 at 17:37
My final workaround (not very Pythonic) was to create the first column based on sinanggul's suggestion, then create a second that evaluates and, instead of returning 0 in the else clause, returns the first column. Then a third that evaluates as above and returns the second column in the else clause.
– keith_o, Commented Aug 8, 2019 at 17:56
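
To make the point from the comments concrete, here is a minimal sketch of what the question's function returns for a single value. The sample rows are made up to resemble the expected output in the question, and the ">" operator in the first row is only a placeholder for "anything other than <". The extra-keyword behaviour of Series.apply shown at the end is what I would expect from pandas, but treat it as an assumption for your installed version.

```python
import pandas as pd

# Illustrative data only; the real data set is not shown in the question.
data = pd.DataFrame({
    "AGE_OPERATOR_TXT": [">", "<", "<", "<", "<"],
    "AGE_LOW_NBR": ["18", "18", "1", "4", "4"],
    "AGE_UNIT_DISP": ["years", "years", "months", "months", "months"],
})

def multum_age_ops(s):
    # s is one scalar from AGE_OPERATOR_TXT, but the return value below
    # is built from whole columns of the global dataframe.
    if s == "<":
        return data["AGE_LOW_NBR"] + " + " + data["AGE_UNIT_DISP"]
    else:
        return 0

one_call = multum_age_ops("<")
print(type(one_call).__name__)      # Series
print(len(one_call) == len(data))   # True: one call returns every row

# Series.apply forwards extra keyword arguments to the function itself,
# so axis=1 ends up as an argument multum_age_ops does not accept.
try:
    data["AGE_OPERATOR_TXT"].apply(multum_age_ops, axis=1)
except TypeError as exc:
    print("TypeError:", exc)
```

So each matching cell receives the full concatenated column, which is the repeated "0 18\n1 18\n2 1..." text in the output above.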

sinanggul · Accepted Answer · 2019-08-07 21:06:37Z

1

As mentioned in ifly6's comment, the key is to call apply on the entire dataframe with axis=1 so that the function/lambda is applied to each row. In your case, that would look like this:

data['age_op_test'] = data.apply(lambda row: row['AGE_LOW_NBR'] + " + " + row['AGE_UNIT_DISP'] if row['AGE_OPERATOR_TXT'] == "<" else "0", axis=1)
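
If the lambda gets hard to read, the same row-wise idea can be written as a named function, which also leaves room for more branches later. This is only a sketch: the sample rows are invented to match the expected output in the question, and the ">" value in the first row is a placeholder for a non-matching operator.

```python
import pandas as pd

# Illustrative data only; column names are taken from the question.
data = pd.DataFrame({
    "AGE_OPERATOR_TXT": [">", "<", "<", "<", "<"],
    "AGE_LOW_NBR": ["18", "18", "1", "4", "4"],
    "AGE_UNIT_DISP": ["years", "years", "months", "months", "months"],
})

def multum_age_ops(row):
    # With axis=1, row is a single row of the dataframe (a Series keyed
    # by column name), so each lookup below yields one scalar.
    if row["AGE_OPERATOR_TXT"] == "<":
        return row["AGE_LOW_NBR"] + " + " + row["AGE_UNIT_DISP"]
    # Additional elif branches for other operators would go here.
    return "0"

data["age_op_test"] = data.apply(multum_age_ops, axis=1)
print(data["age_op_test"].tolist())
# ['0', '18 + years', '1 + months', '4 + months', '4 + months']
```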


1 Comment

keith_o Over a year ago

This worked perfectly - now I need to extend it to include two more "elif" clauses. Thanks for the help!!

lvzxy · Accepted Answer · 2019-08-07 23:14:17Z

1

You can also use np.where (doc):

data['age_op_test'] = np.where(data['AGE_OPERATOR_TXT'] == "<", data['AGE_LOW_NBR'] + " + " + data['AGE_UNIT_DISP'],0)

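np.where evaluates the condition element by element: where data['AGE_OPERATOR_TXT'] == "<" is True, it takes the corresponding value of data['AGE_LOW_NBR'] + " + " + data['AGE_UNIT_DISP']; where it is False, it takes 0. Note that this 0 is the integer, not the string "0", so pass "0" instead if the column should hold only strings.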
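
For more than one condition (the "elif" case mentioned under the other answer), np.select is a related NumPy function that takes a list of conditions and a matching list of choices. The sketch below is an assumption-heavy illustration: the sample rows, the ">" operator, and the " - " output format for it are all invented, since the question only specifies the "<" case.

```python
import numpy as np
import pandas as pd

# Illustrative data only; the "=" and ">" operators are hypothetical.
data = pd.DataFrame({
    "AGE_OPERATOR_TXT": ["=", "<", ">", "<", "<"],
    "AGE_LOW_NBR": ["18", "18", "1", "4", "4"],
    "AGE_UNIT_DISP": ["years", "years", "months", "months", "months"],
})

low = data["AGE_LOW_NBR"].astype(str)
unit = data["AGE_UNIT_DISP"].astype(str)

# Conditions are checked in order; the first True one wins for each row.
conditions = [
    data["AGE_OPERATOR_TXT"] == "<",
    data["AGE_OPERATOR_TXT"] == ">",   # hypothetical second operator
]
choices = [
    low + " + " + unit,
    low + " - " + unit,                # hypothetical output format
]

# default is used for rows where no condition is True.
data["age_op_test"] = np.select(conditions, choices, default="0")
print(data["age_op_test"].tolist())
# ['0', '18 + years', '1 - months', '4 + months', '4 + months']
```

This avoids chaining intermediate columns, because every branch is resolved in one vectorized call.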

edited Aug 7, 2019 at 23:14

answered Aug 7, 2019 at 22:35


2 Comments

ifly6 Over a year ago

There's no pd prefix before np

lvzxy Over a year ago

You are correct @ifly6, thank you. Answer was edited to reflect change.

Benoit Drogou · Accepted Answer · 2019-08-07 21:15:16Z

0

You could try something like this (df here stands for the dataframe called data in the question):


df.loc[df['AGE_OPERATOR_TXT']=='<', "age_op_test"] = df["AGE_LOW_NBR"].astype(str).str.cat(df["AGE_UNIT_DISP"].astype(str), sep=" + ")
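
One thing to be aware of with the line above: rows where the operator is not "<" are never assigned, so they end up as NaN in the new column. A minimal sketch of one way to handle that, assuming you want "0" there as in the question, is to fill the column with the default first and then overwrite the matching rows. The sample rows and the ">" operator are invented for illustration.

```python
import pandas as pd

# Illustrative data only; column names are taken from the question.
df = pd.DataFrame({
    "AGE_OPERATOR_TXT": [">", "<", "<", "<", "<"],
    "AGE_LOW_NBR": ["18", "18", "1", "4", "4"],
    "AGE_UNIT_DISP": ["years", "years", "months", "months", "months"],
})

# Start with the default value in every row.
df["age_op_test"] = "0"

# Boolean mask selecting the rows to overwrite.
mask = df["AGE_OPERATOR_TXT"] == "<"

# Element-wise concatenation of the two columns, joined by " + ".
combined = df["AGE_LOW_NBR"].astype(str).str.cat(
    df["AGE_UNIT_DISP"].astype(str), sep=" + "
)

# Assign only the masked rows; pandas aligns the values on the index.
df.loc[mask, "age_op_test"] = combined[mask]
print(df["age_op_test"].tolist())
# ['0', '18 + years', '1 + months', '4 + months', '4 + months']
```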
