0

I have the following dataframe:

df = pd.DataFrame({'ISIN': ['A1kT23', '4523', 'B333', '49O33'], 'Name': ['Example A', 'Name Xy', 'Example B', 'Test123'], 'Sector_x': ['test1', 'test2', 'test3', 'test4'],  'Sector_y': ['abc', '', '', 'xyz']})

I would like to replace the value in column Sector_y with the value from column Sector_x wherever Sector_y == '',

so that I get the following result:

df = pd.DataFrame({'ISIN': ['A1kT23', '4523', 'B333', '49O33'], 'Name': ['Example A', 'Name Xy', 'Example B', 'Test123'], 'Sector_x': ['test1', 'test2', 'test3', 'test4'],  'Sector_y': ['abc', 'test2', 'test3', 'xyz']})

I tried using the code

df['Sector_y'] = np.where('',['Sector_x'],['Sector_y'])

but it didn't deliver the result I wanted.

Any suggestions on how to solve the problem?

Your syntax is off: df['Sector_y'] = np.where(df['Sector_y'] == '', df['Sector_x'], df['Sector_y']). Generally speaking, you should always put the dataframe in front of the column name; otherwise you are passing a list containing one string instead of a dataframe Series. If a pandas method expects a list of column names, as groupby does, then the list syntax works, but np.where takes a boolean condition followed by the two sets of values to choose from (each a Series, an array, or a scalar such as a string). Commented Jul 6, 2021 at 22:23
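To illustrate the point made in this comment, here is a minimal sketch showing the difference between a list that holds a column name and the column itself. The two-row dataframe and the variable names are made up for illustration.

```python
import pandas as pd

# Small illustrative dataframe (not the full one from the question)
df = pd.DataFrame({'Sector_x': ['test1', 'test2'], 'Sector_y': ['abc', '']})

# A plain Python list containing one string: pandas is not involved at all
as_list = ['Sector_x']

# The actual column, as a pandas Series with one value per row
as_series = df['Sector_x']

# An element-wise comparison yields a boolean Series, which is what np.where needs
condition = df['Sector_y'] == ''

print(type(as_list).__name__, type(as_series).__name__)
print(condition.tolist())
```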

3 Answers

3

You can use .loc to select the rows where Sector_y is an empty string, target the column Sector_y, and assign the values from column Sector_x (pandas aligns them on the index), as follows:

df.loc[df['Sector_y'] =='', 'Sector_y'] = df['Sector_x']

Result:

print(df)

     ISIN       Name Sector_x Sector_y
0  A1kT23  Example A    test1      abc
1    4523    Name Xy    test2    test2
2    B333  Example B    test3    test3
3   49O33    Test123    test4      xyz
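For reference, a self-contained version of the .loc approach that can be run as-is. It is only a sketch using the sample data from the question; the variable name empty is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ISIN': ['A1kT23', '4523', 'B333', '49O33'],
    'Name': ['Example A', 'Name Xy', 'Example B', 'Test123'],
    'Sector_x': ['test1', 'test2', 'test3', 'test4'],
    'Sector_y': ['abc', '', '', 'xyz'],
})

# Boolean mask: True for rows whose Sector_y is an empty string
empty = df['Sector_y'] == ''

# Assign only to the masked rows; the right-hand side is aligned on the index,
# so each row receives the Sector_x value from the same row
df.loc[empty, 'Sector_y'] = df.loc[empty, 'Sector_x']

print(df)
```

Because the assignment only touches the masked rows, the other values in Sector_y are left as they were.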


1

Fix the np.where call: pass a boolean condition built from the column, followed by the two columns as Series:

df['Sector_y'] = np.where(df['Sector_y'] =='', df['Sector_x'], df['Sector_y'])
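A complete, runnable sketch of this fix with the imports it needs and the sample data from the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ISIN': ['A1kT23', '4523', 'B333', '49O33'],
    'Name': ['Example A', 'Name Xy', 'Example B', 'Test123'],
    'Sector_x': ['test1', 'test2', 'test3', 'test4'],
    'Sector_y': ['abc', '', '', 'xyz'],
})

# Where the condition is True take Sector_x, otherwise keep Sector_y
df['Sector_y'] = np.where(df['Sector_y'] == '', df['Sector_x'], df['Sector_y'])

print(df['Sector_y'].tolist())
```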


-1

Another option is to apply a function row-wise and combine the two columns with the or operator, taking advantage of the fact that empty strings evaluate to False:

df['Sector_y'] = df.apply(lambda row: row['Sector_y'] or row['Sector_x'], axis=1)

output:

     ISIN       Name Sector_x Sector_y
0  A1kT23  Example A    test1      abc
1    4523    Name Xy    test2    test2
2    B333  Example B    test3    test3
3   49O33    Test123    test4      xyz
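One caveat worth checking before relying on the or trick: it depends on Python truthiness, and a missing value stored as NaN is truthy, so it would not be replaced. The sketch below uses a made-up three-row dataframe (the NaN row is an assumption, not part of the question's data) to show this.

```python
import numpy as np
import pandas as pd

# Illustrative data: one normal value, one empty string, one NaN
df = pd.DataFrame({
    'Sector_x': ['test1', 'test2', 'test3'],
    'Sector_y': ['abc', '', np.nan],
})

# '' is falsy, so the second row falls back to Sector_x;
# NaN is truthy, so the third row stays NaN
filled = df.apply(lambda row: row['Sector_y'] or row['Sector_x'], axis=1)

print(filled.tolist())
```

If NaN values should be replaced as well, they need to be handled separately, for example by also testing the column with isna().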
