I have a DataFrame with the following values:

        Bird    Color
   0    Parrot  ['Light_Blue','Green','Dark_Blue']
   1    Eagle   ['Sky_Blue','Black','White', 'Yellow','Gray']
   2    Seagull ['White','Jet_Blue','Pink', 'Tan','Brown', 'Purple']

I want to create a column called 'No Blue' that lists only the array elements that do not contain the word "Blue".

Like this:

    Bird    Color                                                No Blue
0   Parrot  ['Light_Blue','Green','Dark_Blue']                   ['Green']
1   Eagle   ['Sky_Blue','Black','White', 'Yellow','Gray']        ['Black', 'White', 'Yellow', 'Gray']
2   Seagull ['White','Jet_Blue','Pink', 'Tan','Brown', 'Purple'] ['White', 'Pink', 'Tan', 'Brown', 'Purple']

This is the closest thing I have to a solution, but it only handles a single list:

>>> Eagle = ['Sky_Blue','Black','White', 'Yellow','Gray']
>>> matching = [x for x in Eagle if "Blue" not in x]
>>> matching
['Black', 'White', 'Yellow', 'Gray']
  • I'm wondering how to do it using str.extract or str.replace etc Commented Aug 15, 2019 at 12:36
  • Since you are iterating over a list within each row, I think using .apply(lambda...) is more efficient Commented Aug 15, 2019 at 12:54

3 Answers

1

I would use this code:

df["noBlue"]=df.Color.apply(lambda x: [v for v in x if "Blue" not in v])
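For reference, here is a self-contained sketch of that one-liner. It assumes the Color column holds real Python lists (not strings that look like lists); the frame is rebuilt from the question, and the helper name drop_blue is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question; Color holds actual Python lists.
df = pd.DataFrame({
    "Bird": ["Parrot", "Eagle", "Seagull"],
    "Color": [
        ["Light_Blue", "Green", "Dark_Blue"],
        ["Sky_Blue", "Black", "White", "Yellow", "Gray"],
        ["White", "Jet_Blue", "Pink", "Tan", "Brown", "Purple"],
    ],
})


def drop_blue(colors):
    """Return the colors that do not contain the substring 'Blue'."""
    return [c for c in colors if "Blue" not in c]


# Apply the filter to every row's list.
df["noBlue"] = df["Color"].apply(drop_blue)
print(df["noBlue"].tolist())
```

The check is a case-sensitive substring test, so a value such as 'blue' or 'BLUE' would be kept.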

0

I'm running this from the command line, so bear with my prints:

import pandas as pd
a = {'Bird':['Parrot','Eagle','Seagull'],'Color':[['Light_Blue','Green','Dark_Blue'],['Sky_Blue','Black','White', 'Yellow','Gray'],['White','Jet_Blue','Pink', 'Tan','Brown', 'Purple']]}
df = pd.DataFrame(a)
print(df)

This reproduces your data:

      Bird                                        Color
0   Parrot               [Light_Blue, Green, Dark_Blue]
1    Eagle       [Sky_Blue, Black, White, Yellow, Gray]
2  Seagull  [White, Jet_Blue, Pink, Tan, Brown, Purple]

This will create your new column based on a condition:

df["Color_Not_Blue"] = df["Color"].apply(lambda colors: [c for c in colors if "Blue" not in c])
print(df)

Output:

      Bird                                        Color                           Color_Not_Blue
0   Parrot               [Light_Blue, Green, Dark_Blue]                            [Green]
1    Eagle       [Sky_Blue, Black, White, Yellow, Gray]       [Black, White, Yellow, Gray]
2  Seagull  [White, Jet_Blue, Pink, Tan, Brown, Purple]  [White, Pink, Tan, Brown, Purple]


0

Try this (it assumes the Color column holds strings rather than Python lists):

>>> df['Color'].str.replace(r'\w+_Blue\b', '', regex=True)
0                                 ['','Green','']
1           ['','Black','White', 'Yellow','Gray']
2    ['White','','Pink', 'Tan','Brown', 'Purple']

Out of curiosity, I opened another SO thread to get this working with replace, and got the solution below, which requires pandas 0.25 or later.

See that thread for other solutions.

df['Color'].str.replace(r'\w+_Blue\b', '', regex=True).explode().loc[lambda x: x != ''].groupby(level=0).apply(list)
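If the Color column holds real Python lists, as in the DataFrame built in the other answer, a related sketch is to explode first and filter afterwards. This is a variant under that assumption, not the code from the linked thread; it needs pandas 0.25 or later for explode.

```python
import pandas as pd

# Sample data rebuilt from the question; Color holds actual Python lists.
df = pd.DataFrame({
    "Bird": ["Parrot", "Eagle", "Seagull"],
    "Color": [
        ["Light_Blue", "Green", "Dark_Blue"],
        ["Sky_Blue", "Black", "White", "Yellow", "Gray"],
        ["White", "Jet_Blue", "Pink", "Tan", "Brown", "Purple"],
    ],
})

no_blue = (
    df["Color"]
    .explode()                               # one row per color, original index kept
    .loc[lambda s: ~s.str.contains("Blue")]  # drop every color containing "Blue"
    .groupby(level=0)                        # regroup by the original row index
    .agg(list)                               # collect the survivors back into lists
)

# Assignment aligns on the index; a row whose colors were all removed
# would end up as NaN here rather than an empty list.
df["No Blue"] = no_blue
print(df["No Blue"].tolist())
```

For list-valued columns the plain .apply with a list comprehension is shorter; the explode route is mainly useful when you want to stay within pandas string methods.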
