Remove element from list in pandas dataframe based on value in column

Question

Let's say I have the following dataframe:

import pandas as pd

a = [[1,2,3,4,5,6],[23,23,212,223,1,12]]
b = [1,1]

df = pd.DataFrame(zip(a,b), columns = ['a', 'b'])

My goal is to remove, from each list in column a, the value found in column b of the same row. My attempt at doing so is below:

df['a'] = [i.remove(j) for i,j in zip(df.a, df.b)]

The logic seems sound to me, but df['a'] ends up as a series of nulls. What is going on here?

AChampion · Accepted Answer · 2019-11-15 20:43:37Z

6

Here's an alternative way of doing it:

In []:
df2 = df.explode('a')
df['a'] = df2.a[df2.a != df2.b].groupby(level=0).apply(list)
df

Out[]:
                        a  b
0         [2, 3, 4, 5, 6]  1
1  [23, 23, 212, 223, 12]  1
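To make the one-liner above easier to follow, here is a rough step-by-step sketch of the same idea. The intermediate names (exploded, kept, regrouped) are mine, not part of the answer, and it assumes a pandas version that has DataFrame.explode (0.25 or later).

```python
import pandas as pd

a = [[1, 2, 3, 4, 5, 6], [23, 23, 212, 223, 1, 12]]
b = [1, 1]
df = pd.DataFrame(list(zip(a, b)), columns=['a', 'b'])

# One row per list element; the original index label is repeated for each element.
exploded = df.explode('a')

# Keep only the elements that differ from the b value of their row.
kept = exploded.loc[exploded['a'] != exploded['b'], 'a']

# Collect the surviving elements back into one list per original index label.
regrouped = kept.groupby(level=0).apply(list)

# Assignment aligns on the index, so each list lands on its original row.
df['a'] = regrouped
print(df)
```

Note that this drops every occurrence of the value, not just the first one. Also, a row whose list ends up empty has no group after filtering, so I would expect it to come back as NaN rather than an empty list.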


Ajit Wadalkar · Accepted Answer · 2019-11-15 20:40:27Z

5

list.remove(x) removes the value in place and returns None, so your list comprehension collects a None for every row. That is why df['a'] ends up as a series of nulls. Instead, you can mutate the lists in a plain loop:

import pandas as pd

a = [[1,2,3,4,5,6],[23,23,212,223,1,12]]
b = [1,1]
df = pd.DataFrame(zip(a,b), columns = ['a', 'b'])
# remove() mutates each list in place; its return value (None) is ignored
for i, j in zip(df.a, df.b):
    i.remove(j)

print(df)

                        a  b
0         [2, 3, 4, 5, 6]  1
1  [23, 23, 212, 223, 12]  1
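If you would rather not mutate the original lists, something like the sketch below might work. The helper name without_first is made up for illustration. It copies each list first, and it also guards against the value being absent, since list.remove raises ValueError in that case.

```python
import pandas as pd

a = [[1, 2, 3, 4, 5, 6], [23, 23, 212, 223, 1, 12]]
b = [1, 1]
df = pd.DataFrame(list(zip(a, b)), columns=['a', 'b'])

# list.remove mutates the list and returns None, which is why a
# comprehension built from its return values holds only None.
sample = [1, 2, 3]
result = sample.remove(1)
print(result)  # None
print(sample)  # [2, 3]

def without_first(values, target):
    """Return a copy of values with the first occurrence of target dropped, if present."""
    out = list(values)
    if target in out:
        out.remove(target)
    return out

# Build new lists instead of mutating the ones stored in the dataframe.
df['a'] = [without_first(i, j) for i, j in zip(df['a'], df['b'])]
print(df)
```

Like list.remove itself, this drops only the first matching element; later duplicates of the value stay in the list.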


Celius Stingher · Accepted Answer · 2019-11-15 20:35:48Z

2

Assuming each row of column b contains a single value, you can put a list comprehension in a function and apply it row by row:

import pandas as pd
a = [[1,2,3,4,5,6],[23,23,212,223,1,12]]
b = [1,1]


df = pd.DataFrame(zip(a,b), columns = ['a', 'b'])
def removing(row):
    val = [x for x in row['a'] if x != row['b']]
    return val
df['c'] = df.apply(removing,axis=1)
print(df)

Output:

                           a  b                       c
0         [1, 2, 3, 4, 5, 6]  1         [2, 3, 4, 5, 6]
1  [23, 23, 212, 223, 1, 12]  1  [23, 23, 212, 223, 12]


BENY · Accepted Answer · 2019-11-15 20:41:02Z

2

Here is what I would do:

s = pd.DataFrame(df.a.tolist(), index=df.index)
df['a'] = s.mask(s.eq(df.b, axis=0)).stack().astype(int).groupby(level=0).apply(list)
Out[264]: 
0           [2, 3, 4, 5, 6]
1    [23, 23, 212, 223, 12]
dtype: object


LocoGris · Accepted Answer · 2019-11-15 21:00:26Z

0

How about this:

b = [[1],[1]] 

df['a'] = df.apply(lambda row: list(set(row['a']).difference(set(row['b']))), axis=1)

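Each entry of b must be a list for this to work, but in exchange you can remove more than one value per row. Keep in mind that converting to a set discards duplicates in a and does not preserve the original order.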

Example:

import pandas as pd
a = [[1,2,3,4,5,6],[23,23,212,223,1,12]]
b = [[1,5],[1,23]]


df = pd.DataFrame(zip(a,b), columns = ['a', 'b'])



df['a'] = df.apply(lambda row: list(set(row['a']).difference(set(row['b']))), axis=1)
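If the order of the elements or repeated values in a matter, a variant along these lines might be closer to what you want. This is only a sketch, and the helper name drop_values is hypothetical. It uses a set solely for the membership test and keeps the original list order.

```python
import pandas as pd

a = [[1, 2, 3, 4, 5, 6], [23, 23, 212, 223, 1, 12]]
b = [[1, 5], [1, 23]]
df = pd.DataFrame(list(zip(a, b)), columns=['a', 'b'])

def drop_values(values, to_drop):
    """Return values without any element found in to_drop, keeping order and duplicates."""
    banned = set(to_drop)  # set lookup keeps the membership test cheap
    return [x for x in values if x not in banned]

df['a'] = [drop_values(i, j) for i, j in zip(df['a'], df['b'])]
print(df)
```

Unlike the set difference, surviving duplicates are kept as they were, and every occurrence of a banned value is removed.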
