I have the following data:

import pandas as pd

df = pd.DataFrame({'Column_A': [1, 2, 3, 4],
                   'Column_B': [["X1", "X2", "Y1"],
                                ["X3", "Y2"],
                                ["X4", "X5"],
                                ["X5", "Y3", "Y4"]]})

   Column_A      Column_B
0         1  [X1, X2, Y1]
1         2      [X3, Y2]
2         3      [X4, X5]
3         4  [X5, Y3, Y4]

I wish to remove all strings starting with Y in the second column. Desired output:

   Column_A  Column_B
0         1  [X1, X2]
1         2      [X3]
2         3  [X4, X5]
3         4      [X5]

1 Answer

Use a nested list comprehension that filters with startswith:

df['Column_B'] = [[y for y in x if not y.startswith('Y')] for x in df['Column_B']]

Alternatively, use apply:

df['Column_B'] = df['Column_B'].apply(lambda x: [y for y in x if not y.startswith('Y')])

Or use filter:

df['Column_B'] = [list(filter(lambda y: not y.startswith('Y'), x)) for x in df['Column_B']]

print(df)
   Column_A  Column_B
0         1  [X1, X2]
1         2      [X3]
2         3  [X4, X5]
3         4      [X5]
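
For reuse, the same comprehension can be wrapped in a small helper. This is only a sketch: the name drop_prefixed and its prefixes parameter are hypothetical and not part of pandas, and skipping non-string items is an assumption about how mixed lists should be treated.

```python
import pandas as pd


def drop_prefixed(values, prefixes=("Y",)):
    """Return a new list without the strings that start with any given prefix.

    Hypothetical helper for illustration; not part of pandas.
    """
    # str.startswith accepts a tuple, so several prefixes can be checked at once
    prefixes = tuple(prefixes)
    # Assumption: items that are not strings are kept unchanged
    return [v for v in values
            if not (isinstance(v, str) and v.startswith(prefixes))]


df = pd.DataFrame({
    "Column_A": [1, 2, 3, 4],
    "Column_B": [["X1", "X2", "Y1"],
                 ["X3", "Y2"],
                 ["X4", "X5"],
                 ["X5", "Y3", "Y4"]],
})

# Same row-by-row loop as the nested list comprehension above
df["Column_B"] = [drop_prefixed(x) for x in df["Column_B"]]
print(df)
```

Passing something like prefixes=("Y", "Z") would drop both groups in one pass, since startswith tests every prefix in the tuple.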

Performance:

Performance depends on the number of rows, the number of values in each list, and the number of matched values:

#[40000 rows x 2 columns]
df = pd.concat([df] * 10000, ignore_index=True)
#print (df)


In [142]: %timeit df['Column_B'] = [[y for y in x if not y.startswith('Y')] for x in df['Column_B']]
23.7 ms ± 410 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [143]: %timeit df['Column_B'] = [list(filter(lambda y: not y.startswith('Y'), x)) for x in df['Column_B']]
36.5 ms ± 204 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [144]: %timeit df['Column_B'] = df['Column_B'].apply(lambda x: [y for y in x if not y.startswith('Y')])
30.4 ms ± 1.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
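
Outside IPython, a comparison like the one above could be reproduced with the standard timeit module. The following is a rough sketch: the function names are made up for illustration, and the numbers it prints will vary with hardware, pandas version, and the data, so they may not match the timings shown.

```python
import timeit

import pandas as pd

base = pd.DataFrame({
    "Column_A": [1, 2, 3, 4],
    "Column_B": [["X1", "X2", "Y1"],
                 ["X3", "Y2"],
                 ["X4", "X5"],
                 ["X5", "Y3", "Y4"]],
})
# Repeat the 4 sample rows 10000 times, giving 40000 rows as in the test above
big = pd.concat([base] * 10000, ignore_index=True)


def with_comprehension(col):
    return [[y for y in x if not y.startswith("Y")] for x in col]


def with_filter(col):
    return [list(filter(lambda y: not y.startswith("Y"), x)) for x in col]


def with_apply(col):
    return col.apply(lambda x: [y for y in x if not y.startswith("Y")])


for fn in (with_comprehension, with_filter, with_apply):
    # Best of 3 repeats, each averaging 3 calls, to reduce noise
    best = min(timeit.repeat(lambda fn=fn: fn(big["Column_B"]),
                             number=3, repeat=3)) / 3
    print(f"{fn.__name__}: {best * 1000:.1f} ms per call")
```

Each function only reads the column and returns a new result, so every run sees the same unfiltered input.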