0

I have a data frame like this:

df
col1     col2      col3
 ab       1        prab
 cd       2        cdff
 ef       3        eef

I want to remove each row's col1 value from that row's col3 value.

The final data frame should look like this:

df
col1     col2      col3
 ab       1        pr
 cd       2        ff
 ef       3        e

How can I do this in pandas in the most efficient way?

3 Answers

2

Use .apply with str.replace along axis=1, so the function runs once per row:

df['col3'] = df.apply(lambda x: x['col3'].replace(x['col1'], ''), axis=1)

Output

  col1  col2 col3
0   ab     1   pr
1   cd     2   ff
2   ef     3    e
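
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame construction is an assumption rebuilt from the sample data in the question; the replacement line is the one from this answer.

```python
import pandas as pd

# Sample frame rebuilt from the question (assumed construction).
df = pd.DataFrame({
    "col1": ["ab", "cd", "ef"],
    "col2": [1, 2, 3],
    "col3": ["prab", "cdff", "eef"],
})

# Row by row, strip the col1 substring out of col3.
df["col3"] = df.apply(lambda row: row["col3"].replace(row["col1"], ""), axis=1)

print(df)
```

Note that str.replace removes every occurrence of the substring within a value, not only the first one.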

1

A row-wise loop looks unavoidable here, because the substring to remove is different for every row. In that case a list comprehension is noticeably faster than .apply. On the sample frame, .apply gives:

%%timeit
df.apply(lambda x: x['col3'].replace(x['col1'], ''), axis=1)

# 767 µs ± 24.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

whereas the list comprehension gives:

%%timeit
[a.replace(b,'') for a,b in zip(df['col3'], df['col1'])]

# 24.4 µs ± 3.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
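
The timed expression only builds a list; to actually update the frame, the result has to be assigned back to the column. A minimal sketch, assuming the sample frame from the question (the helper name remove_substrings is hypothetical and only there for illustration):

```python
import pandas as pd

def remove_substrings(values, substrings):
    """Hypothetical helper: remove substrings[i] from values[i], pairwise."""
    return [value.replace(sub, "") for value, sub in zip(values, substrings)]

# Sample frame rebuilt from the question (assumed construction).
df = pd.DataFrame({
    "col1": ["ab", "cd", "ef"],
    "col2": [1, 2, 3],
    "col3": ["prab", "cdff", "eef"],
})

# Assign the list back; it has one entry per row, in row order.
df["col3"] = remove_substrings(df["col3"], df["col1"])

print(df["col3"].tolist())  # ['pr', 'ff', 'e']
```

The timings above were measured on a three-row frame, so the gap may look different on larger data; it is worth re-running %%timeit on a realistic size.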

0

Suppose df is a plain list of rows (a list of lists) rather than a pandas DataFrame:

df = [["ab",1,"prab"],["cd",2,"cdff"],["ef",3,"eef"]]

To remove the key (col1) from the value (col3) in each row:

for row in df:
  row[2] = row[2].replace(row[0],"")

According to the documentation for str.replace, every occurrence of col1 is replaced by an empty string, "".
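
As a quick check of that behavior, here is a small sketch in plain Python. The variable names (rows, all_removed, first_only) and the "abprab" test string are made up for illustration; the optional third argument of str.replace limits how many occurrences are replaced.

```python
# Same data as above, as a plain list of rows.
rows = [["ab", 1, "prab"], ["cd", 2, "cdff"], ["ef", 3, "eef"]]

for row in rows:
    # Remove the key (index 0) from the value (index 2).
    row[2] = row[2].replace(row[0], "")

print(rows)  # [['ab', 1, 'pr'], ['cd', 2, 'ff'], ['ef', 3, 'e']]

# By default every occurrence is removed...
all_removed = "abprab".replace("ab", "")    # 'pr'
# ...while a count of 1 removes only the first occurrence.
first_only = "abprab".replace("ab", "", 1)  # 'prab'
```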
