I'm trying to extract text from a column with a regex pattern in Python so I can move it to another column, while keeping the remaining text in the current column. With my current attempt I lose part of the remaining text.

My code is:

import pandas as pd
df = pd.DataFrame({
    'col': ['abcd (30-10) hijk', 'hijk (200-100) abcd', 'abcd (100 FS) hijk', 'hijk (100+) abcd', 'abcd (1000-2000) hijk' ]
})

pattern = r"(abcd\d*)\s(\(.*\))"

df['remainingcol'] = df['col'].str.extract(pattern)[0]
df['newcol'] = df['col'].str.extract(pattern)[1]

print(df)

Output is:

                     col remainingcol       newcol
0      abcd (30-10) hijk         abcd      (30-10)
1    hijk (200-100) abcd         hijk    (200-100)
2     abcd (100 FS) hijk         abcd     (100 FS)
3       hijk (100+) abcd         hijk       (100+)
4  abcd (1000-2000) hijk         abcd  (1000-2000)

Output should be

                     col remainingcol       newcol
0      abcd (30-10) hijk    abcd hijk      (30-10)
1    hijk (200-100) abcd    hijk abcd    (200-100)
2     abcd (100 FS) hijk    abcd hijk     (100 FS)
3       hijk (100+) abcd    hijk abcd       (100+)
4  abcd (1000-2000) hijk    abcd hijk  (1000-2000)

I tried Tim's solution, but newcol comes out wrong. This is the output I get:

              col                        remainingcol      newcol
0     abcd (30-10) abcd                     abcd abcd        
1  abcd (200-100) abcd                     abcd abcd         
2    abcd (100 FS) abcd                    abcd abcd         
3      abcd (100+) abcd                    abcd abcd         
4  abcd (1000-2000) ZZZ                     abcd ZZZ    


1 Answer


For the remaining column I would use str.replace with a regex replacement:

df['remainingcol'] = df['col'].str.replace(r'\s+\(.*?\)\s+', ' ', regex=True)

For the new column, I would use str.replace with a capture group:

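df['newcol'] = df['col'].str.replace(r'.*(\(.*?\)).*', r'\1', regex=True)

The replacement has to be the raw string r'\1'. A plain '\1' is the control character \x01 rather than a reference to the capture group, and it shows up as an unprintable character in newcol.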
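Put together on the sample data from the question, the two replacements might look like the sketch below. It assumes every row contains exactly one parenthesized group with whitespace on both sides, and it passes regex=True explicitly because recent pandas versions treat the pattern as a literal string by default.

```python
import pandas as pd

df = pd.DataFrame({
    'col': ['abcd (30-10) hijk', 'hijk (200-100) abcd', 'abcd (100 FS) hijk',
            'hijk (100+) abcd', 'abcd (1000-2000) hijk']
})

# Drop the parenthesized group and the whitespace around it, leaving one space.
df['remainingcol'] = df['col'].str.replace(r'\s+\(.*?\)\s+', ' ', regex=True)

# Keep only the parenthesized group; r'\1' refers back to capture group 1.
df['newcol'] = df['col'].str.replace(r'.*(\(.*?\)).*', r'\1', regex=True)

print(df)
```

If some rows might not contain parentheses at all, df['col'].str.extract(r'(\(.*?\))', expand=False) may be a safer choice for newcol: it yields NaN for those rows, whereas str.replace leaves the original string unchanged.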

1 Comment

Hi Tim I get this char in newcol
