I'm trying to extract text from a column with a regex pattern in Python (pandas) so I can move that text to another column, but some results get lost. At the same time, I need to keep the remaining strings in the current column.
My code is:
import pandas as pd
df = pd.DataFrame({
    'col': ['abcd (30-10) hijk', 'hijk (200-100) abcd', 'abcd (100 FS) hijk', 'hijk (100+) abcd', 'abcd (1000-2000) hijk']
})
pattern = r"(abcd\d*)\s(\(.*\))"
df['remainingcol'] = df['col'].str.extract(pattern)[0]
df['newcol'] = df['col'].str.extract(pattern)[1]
print(df)
Output is:
                     col  remainingcol       newcol
0      abcd (30-10) hijk          abcd      (30-10)
1    hijk (200-100) abcd          hijk    (200-100)
2     abcd (100 FS) hijk          abcd     (100 FS)
3       hijk (100+) abcd          hijk       (100+)
4  abcd (1000-2000) hijk          abcd  (1000-2000)
Output should be:
                     col  remainingcol       newcol
0      abcd (30-10) hijk     abcd hijk      (30-10)
1    hijk (200-100) abcd     hijk abcd    (200-100)
2     abcd (100 FS) hijk     abcd hijk     (100 FS)
3       hijk (100+) abcd     hijk abcd       (100+)
4  abcd (1000-2000) hijk     abcd hijk  (1000-2000)
I tried Tim's solution, but I get the output below; there is an issue with newcol:
col remainingcol newcol
0 abcd (30-10) abcd abcd abcd
1 abcd (200-100) abcd abcd abcd
2 abcd (100 FS) abcd abcd abcd
3 abcd (100+) abcd abcd abcd
4 abcd (1000-2000) ZZZ abcd ZZZ
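For comparison, here is a minimal sketch of one approach that should give the expected output shown earlier for the sample data. It is not Tim's solution. It assumes each value contains exactly one parenthesized group with no nested parentheses; the name `paren` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'col': ['abcd (30-10) hijk', 'hijk (200-100) abcd', 'abcd (100 FS) hijk',
            'hijk (100+) abcd', 'abcd (1000-2000) hijk']
})

# Assumption: exactly one "(...)" group per value, without nested parentheses.
paren = r'(\([^)]*\))'

# newcol: the parenthesized text, wherever it sits in the string.
df['newcol'] = df['col'].str.extract(paren, expand=False)

# remainingcol: the original text with the parenthesized part (and the
# whitespace around it) replaced by a single space.
df['remainingcol'] = (
    df['col']
    .str.replace(r'\s*\([^)]*\)\s*', ' ', regex=True)
    .str.strip()
)

print(df[['col', 'remainingcol', 'newcol']])
```

Because the pattern does not anchor on a literal prefix such as `abcd`, rows that start with `hijk` are handled the same way as the others.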
