How to take part of string value from column in DataFrame in Python Pandas?

Question

I have DataFrame in Python Pandas like below ("col1" is as string data type):

col1
-------
ACC_P:Indiv|ACC_K:3886|GROUP:gr1|COK:P1K
ACC_P:Group|ACC_K:42|GROUP:gr2|COK:P1C
ACC_P:Indiv|ACC_K:455|GROUP:gpk22|COK:AD2
...

And I need to create new column "col2" where will be only value between "GROUP:" and "|" from values in "col1", so as a result I need something like below:

col1                                        | col2
--------------------------------------------|-------
ACC_P:Indiv|ACC_K:3886|GROUP:gr1|COK:P1K    | gr1
ACC_P:Group|ACC_K:42|GROUP:gr2|COK:P1C      | gr2
ACC_P:Indiv|ACC_K:455|GROUP:gpk22|COK:AD2   | gpk22
...                                         | ...

How can I do that in Python Pandas ?

Chris · Accepted Answer · 2022-08-18 13:12:09Z

1

You could use regular expressions to extract it. Something like capture everything after GROUP: up until the |

df['col2'] = df.col1.str.extract('GROUP:(.*)\|')

Another less elegant, but perhaps more intuitive option if you aren't familiar w/regex.

Split on the pipe, take the second from last item, split on : and take the last element from that. This of course works only if the structure of your col1 values always follows the same format.

df.col1.str.split('|').str[-2].str.split(':').str[-1]

Output

                                        col1   col2
0   ACC_P:Indiv|ACC_K:3886|GROUP:gr1|COK:P1K    gr1
1     ACC_P:Group|ACC_K:42|GROUP:gr2|COK:P1C    gr2
2  ACC_P:Indiv|ACC_K:455|GROUP:gpk22|COK:AD2  gpk22

answered Aug 18, 2022 at 13:12

Chris

16.3k3 gold badges26 silver badges41 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

wizz · Accepted Answer · 2022-08-18 13:15:59Z

0

The more elegant way to do it would be through regex but here is my two cents:

d.col1.str.split('GROUP:').apply(lambda x: x[1].split('|')[0])

answered Aug 18, 2022 at 13:15

wizz

2512 silver badges15 bronze badges

Collectives™ on Stack Overflow

How to take part of string value from column in DataFrame in Python Pandas?

2 Answers 2

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related