
I am trying to extract regex matches from a column of a pandas DataFrame using this script:

import pandas as pd
df1 = {'data':['1gsmxx,2gsm','abc10gsm','10gsm','18gsm hhh4gsm','Abc:10gsm','5gsmaaab3gsmABC55gsm','abc - 15gsm','3gsm,,ff40gsm','9gsm','VV - fg 8gsm','kk 5gsm 00g','001….abc..5gsm']}
df1 = pd.DataFrame(df1)
df1
df1['Result'] = df1['data'].str.findall(r'(\d{1,3}\s?gsm)')

OR

df2 = df1['data'].str.extractall(r'(\d{1,3}\s?gsm)').unstack()

However, this puts multiple results into one column. Is it possible to get a result like the one attached below?

[image: desired result]

  • Have you tried df1['data'].str.extract(r'(\d{1,3}\s?gsm)')? Commented Jul 16, 2020 at 3:44
  • That returns only one result. Is it possible to keep all results, separated into multiple columns? Commented Jul 16, 2020 at 3:53

1 Answer


Use pandas.Series.str.extractall with unstack.

If you want to keep your original column alongside the matches, use pandas.concat.

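# One row per match, then pivot the match number into columns
df2 = df1['data'].str.extractall(r'(\d{1,3}\s?gsm)').unstack()
# Drop the outer column level (the capture-group number) and attach the original column
df = pd.concat([df1, df2.droplevel(0, axis=1)], axis=1)
print(df)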

Output:

                    data      0      1      2
0            1gsmxx,2gsm   1gsm   2gsm    NaN
1               abc10gsm  10gsm    NaN    NaN
2                  10gsm  10gsm    NaN    NaN
3          18gsm hhh4gsm  18gsm   4gsm    NaN
4              Abc:10gsm  10gsm    NaN    NaN
5   5gsmaaab3gsmABC55gsm   5gsm   3gsm  55gsm
6            abc - 15gsm  15gsm    NaN    NaN
7          3gsm,,ff40gsm   3gsm  40gsm    NaN
8                   9gsm   9gsm    NaN    NaN
9           VV - fg 8gsm   8gsm    NaN    NaN
10           kk 5gsm 00g   5gsm    NaN    NaN
11        001….abc..5gsm   5gsm    NaN    NaN
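
As a variation on the approach above, here is a minimal, self-contained sketch that uses DataFrame.join instead of pandas.concat and gives the match columns readable names. The column names gsm_1, gsm_2, ... and the shortened sample data (including the 'no match here' row) are assumptions made for illustration; they are not part of the original question.

```python
import pandas as pd

# Shortened sample data; the last row is a hypothetical entry with no match
df1 = pd.DataFrame({'data': ['1gsmxx,2gsm', 'abc10gsm',
                             '5gsmaaab3gsmABC55gsm', 'no match here']})

# Raw string so the backslashes reach the regex engine unchanged
pattern = r'(\d{1,3}\s?gsm)'

# extractall returns one row per match, indexed by (original row, match number);
# rows without any match are simply absent from the result
matches = df1['data'].str.extractall(pattern)

# Select the single capture group (column 0) and pivot match numbers into columns
wide = matches[0].unstack()

# Hypothetical naming scheme for the new columns: gsm_1, gsm_2, ...
wide.columns = [f'gsm_{i + 1}' for i in wide.columns]

# Left join on the index keeps every original row, including rows with no match
result = df1.join(wide)
print(result)
```

Because the join is index-based, rows without any match stay in the table with NaN in every match column, which may or may not be what you want.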

3 Comments

Here is the dataset:
import pandas as pd
df1 = {'data':['1gsmxx,2gsm','abc10gsm','10gsm','18gsm hhh4gsm','Abc:10gsm','5gsmaaab3gsmABC55gsm','abc - 15gsm','3gsm,,ff40gsm','9gsm','VV - fg 8gsm','kk 5gsm 00g','001….abc..5gsm']}
df1 = pd.DataFrame(df1)
df2=df1['data'].str.extractall('(\d{1,3}\s?gsm)').unstack()
Is there any way I could keep the ['data'] column in the table?
Thanks for your advice. I have updated the question
@NgCt I've updated my answer accordingly. Please check ;)
