I have a pandas DataFrame with a State column. From each value I need to extract the token that ends in one of the units [ft, mi, FT, MI] using a regular expression, and store it in a new column.

import pandas as pd

df1 = pd.DataFrame({
    'State': ['Arizona 4.47ft', 'Georgia 1023mi', 'Newyork 2022 NY 74.6 FT', 'Indiana 747MI(In)', 'Florida 453mi FL']})

Expected output

               State  Distance
0     Arizona 4.47ft  4.47ft
1     Georgia 1023mi  1023mi
2  Newyork NY 74.6ft  74.6ft
3  Indiana 747MI(In)   747MI
4   Florida 453mi FL   453mi

Would anyone please help?

1 Answer

Build a regex pattern from the list of units l, then use str.extract to pull the first match of that pattern out of the State column:

l = ['ft','mi','FT','MI']
df1['Distance'] = df1['State'].str.extract(r'(\S+(?:%s))\b' % '|'.join(l))

                    State Distance
0          Arizona 4.47ft   4.47ft
1          Georgia 1023mi   1023mi
2  Newyork 2022 NY 74.6FT   74.6FT
3       Indiana 747MI(In)    747MI
4        Florida 453mi FL    453mi
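
For reference, here is a self-contained sketch of the same approach. Wrapping the units in re.escape and passing expand=False (so that str.extract returns a Series rather than a one-column DataFrame) are my additions, not part of the answer above, and the sample data assumes the unit directly follows the number with no space.

```python
import re
import pandas as pd

# Sample data as shown in the answer's output (no space before "FT").
df1 = pd.DataFrame({
    'State': ['Arizona 4.47ft', 'Georgia 1023mi', 'Newyork 2022 NY 74.6FT',
              'Indiana 747MI(In)', 'Florida 453mi FL']
})

units = ['ft', 'mi', 'FT', 'MI']

# \S+ matches the non-space characters before the unit; \b requires the
# unit to end at a word boundary, so "747MI(In)" yields "747MI".
pattern = r'(\S+(?:%s))\b' % '|'.join(map(re.escape, units))

# expand=False returns a Series because the pattern has one capture group.
df1['Distance'] = df1['State'].str.extract(pattern, expand=False)
print(df1)
```

Note that \S+ accepts any non-space characters, not just digits, so a word that happens to end in one of the units (for example "Miami") would also match.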

2 Comments

I forgot to mention that some values have a space before the unit, as in 74.6 FT.
Try allowing an optional space before the unit: df1['State'].str.extract(r'(\S+\s?(?:%s))\b' % '|'.join(l))
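
A possible variation on the optional-space idea, offered as a sketch rather than as what the commenter wrote: anchor the match on a number instead of \S+, match the units case-insensitively, and strip the inner space afterwards. The numeric pattern, the re.IGNORECASE flag, and the whitespace removal are all assumptions on my part about what the desired output looks like.

```python
import re
import pandas as pd

# Sample data from the question, including the "74.6 FT" case with a space.
df = pd.DataFrame({
    'State': ['Arizona 4.47ft', 'Georgia 1023mi', 'Newyork 2022 NY 74.6 FT',
              'Indiana 747MI(In)', 'Florida 453mi FL']
})

units = ['ft', 'mi']

# An integer or decimal number, an optional single whitespace character,
# then one of the units ending at a word boundary.
number_unit = r'(\d+(?:\.\d+)?\s?(?:%s))\b' % '|'.join(map(re.escape, units))

# IGNORECASE lets 'ft' and 'mi' also match 'FT' and 'MI'.
extracted = df['State'].str.extract(number_unit, flags=re.IGNORECASE, expand=False)

# Drop any whitespace between the number and the unit ("74.6 FT" -> "74.6FT").
df['Distance'] = extracted.str.replace(r'\s+', '', regex=True)
print(df)
```

Requiring digits before the unit keeps "2022 NY" and ordinary words from matching, at the cost of ignoring values written without a leading number.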
