0

Need to extract specific value from excel column using python pandas dataframe

The column Product that I am trying to extract looks the below & need to extract only Product # from it. The column also has other numbers but the Product # always comes after the term 'UK Pro' & Product # could be 3 to 4 digit number in a particular row of data.

In[1]:

df['Product'].head()

#Dataframe looks like this:

Out[1]:

Checking center : King 2000 : UK Pro 1000 : London

Checking center : Queen 321 : UK Pro 250 : Spain

CC : UK Pro 3000 : France

CC : UK Pro 810 : Poland

Expected Output:

Product #

1000

250

3000

810

Started with this:

df['Product #'] = df1['Product'].str.split(':').str[1]

But this does split only based on the first two occurrence of : operator.

Then tried this:

df1['Product #'] = df1['Product'].str.split('UK Pro', 1).str[0].str.strip()

1 Answer 1

2

You can use pandas.Series.str.extract :

df["Product #"] = df["Product"].str.extract("UK Pro (\d+)", expand=False)

# Output :

print(df)
   Product #
0        NaN
1        NaN
2       1000
3        NaN
4        NaN
5        250
6        NaN
7       3000
8        NaN
9        810
10       NaN
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.