0

I have a pandas DataFrame with a string column that I want to separate into several more columns.

Some rows of the DF look like this:

COLUMN

ORDP//NAME/iwantthispart/REMI/MORE TEXT
/REMI/SOMEMORETEXT
/ORDP//NAME/iwantthispart/ADDR/SOMEADRESS
/BENM//NAME/iwantthispart/REMI/SOMEMORETEXT

So basically I want everything after '/NAME/' and up to the next '/'. However, not every row has the '/NAME/iwantthispart/' field, as can be seen in the second row.

I've tried using split functions, but ended up with the wrong results.

mt['COLUMN'].apply(lambda x: x.split('/NAME/')[-1])

This just gave me everything after the /NAME/ part, and in the cases that there was no /NAME/ it returned the full string to me.
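To make that behaviour concrete, here is a minimal sketch that rebuilds the sample rows above and runs the same split. The helper `after_name` is a hypothetical name used only for illustration; it shows one way to stay with plain string methods, assuming a missing value is acceptable for rows without '/NAME/'.

```python
import pandas as pd

# Sample rows copied from the question.
mt = pd.DataFrame({"COLUMN": [
    "ORDP//NAME/iwantthispart/REMI/MORE TEXT",
    "/REMI/SOMEMORETEXT",
    "/ORDP//NAME/iwantthispart/ADDR/SOMEADRESS",
    "/BENM//NAME/iwantthispart/REMI/SOMEMORETEXT",
]})

# The attempt from the question: split() returns a one-element list when
# '/NAME/' is absent, so [-1] hands back the whole string.
attempt = mt["COLUMN"].apply(lambda x: x.split("/NAME/")[-1])

def after_name(value):
    """Hypothetical helper: text between '/NAME/' and the next '/', else None."""
    _, sep, tail = value.partition("/NAME/")
    if not sep:  # '/NAME/' was not found in this row
        return None
    return tail.split("/", 1)[0]

fixed = mt["COLUMN"].apply(after_name)
print(fixed)
```

`str.partition` reports whether the separator was found, which is the piece of information the `split(...)[-1]` version loses.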

Does anyone have some tips or solutions? Help is much appreciated! (the bullets are to make it more readable and are not actually in the data).

1

2 Answers

5

You could use str.extract to extract the pattern of choice, using a regex:

# Generally, to match all word characters
# (expand=False returns a Series instead of a one-column DataFrame):
df.COLUMN.str.extract(r'NAME/(\w+)', expand=False)

OR

# More specifically, to match everything up to the next slash:
df.COLUMN.str.extract(r'NAME/([^/]*)', expand=False)

Both of which return:

0    iwantthispart
1              NaN
2    iwantthispart
3    iwantthispart
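For anyone who wants to try this end to end, here is a minimal self-contained sketch using the sample rows from the question. The new column name `NAME_PART` is an assumption made for illustration and does not come from the original post.

```python
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({"COLUMN": [
    "ORDP//NAME/iwantthispart/REMI/MORE TEXT",
    "/REMI/SOMEMORETEXT",
    "/ORDP//NAME/iwantthispart/ADDR/SOMEADRESS",
    "/BENM//NAME/iwantthispart/REMI/SOMEMORETEXT",
]})

# Capture everything after 'NAME/' up to the next slash.
# Rows without 'NAME/' get a missing value instead of the original string.
# expand=False returns a Series, which can be assigned to a new column.
df["NAME_PART"] = df["COLUMN"].str.extract(r"NAME/([^/]*)", expand=False)
print(df["NAME_PART"])
```

Rows that come back as missing can then be filtered out with `df["NAME_PART"].notna()` if only the matching rows are needed.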

1 Comment

Thanks! This is exactly what I wanted! Problem solved :)
0

These two lines will give you the second word, regardless of whether the first word is NAME or not:

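# Column labels are case-sensitive; the question's column is "COLUMN".
mt["COLUMN"] = mt["COLUMN"].str.extract(r"(\w+/\w+/)", expand=False)
mt["COLUMN"].str.extract(r"(/\w+)", expand=False)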

This will give the following result as a column in pandas dataframe:

/iwantthispart
/SOMEMORETEXT
/iwantthispart
/iwantthispart

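And in case you are only interested in the rows that contain NAME, this will work for you just fine:

# Note: no backslash before NAME; "\N" is not a valid regex escape here.
mt["COLUMN"] = mt["COLUMN"].str.extract(r"(NAME/\w+/)", expand=False)
mt["COLUMN"].str.extract(r"(/\w+)", expand=False)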

This will give the following result:

/iwantthispart
NaN
/iwantthispart
/iwantthispart
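The two steps can also be folded into a single pattern. The following is a sketch under the assumption that the leading slash should stay in the result, as in the output above; it uses the sample rows from the question.

```python
import pandas as pd

# Sample rows copied from the question.
mt = pd.DataFrame({"COLUMN": [
    "ORDP//NAME/iwantthispart/REMI/MORE TEXT",
    "/REMI/SOMEMORETEXT",
    "/ORDP//NAME/iwantthispart/ADDR/SOMEADRESS",
    "/BENM//NAME/iwantthispart/REMI/SOMEMORETEXT",
]})

# Match the literal 'NAME', then capture the slash and the word after it.
# Rows without 'NAME' yield a missing value.
result = mt["COLUMN"].str.extract(r"NAME(/\w+)", expand=False)
print(result)
```

Doing it in one pass also leaves the original column untouched, whereas the two-line version overwrites it with the intermediate match.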

2 Comments

This looks nice and elegant. But it seems to need some tweaking, as the OP is asking for anything after NAME, so the 2nd result should be blank.
@JAponte Oh, if that is the case, we can easily put NAME in the first line instead of \w+.
