
How do I split the text in a column on "(" and ")" to create a new column in a DataFrame? Current DataFrame:

    Item     Description
0   coat   Boys (Target)
1  boots    Womens (DSW)
2  socks   Girls (Kohls)
3  shirt  Mens (Walmart)
4  boots    Womens (DSW)
5   coat   Boys (Target)

What I want to create:

    Item Description Retailer
0   coat        Boys   Target
1  boots      Womens      DSW
2  socks       Girls    Kohls
3  shirt        Mens  Walmart
4  boots      Womens      DSW
5   coat        Boys   Target

I've tried the following:

df[['Description'], ['Retailer']] = df['Description'].str.split("(")

I get an error: "TypeError: unhashable type: 'list'"

4 Answers

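I ran this small test and it seems to work. Note the leading space and the backslash in the split pattern: the pattern is treated as a regular expression, so "(" has to be escaped, and the space removes the blank before it. The closing ")" is still attached to the second piece.

import pandas as pd

# A Series standing in for the Description column
s = pd.Series(['Boys (Target)','Womens (DSW)','Girls (Kohls)'])
print(s)
# Raw string so that \( reaches the regex engine unchanged
d1 = s.str.split(r' \(')
print(d1)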
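
Building on the test above, here is a minimal sketch of carrying the split through to two DataFrame columns. It assumes every description has the form "Label (Retailer)"; the intermediate name `parts` is illustrative and not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Item": ["coat", "boots", "socks"],
    "Description": ["Boys (Target)", "Womens (DSW)", "Girls (Kohls)"],
})

# Drop the trailing ")" first, then split once on " (" (a regex, so "(" is escaped).
# expand=True returns a DataFrame with one column per piece
# instead of a Series of lists.
parts = (
    df["Description"]
    .str.rstrip(")")
    .str.split(r" \(", n=1, expand=True)
)
df["Description"] = parts[0]
df["Retailer"] = parts[1]
print(df)
```

Assigning the two pieces column by column keeps the example independent of how pandas matches a DataFrame against a list of column names.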

Try this:

import pandas as pd

# creating the df
item = ['coat','boots']
dec = ["Boys (Target)", "Womens (DSW)"]
df = pd.DataFrame(item, columns=['Item'])
df['Description'] = dec


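def extract_brackets(row):
    # Text between the first "(" and the next ")", e.g. "Boys (Target)" -> "Target"
    return row.split('(', 1)[1].split(')')[0].strip()


def extract_first_value(row):
    # First whitespace-separated word, e.g. "Boys (Target)" -> "Boys"
    return row.split()[0]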


df['Retailer'] = df['Description'].apply(extract_brackets)
df['Description'] = df['Description'].apply(extract_first_value)

print(df)


Pass expand=True to str.split so that it returns one column per piece, and assign the result to a single list of column names (df[['Description','Retailer']]) rather than to two separate lists. Consider using the following code:

# regex=False: a lone ")" is not a valid regular expression
df[['Description','Retailer']]  = df.Description.str.replace(')','',regex=False)\
    .str.split('(',expand=True)

print(df)

    Item Description Retailer
0   coat       Boys    Target
1  boots     Womens       DSW
2  socks      Girls     Kohls
3  shirt       Mens   Walmart
4  boots     Womens       DSW
5   coat       Boys    Target

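This first removes the closing parenthesis from Description and then splits on the opening parenthesis. Description keeps the space that preceded "(" (visible in the alignment of the output above); chain .str.strip() onto the column if you need it removed.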
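
As an alternative sketch (not the method used in this answer), the replace and split steps can be collapsed into one regular expression with str.extract, where each capture group becomes one output column. The pattern below is my own and assumes each value looks like "Label (Retailer)"; rows that do not match come back as NaN.

```python
import pandas as pd

df = pd.DataFrame({
    "Item": ["coat", "boots"],
    "Description": ["Boys (Target)", "Womens (DSW)"],
})

# One capture group per output column:
#   group 1: the text before the optional whitespace and "("
#   group 2: the text between "(" and ")"
pattern = r"^(.*?)\s*\((.*?)\)\s*$"
df[["Description", "Retailer"]] = df["Description"].str.extract(pattern)
print(df)
```

Because the whitespace before "(" sits outside the first group, Description comes out without a trailing space.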


One way using pandas.Series.str.findall:

df[["Description", "Retailer"]] = df["Description"].str.findall(r"\w+").apply(pd.Series)
print(df)

Output:

    Item Description Retailer
0   coat        Boys   Target
1  boots      Womens      DSW
2  socks       Girls    Kohls
3  shirt        Mens  Walmart
4  boots      Womens      DSW
5   coat        Boys   Target
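
One caveat worth checking against your own data: `\w+` matches every run of word characters, so a value with more than two words produces more than two tokens and no longer maps onto exactly two columns. A small sketch with an invented row ("Old Navy" is hypothetical data, not from the question):

```python
import pandas as pd

# The second row is made up to show a multi-word retailer.
s = pd.Series(["Boys (Target)", "Girls (Old Navy)"])

# findall returns one list of word tokens per row; the second row has three.
tokens = s.str.findall(r"\w+")
print(tokens.tolist())

# Capturing what sits between the parentheses keeps the name in one piece.
retailer = s.str.extract(r"\((.*?)\)", expand=False)
print(retailer.tolist())
```

For the data in the question, where every description is a single word followed by a single-word retailer, the findall approach is fine as written.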
