1

I am using a pandas DataFrame and would like to remove everything after the first space in each value. My DataFrame looks like this:

import pandas as pd
d = {'Australia' : pd.Series([0,'1980 (F)\n\n1957 (T)\n\n',1991], index=['Australia', 'Belgium', 'France']),
     'Belgium' : pd.Series([1980,0,1992], index=['Australia','Belgium', 'France']),
    'France' : pd.Series([1991,1992,0], index=['Australia','Belgium', 'France'])}
df = pd.DataFrame(d, dtype='str')

df

I am able to strip the values for one specific column, but this split() call does not apply to the whole DataFrame:

f = lambda x: x["Australia"].split(" ")[0]
df = df.apply(f, axis=1)
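For context, here is a minimal sketch of what that attempt returns. With axis=1 the lambda receives one row at a time and returns a single string taken from the "Australia" column, so the result is a Series and the other columns are lost. The frame is rebuilt from plain lists here for brevity, and the names `rows` and `result` are only illustrative.

```python
import pandas as pd

rows = ["Australia", "Belgium", "France"]
df = pd.DataFrame(
    {
        "Australia": ["0", "1980 (F)\n\n1957 (T)\n\n", "1991"],
        "Belgium": ["1980", "0", "1992"],
        "France": ["1991", "1992", "0"],
    },
    index=rows,
)

# Each row yields one scalar, so apply() collapses the frame to a Series.
result = df.apply(lambda row: row["Australia"].split(" ")[0], axis=1)

print(type(result).__name__)  # Series
print(result.tolist())        # ['0', '1980', '1991']
```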

Does anyone have an idea how I could remove everything after the first space for each value in the DataFrame?

1
  • Yes, I have seen a similar question. But I want to return my whole dataframe without the information after the space. Commented Mar 27, 2018 at 12:38

3 Answers

1

I think you need to convert all columns to strings and then apply the split function:

df = df.astype(str).apply(lambda x: x.str.split().str[0])

Another solution:

df = df.astype(str).applymap(lambda x: x.split()[0])

print (df)
          Australia Belgium France
Australia         0    1980   1991
Belgium        1980       0   1992
France         1991    1992      0
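As a self-contained sketch of the element-wise variant: recent pandas releases (2.1 and later, as far as I know) deprecate DataFrame.applymap in favour of DataFrame.map, so the snippet below picks whichever one is available. The helper `first_token` is a hypothetical name, and returning an empty string for blank cells is my assumption, not something the question asks for. A plain `x.split()[0]` would raise IndexError on an empty or whitespace-only cell.

```python
import pandas as pd

rows = ["Australia", "Belgium", "France"]
df = pd.DataFrame(
    {
        "Australia": ["0", "1980 (F)\n\n1957 (T)\n\n", "1991"],
        "Belgium": ["1980", "0", "1992"],
        "France": ["1991", "1992", "0"],
    },
    index=rows,
)

def first_token(value):
    """Return the text before the first whitespace, or '' for blank cells."""
    parts = str(value).split()
    return parts[0] if parts else ""

# DataFrame.map exists in newer pandas; fall back to applymap otherwise.
elementwise = df.map if hasattr(df, "map") else df.applymap
cleaned = elementwise(first_token)

print(cleaned)
```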

1

Let's try using assign, since the column names in this DataFrame are "well tamed", meaning they contain no spaces or special characters:

df.assign(Australia=df.Australia.str.split().str[0])

Output:

          Australia Belgium France
Australia         0    1980   1991
Belgium        1980       0   1992
France         1991    1992      0

Or you can use apply and a lambda function if all your column dtypes are strings:

df.apply(lambda x: x.str.split().str[0])

Or, if you have a mixture of numeric and string dtypes, you can use select_dtypes with assign like this:

import numpy as np

df.assign(**df.select_dtypes(exclude=np.number).apply(lambda x: x.str.split().str[0]))
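To make the mixed-dtype case concrete, here is a small sketch on a made-up two-column frame (`mixed` and `text_cols` are illustrative names, not from the question). Only the non-numeric column is passed through the string methods, and the integer column is left untouched.

```python
import numpy as np
import pandas as pd

mixed = pd.DataFrame(
    {
        "Australia": ["0", "1980 (F)\n\n1957 (T)\n\n", "1991"],  # text column
        "Belgium": [1980, 0, 1992],                                # integer column
    },
    index=["Australia", "Belgium", "France"],
)

# Keep only the non-numeric columns, trim them, and write them back by name.
text_cols = mixed.select_dtypes(exclude=np.number)
cleaned = mixed.assign(**text_cols.apply(lambda col: col.str.split().str[0]))

print(cleaned)
```

Because assign returns a new DataFrame, `mixed` itself is not modified.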

0

You could loop over all columns and apply the following to each one:

for column in df:
    df[column] = df[column].str.split().str[0]
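One caveat with the loop: the .str accessor raises an AttributeError on a purely numeric column. A hedged variant, assuming it is acceptable for every column to end up holding strings, casts each column first. The sample frame below is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Australia": ["0", "1980 (F)\n\n1957 (T)\n\n", "1991"],
        "Belgium": [1980, 0, 1992],  # integers; .str would fail here without the cast
    },
    index=["Australia", "Belgium", "France"],
)

for column in df.columns:
    # Cast to str so the .str accessor is valid for every column.
    df[column] = df[column].astype(str).str.split().str[0]

print(df)
```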
