11

I created this dataframe:

import pandas as pd
columns = pd.MultiIndex.from_tuples([("x", "", ""), ("values", "a", "a.b"), ("values", "c", "")])
df0 = pd.DataFrame([(0,10,20),(1,100,200)], columns=columns)
df0

I write df0 to Excel:

df0.to_excel("test.xlsx")

and load it again:

df1 = pd.read_excel("test.xlsx", header=[0,1,2])
df1

Now the columns that had empty labels are named Unnamed: ... instead.

To make df1 look like the initial df0, I run:

def rename_unnamed(df, label=""):
    for i, columns in enumerate(df.columns.levels):
        columns = columns.tolist()
        for j, row in enumerate(columns):
            if "Unnamed: " in row:
                columns[j] = ""
        df.columns.set_levels(columns, level=i, inplace=True)
    return df

rename_unnamed(df1)

This works. But is there a built-in pandas way to do this?

3 Answers

6

Since pandas 0.21.0, the code should look like this:

import pandas as pd

def rename_unnamed(df):
    """Rename unnamed column names of a pandas DataFrame

    See https://stackoverflow.com/questions/41221079/rename-multiindex-columns-in-pandas

    Parameters
    ----------
    df : pd.DataFrame object
        Input dataframe

    Returns
    -------
    pd.DataFrame
        Output dataframe

    """
    for i, columns in enumerate(df.columns.levels):
        columns_new = columns.tolist()
        for j, row in enumerate(columns_new):
            if "Unnamed: " in row:
                columns_new[j] = ""
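        # Compare version numbers numerically; a plain string comparison
        # misorders versions such as "0.9.0" and "0.21.0".
        # See https://stackoverflow.com/a/48186976/716469
        if tuple(int(p) for p in pd.__version__.split(".")[:2]) < (0, 21):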
            df.columns.set_levels(columns_new, level=i, inplace=True)
        else:
            df = df.rename(columns=dict(zip(columns.tolist(), columns_new)),
                           level=i)
    return df

1 Comment

Great solution. I'm trying to do something similar based on your function but can't get it to work. Would you care to take a look: stackoverflow.com/questions/61111336/…?
4

Combining the answers from @jezrael and @dinya, and limited to pandas 0.21.0 or later (2017 onward), one option is:

import numpy as np

for i, columns_old in enumerate(df.columns.levels):
    columns_new = np.where(columns_old.str.contains('Unnamed'), '-', columns_old)
    df.rename(columns=dict(zip(columns_old, columns_new)), level=i, inplace=True)


2

You can use numpy.where with a str.contains condition:

import numpy as np

for i, col in enumerate(df1.columns.levels):
    columns = np.where(col.str.contains('Unnamed'), '', col)
    df1.columns.set_levels(columns, level=i, inplace=True)

print(df1)
   x values     
          a    c
        a.b     
0  0     10   20
1  1    100  200

2 Comments

Unfortunately, there is no function for this in pandas. str.contains works only with a Series, so a for loop is needed.
As @dinya says, this code works only for pandas versions below 0.21.0. See their answer for an update.
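
For readers on newer pandas releases, where the inplace argument of set_levels has been deprecated and later removed, here is a minimal sketch of a version-independent alternative. It is not taken from any of the answers: it rebuilds the column MultiIndex from cleaned tuples instead of editing the levels. The helper name blank_unnamed and its fill parameter are hypothetical, and the sample frame is a hand-built stand-in for read_excel output, so the exact placeholder names in it are assumptions.

```python
import pandas as pd


def blank_unnamed(df, fill=""):
    """Return a copy of df with 'Unnamed: ...' column labels replaced by fill."""

    def clean(label):
        # Only string labels can be read_excel placeholders; leave others alone.
        if isinstance(label, str) and label.startswith("Unnamed: "):
            return fill
        return label

    out = df.copy()
    # Rebuild the whole MultiIndex from cleaned tuples rather than mutating levels.
    out.columns = pd.MultiIndex.from_tuples(
        [tuple(clean(label) for label in col) for col in df.columns],
        names=df.columns.names,
    )
    return out


# Hand-built stand-in for the frame read back from Excel in the question;
# the placeholder names below are made up for illustration.
columns = pd.MultiIndex.from_tuples([
    ("x", "Unnamed: 0_level_1", "Unnamed: 0_level_2"),
    ("values", "a", "a.b"),
    ("values", "c", "Unnamed: 2_level_2"),
])
df1 = pd.DataFrame([(0, 10, 20), (1, 100, 200)], columns=columns)

restored = blank_unnamed(df1)
print(restored.columns.tolist())
```

Because the index is rebuilt from tuples, several placeholders in the same level can all map to the same empty label without tripping a uniqueness check on the level values, and the input frame is left unchanged.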
