
Is there a way to create a DataFrame with a MultiIndex on both rows and columns without typing out tuples by hand? I have too many labels to enter manually as tuples (96 countries and 26 sectors per country).

I tried:

df_data.columns=label_df 

df_data_w = pd.concat([label_df, data],axis=1,ignore_index=False) 

This added the label DataFrame as the first two columns, but did not use it as an index.

Here is some code to use:

import numpy as np
import pandas as pd

a = np.random.randint(low=0, high=10,size=9)
b = np.random.randint(low=0, high=10,size=9)
c = np.random.randint(low=0, high=10,size=9)
d = np.random.randint(low=0, high=10,size=9)
e = np.random.randint(low=0, high=10,size=9)
f = np.random.randint(low=0, high=10,size=9)
g = np.random.randint(low=0, high=10,size=9)
h = np.random.randint(low=0, high=10,size=9)
i = np.random.randint(low=0, high=10,size=9)

df = pd.DataFrame(data=[a,b,c,d,e,f,g,h,i])

Continent = ['Africa','Africa','Africa','North America', 'North America', 'North America', 'Europe','Europe','Europe']

Sectors = ['Agriculture','Industry','Domestic','Agriculture','Industry','Domestic','Agriculture','Industry','Domestic']

label_df = pd.DataFrame(data=[Continent, Sectors])

df.columns=label_df  

df_w_labels = pd.concat([label_df, df], axis=1, ignore_index=False)

This gives me the labels as column headers in my DataFrame, but I need them on the rows as well, so I tried concat, which added the label DataFrame as the first two columns but did not set it as an index.

  • Welcome to SO. Please provide a minimal reproducible example. That means no links, no images, just text in your question. Good luck! Commented Mar 30, 2018 at 16:04
  • Thanks @jpp - my first SO post. Have edited to hopefully be more helpful. Commented Mar 30, 2018 at 18:32
  • To clarify, while you have a lot of labels you have only two levels, correct? "Country" and "sector"? Commented Mar 30, 2018 at 19:36
  • Hi @Ajean yes, only two levels: Country and Sector. Commented Apr 1, 2018 at 16:43
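With only two levels, and assuming every country has the same list of sectors in the same order (an assumption the question suggests but does not confirm), the index could be generated from two short lists with pd.MultiIndex.from_product. A minimal sketch; the country names and the zero-filled frame are placeholders, not the asker's data:

```python
import pandas as pd

# Hypothetical short lists; the real data would have 96 countries and 26 sectors.
countries = ['Kenya', 'Canada', 'France']
sectors = ['Agriculture', 'Industry', 'Domestic']

# from_product builds every (country, sector) pair, country-major,
# so only the two short lists are ever typed.
idx = pd.MultiIndex.from_product([countries, sectors], names=['Country', 'Sector'])

# Placeholder frame filled with zeros, labelled on both axes.
df = pd.DataFrame(0, index=idx, columns=idx)
print(df.shape)
```

If the sector lists differ between countries, from_product does not apply and the labels have to come from paired lists instead.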

1 Answer

You can build the index from your two label lists with zip and pd.MultiIndex.from_tuples, so no tuples have to be typed by hand:

import numpy as np
import pandas as pd

a = np.random.randint(low=0, high=10,size=9)
b = np.random.randint(low=0, high=10,size=9)
c = np.random.randint(low=0, high=10,size=9)
d = np.random.randint(low=0, high=10,size=9)
e = np.random.randint(low=0, high=10,size=9)
f = np.random.randint(low=0, high=10,size=9)
g = np.random.randint(low=0, high=10,size=9)
h = np.random.randint(low=0, high=10,size=9)
i = np.random.randint(low=0, high=10,size=9)

df = pd.DataFrame(data=[a,b,c,d,e,f,g,h,i])

Continent = ['Africa','Africa','Africa','North America', 'North America', 'North America', 'Europe','Europe','Europe']
Sectors = ['Agriculture','Industry','Domestic','Agriculture','Industry','Domestic','Agriculture','Industry','Domestic']

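# zip pairs the two label lists element-wise, so the tuples are generated, not typed.
indx = pd.MultiIndex.from_tuples(list(zip(Continent, Sectors)))

# Use the same MultiIndex for both the rows and the columns.
df.index = indx
df.columns = indx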

print(df)
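As a variation on the code above, pd.MultiIndex.from_arrays accepts the two parallel lists directly, which skips the zip step. This is a sketch rather than the only way to do it; the level names 'Continent' and 'Sector' and the seeded random data are choices made for the example:

```python
import numpy as np
import pandas as pd

continent = ['Africa'] * 3 + ['North America'] * 3 + ['Europe'] * 3
sectors = ['Agriculture', 'Industry', 'Domestic'] * 3

# One list per level; the lists must have the same length.
idx = pd.MultiIndex.from_arrays([continent, sectors], names=['Continent', 'Sector'])

# Seeded random integers stand in for the real values.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 10, size=(9, 9)), index=idx, columns=idx)

# All rows for one continent (partial selection on the first row level).
africa = df.loc['Africa']

# A single cell, addressed by a (row label, column label) pair of tuples.
cell = df.loc[('Africa', 'Industry'), ('Europe', 'Domestic')]
print(africa.shape)
```

Naming the levels is optional, but it makes later selections such as df.xs('Industry', level='Sector') easier to read.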

1 Comment

Thanks Scott Boston, that’s helpful. I think my main issue is turning the columns in my .csv file into tuples, as the labels in my real data set are too long to enter manually (96 countries with 26 sectors each). Am going to try the xlrd package described here and report on results: stackoverflow.com/questions/37403460/…
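If the labels already sit in two columns of a .csv file, one possible route is to read them with pd.read_csv and pass the resulting frame to pd.MultiIndex.from_frame, so no tuples are written by hand. A minimal sketch: the inline CSV text, the column names 'Country' and 'Sector', and the numeric data are all stand-ins, since the layout of the real file is not shown here:

```python
import io

import numpy as np
import pandas as pd

# Stand-in for the real label file; replace io.StringIO(...) with the file path.
csv_text = """Country,Sector
Africa,Agriculture
Africa,Industry
North America,Agriculture
North America,Industry
"""
labels = pd.read_csv(io.StringIO(csv_text))

# Each row of the label frame becomes one entry of the MultiIndex;
# the CSV column names become the level names.
idx = pd.MultiIndex.from_frame(labels)

# Placeholder 4x4 data, labelled on both axes with the same index.
data = pd.DataFrame(np.arange(16).reshape(4, 4), index=idx, columns=idx)
print(data)
```

This assumes the rows of the label file are in the same order as the rows and columns of the data.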
