I am trying to set the values of certain cells in a Pandas MultiIndex DataFrame by selecting these cells using a list.

Note the sequence of both lists.

df.loc[(['Peter','John','Tom'],'AAPL'),1] = ['Peter', 'John', 'Tom']

Problem: the values are being set in the wrong cells. For example, I expect the value Peter to be set under the index Peter, but it is being set under Tom!

Does anyone know the reason, and what the proper way of doing this is?

In other words, how do we ensure that the sequence of the list used in df.loc (e.g. ['Peter','John','Tom'] inside df.loc) matches the sequence of the list of values (e.g. ['Peter','John','Tom'] to the right of =)?

Expected Result

             0      1   2
Name  Stock              
Tom   AAPL   0    Tom   0
      GOOG   0      0   0
      NFLX   0      0   0
John  AAPL   0   John   0
      GOOG   0      0   0
      NFLX   0      0   0
Peter AAPL   0  Peter   0
      GOOG   0      0  46
      NFLX   0      0   0

Actual Result

             0      1   2
Name  Stock              
Tom   AAPL   0  Peter   0   <----- should be Tom
      GOOG   0      0   0
      NFLX   0      0   0
John  AAPL   0   John   0
      GOOG   0      0   0
      NFLX   0      0   0
Peter AAPL   0    Tom   0   <----- should be Peter
      GOOG   0      0  46
      NFLX   0      0   0

Code to reproduce problem

import pandas as pd

# Initialize MultiIndex DataFrame
stocks = ['AAPL', 'GOOG', 'NFLX']
names = ['Tom', 'John', 'Peter']
midx = pd.MultiIndex.from_product([names, stocks], names=['Name','Stock'])
df = pd.DataFrame(index=midx, columns=[0,1,2])
df.loc[pd.IndexSlice[:,:],:] = 0

# Partially populate the empty MultiIndex DataFrame
df.loc[('Tom', 'AAPL'), 1] = 36
df.loc[('Peter', 'GOOG'), 2] = 46
print(df)  # looks correct
# Set values for some cells
df.loc[(['Peter','John','Tom'],'AAPL'),1] = ['Peter', 'John', 'Tom']
print(df)  # wrong!!!

2 Answers

Like this, by giving the full index tuple for each element.

df.loc[[('Peter', 'AAPL'), ('John', 'AAPL'),('Tom','AAPL')],1] = ['Peter', 'John', 'Tom']
print(df)

From the pandas documentation:

Note: It is important to note that tuples and lists are not treated identically in pandas when it comes to indexing. Whereas a tuple is interpreted as one multi-level key, a list is used to specify several keys. Or in other words, tuples go horizontally (traversing levels), lists go vertically (scanning levels).
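To make the tuple-versus-list distinction concrete, here is a minimal sketch that rebuilds the DataFrame from the question and selects rows in three ways. The variable names one_row, by_name and by_pair are only illustrative, and dtype=object is an assumption made so that a column can hold both integers and strings.

```python
import pandas as pd

stocks = ['AAPL', 'GOOG', 'NFLX']
names = ['Tom', 'John', 'Peter']
midx = pd.MultiIndex.from_product([names, stocks], names=['Name', 'Stock'])
# dtype=object (an assumption) lets one column mix integers and strings
df = pd.DataFrame(0, index=midx, columns=[0, 1, 2], dtype=object)

# A tuple is one key spanning both levels: a single row, returned as a Series
one_row = df.loc[('Tom', 'AAPL'), :]

# A list of labels is several keys on the first level: every stock for each name
by_name = df.loc[['Tom', 'John']]

# A list of tuples is several complete keys: exactly one row per tuple
by_pair = df.loc[[('Tom', 'AAPL'), ('John', 'AAPL')]]

print(one_row.shape, by_name.shape, by_pair.shape)
```

Passing a list of complete tuples leaves no ambiguity about which row each assigned value belongs to, which is why the assignment above behaves as expected.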


While I don't know what is causing the issue, it can be sidestepped by being more explicit with the MultiIndex keys.

df.loc[[('Peter','AAPL'),('John','AAPL'),('Tom','AAPL')],1] = ['Peter','John','Tom']
print(df) # this one works as you would expect

# To make this a bit more automated, build the index tuples from the list
# of names, then set column 1 to the matching list item.
# name list:
pjt = ['Peter','John','Tom']
# index list built from the name list
pjt_aapl = [(name, 'AAPL') for name in pjt]
# set column 1 to the names
df.loc[pjt_aapl, 1] = pjt
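If you would rather not depend on list ordering at all, another option is to write each cell individually with df.at, pairing every name with its value explicitly. This is only a sketch: new_values is a hypothetical mapping introduced for illustration, and dtype=object is assumed so that strings can be stored alongside the zeros.

```python
import pandas as pd

stocks = ['AAPL', 'GOOG', 'NFLX']
names = ['Tom', 'John', 'Peter']
midx = pd.MultiIndex.from_product([names, stocks], names=['Name', 'Stock'])
# dtype=object (an assumption) so strings can sit next to the integer zeros
df = pd.DataFrame(0, index=midx, columns=[0, 1, 2], dtype=object)

# Hypothetical mapping from the Name label to the value to store
new_values = {'Peter': 'Peter', 'John': 'John', 'Tom': 'Tom'}

# Each write addresses exactly one (row, column) cell, so ordering cannot matter
for name, value in new_values.items():
    df.at[(name, 'AAPL'), 1] = value

print(df)
```

For a handful of cells the loop is easy to read; for many rows the list-of-tuples assignment with df.loc is likely the faster choice.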

Cheers, Jesse
