I have a pandas DataFrame with a MultiIndex (Reg, Type, Part, IsExpired):

Reg        Type      Part     IsExpired    Quantity
APAC       Disk      A        False        10
                              True         12
EMEA       Disk      A        False        22
EMEA       Disk      B        False        13
                              True         17

I want to make sure that every (Reg, Type, Part) tuple has both True and False for IsExpired. For example, I'd like to insert a row for (EMEA, Disk, A, True):

Reg        Type      Part     IsExpired    Quantity
APAC       Disk      A        False        10
                              True         12
EMEA       Disk      A        False        22
                              True         0   <-- inserted row
EMEA       Disk      B        False        13
                              True         17

2 Answers

Have you considered just adding the missing row? Since you're only setting a single cell, you could do it efficiently like this:

df.at[('EMEA', 'Disk', 'A', True), 'Quantity'] = 0
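
As a runnable sketch of the single-row approach: the snippet below rebuilds the frame from the question (the construction code is an assumption, since the question only shows the printed table) and assigns through .loc, which supports setting with enlargement when the label does not exist yet. The new row is appended at the end, so sort_index() is called afterwards to move it next to its siblings.

```python
import pandas as pd

# Rebuild the example frame from the question (construction is assumed).
index = pd.MultiIndex.from_tuples(
    [
        ("APAC", "Disk", "A", False),
        ("APAC", "Disk", "A", True),
        ("EMEA", "Disk", "A", False),
        ("EMEA", "Disk", "B", False),
        ("EMEA", "Disk", "B", True),
    ],
    names=["Reg", "Type", "Part", "IsExpired"],
)
df = pd.DataFrame({"Quantity": [10, 12, 22, 13, 17]}, index=index)

# Assigning to a label that does not exist yet adds a new row.
df.loc[("EMEA", "Disk", "A", True), "Quantity"] = 0

# The new row lands at the bottom; sort to restore the grouped order.
df = df.sort_index()
print(df)
```

This is fine for one or two known gaps; for many missing combinations, a vectorised approach is likely a better fit.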

1 Comment

I've been using pandas for almost 5 years now. I can't believe that this is the first time I came across this need and just heard about .at() from your comment. Thank you!

You could unstack and then fillna:

In [11]: df2
Out[11]:
                          Quantity
Reg  Type Part IsExpired
APAC Disk A    False            10
               True             12
EMEA Disk A    False            22
          B    False            13
               True             17

In [12]: df2.unstack()
Out[12]:
               Quantity
IsExpired         False True
Reg  Type Part
APAC Disk A          10    12
EMEA Disk A          22   NaN
          B          13    17

In [13]: df2.unstack().fillna(0)
Out[13]:
               Quantity
IsExpired         False True
Reg  Type Part
APAC Disk A          10    12
EMEA Disk A          22     0
          B          13    17

Perhaps it makes sense to keep IsExpired as columns? Otherwise, stack it back:

In [14]: df2.unstack().fillna(0).stack()
Out[14]:
                          Quantity
Reg  Type Part IsExpired
APAC Disk A    False            10
               True             12
EMEA Disk A    False            22
               True              0
          B    False            13
               True             17
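
For anyone who wants to run the steps above end to end, here is a self-contained sketch. The frame construction is an assumption based on the printed table, and unstack(fill_value=0) is swapped in for unstack().fillna(0): it fills the gaps during the reshape, so the column never has to hold NaN. On some pandas 2.x releases stack() may emit a FutureWarning about its new implementation.

```python
import pandas as pd

# Rebuild the example frame (construction is assumed from the printed table).
index = pd.MultiIndex.from_tuples(
    [
        ("APAC", "Disk", "A", False),
        ("APAC", "Disk", "A", True),
        ("EMEA", "Disk", "A", False),
        ("EMEA", "Disk", "B", False),
        ("EMEA", "Disk", "B", True),
    ],
    names=["Reg", "Type", "Part", "IsExpired"],
)
df2 = pd.DataFrame({"Quantity": [10, 12, 22, 13, 17]}, index=index)

# Move IsExpired into the columns, filling missing combinations with 0.
wide = df2.unstack(fill_value=0)

# Move IsExpired back into the index; every (Reg, Type, Part) now has both rows.
filled = wide.stack().sort_index()
print(filled)
```

Only (Reg, Type, Part) combinations that already occur in the data are completed; no new parts or regions are invented.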

5 Comments

Note: when doing stack/unstack there is often an alternative pivot/pivot_table method...
Thanks Andy! I have a slight variant of the above question - For some special processing that I am doing, I am processing these records one row at a time. So one tuple of (Reg, Type, Part, IsExpired) at a time. So for the row in question, I end up with - [EMEA Disk A False 22] where I can't use unstack/stack method. Is there a way to somehow insert a row for True here?
@VivekSharma if you're processing one row at a time, I think you should either do this in batches (wait until you have many rows, then use pandas) or just use plain Python, perhaps with something like collections.deque. Or am I misunderstanding you? Why do you have to process them one at a time?
@VivekSharma Also, creating DataFrames one row at a time doesn't scale well: it's O(n^2) in time/memory.
In my problem description I left one level of index out where I have to fill some missing data but within the context of one (Reg, Type, Part, IsExpired). So it's processing a bunch of rows at a time. However, I used your suggestion to fill IsExpired values before I created the multi level index and it worked perfectly! Thanks!
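
A minimal sketch of the two ideas from the comments above: filling the missing IsExpired values while the data is still in flat columns, and using pivot_table instead of unstack. The flat frame and the variable names are assumptions for illustration; aggfunc="sum" is chosen so quantities are not averaged if a combination appears more than once.

```python
import pandas as pd

# Flat version of the example data, before any MultiIndex is built (assumed).
flat = pd.DataFrame(
    {
        "Reg": ["APAC", "APAC", "EMEA", "EMEA", "EMEA"],
        "Type": ["Disk"] * 5,
        "Part": ["A", "A", "A", "B", "B"],
        "IsExpired": [False, True, False, False, True],
        "Quantity": [10, 12, 22, 13, 17],
    }
)

# One row per (Reg, Type, Part), one column per IsExpired value, gaps filled with 0.
wide = flat.pivot_table(
    index=["Reg", "Type", "Part"],
    columns="IsExpired",
    values="Quantity",
    aggfunc="sum",
    fill_value=0,
)

# Stack IsExpired back into the index to get the four-level layout.
result = wide.stack().rename("Quantity").to_frame().sort_index()
print(result)
```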