I have a DataFrame with two data columns and a MultiIndex whose levels are the month and the day of the month, so something like:

           temperatureMax  temperatureMin
time time                                
1    1          14.167500        7.744167
     2          13.735000        7.480833
     3          14.228333        7.901667
     4          13.891667        7.350833
     5          13.735833        6.903333
     6          13.670000        6.494167
     7          13.642500        7.040000
     8          13.005000        6.175000
     9          13.034167        5.253333
     10         13.260833        5.628333
     11         13.783333        5.511667
     12         13.823333        6.630000
     13         13.265833        6.712500
     14         13.112500        6.130000
     15         12.355833        7.213333
     16         13.032500        6.533333
     17         13.175833        7.030000
     18         13.184167        8.225000
     19         13.896667        6.658333
     20         13.711667        5.693333
     21         13.442500        5.944167
     22         13.245000        6.468333
     23         13.765000        5.555833
     24         14.260000        5.212500
     25         13.523333        5.850000
     26         13.000000        5.519167
     27         12.554167        5.264167
     28         12.806667        5.311667
     29         12.755000        6.012500
     30         13.240833        6.136667
...                   ...             ...
12   2          13.545000        5.855833
     3          14.380833        6.214167
     4          14.502500        7.610833
     5          15.379167        8.201667
     6          15.161667        8.593333
     7          15.101667        8.940833
     8          14.886667        7.217500
     9          14.701667        7.680000
     10         14.756667        7.160000
     11         14.575000        6.057500
     12         14.172500        7.138333
     13         14.360833        7.244167
     14         14.285833        7.430000
     15         13.545000        6.167500
     16         14.082500        5.516667
     17         13.780833        5.871667
     18         13.345833        5.357500
     19         13.909167        5.682500
     20         13.264167        5.570833
     21         14.828333        6.620833
     22         14.431667        6.689167
     23         13.564167        6.491667
     24         14.343333        6.074167
     25         13.470000        5.594167
     26         13.468333        4.400833
     27         13.403333        5.600833
     28         14.506667        7.085833
     29         14.173333        6.999167
     30         14.211667        7.810000
     31         13.604167        7.382500

[366 rows x 2 columns]

How can I remove a specific day of a specific month? Namely, I want to remove the 29th of February from this table.

The only thing I have managed is to remove the 29th day of all 12 months with daily_mean.drop(29, level=1), but I want to do that only for February.

  • programming aside, I'm really curious why it'd be appropriate to simply ignore an arbitrary day in a continuous time series Commented Jan 20, 2018 at 22:59
  • It is for a visualization of average temperatures by day of year, removing the 29th of February makes it easier for the user to get the data they need. Commented Jan 20, 2018 at 23:06
  • but that's a day that occurred in the year. why drop that day? why not Dec 31? why not june 4? Commented Jan 20, 2018 at 23:07
  • Because that specific day has 4 times less data and as such a much higher statistical error for the quantity being plotted. Commented Jan 20, 2018 at 23:13
  • Incorrect. Feb 29 is the 60th day of the year. Dec 31st (on a leap year) is the 366th day of the year. If you're plotting by day of the year, I would argue that is the anomalous day. Commented Jan 20, 2018 at 23:17

1 Answer

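drop still works for this; you just need to pass it a tuple holding the full index key, (month, day). For your data that would be daily_mean.drop((2, 29)). A small example that drops (1, 1):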

df.drop((1,1))
Out[821]: 
             temperatureMax  temperatureMin
time time.1                                
1    2            13.735000        7.480833
2    1            14.228333        7.901667
     4            13.891667        7.350833
     5            13.735833        6.903333

Data input

df
Out[820]: 
             temperatureMax  temperatureMin
time time.1                                
1    1            14.167500        7.744167
     2            13.735000        7.480833
2    1            14.228333        7.901667
     4            13.891667        7.350833
     5            13.735833        6.903333
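
Applied to the question's case, the same idea can be sketched as below. This is a minimal sketch, not the asker's actual data: the frame is a hypothetical stand-in built from a leap year with constant placeholder values, and the level names month and day are assumptions (the original frame names both levels time).

```python
import pandas as pd

# Hypothetical stand-in for the question's frame: one row per (month, day)
# of a leap year, with constant placeholder temperatures.
dates = pd.date_range("2016-01-01", "2016-12-31", freq="D")
index = pd.MultiIndex.from_arrays(
    [dates.month, dates.day], names=["month", "day"]  # assumed level names
)
daily_mean = pd.DataFrame(
    {
        "temperatureMax": [14.0] * len(index),
        "temperatureMin": [6.0] * len(index),
    },
    index=index,
)

# A tuple is one complete index key, so only February 29 is dropped.
without_leap_day = daily_mean.drop((2, 29))

# For comparison: dropping by level removes day 29 from every month.
without_any_29th = daily_mean.drop(29, level=1)

print(len(daily_mean), len(without_leap_day), len(without_any_29th))
# 366 365 354
```

The tuple form touches a single row, while the level form removes twelve rows, one per month.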

2 Comments

Thank you, this does it. I was trying with [2,29] and a lot of slightly different alterations to that but always with a list and never realized I obviously had to use a tuple.
@jbssm, yw~ :-) happy coding
