I want to resample a pandas DataFrame and apply a different function to each column. The problem is that I cannot properly process a column that contains strings. I would like to apply a function that joins the strings with a delimiter such as " - ". This is a data example:

import pandas as pd
import numpy as np
idx = pd.date_range('2017-01-31', '2017-02-03')
data = [[1, 10, "ok"], [2, 20, "merge"], [3, 30, "us"]]
dates = pd.DatetimeIndex(['2017-01-31', '2017-02-03', '2017-02-03'])
# use the dates (one of them duplicated) as the index
d = pd.DataFrame(data, index=dates, columns=list('ABC'))

            A   B          C
2017-01-31  1  10         ok
2017-02-03  2  20      merge
2017-02-03  3  30         us 

Resampling the numeric columns A and B with sum and mean aggregators works. Column C sort of works with sum, but it ends up in second position in the output, which might mean that something is failing.

d.resample('D').agg({'A': sum, 'B': np.mean, 'C': sum})

              A               C     B
2017-01-31  1.0               a  10.0
2017-02-01  NaN               0   NaN
2017-02-02  NaN               0   NaN
2017-02-03  5.0        merge us  25.0

I would like to get this:

...
2017-02-03  5.0      merge - us  25.0

I tried using a lambda in different ways, but without success (not shown).

If I may ask a second, related question: I can do some post-processing for this, but how can I fill missing cells in different columns with zeros or ""?

  • IIUC, you can use df.fillna() to fill in missing values. pandas.pydata.org/pandas-docs/stable/missing_data.html Commented Dec 2, 2017 at 23:37
  • Yes, sure, and possibly apply different replacements (e.g. 0, None, "") to different columns. I was wondering if there was a more elegant way to do it in .resample. Commented Dec 3, 2017 at 2:33
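The post-processing route mentioned in these comments can be sketched as follows. This is only an illustration, assuming the resampled frame contains NaN in the empty days; the `resampled` frame is a hand-built stand-in for the desired result rather than real output, and the replacement values (0 for A and B, "" for C) are assumptions. `DataFrame.fillna` accepts a dict that maps column names to fill values, so each column can get its own replacement in one call.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for a resampled result with gaps (illustrative only).
idx = pd.date_range('2017-01-31', '2017-02-03')
resampled = pd.DataFrame(
    {'A': [1.0, np.nan, np.nan, 5.0],
     'B': [10.0, np.nan, np.nan, 25.0],
     'C': ['ok', np.nan, np.nan, 'merge - us']},
    index=idx)

# One fill value per column: zeros for the numeric columns, "" for the strings.
filled = resampled.fillna({'A': 0, 'B': 0, 'C': ''})
print(filled)
```

Columns that are left out of the dict are not filled, which makes it possible to keep NaN where it is meaningful.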

1 Answer

Your agg function for column 'C' should be a join:

d.resample('D').agg({'A': sum, 'B': np.mean, 'C': ' - '.join})

              A     B           C
2017-01-31  1.0  10.0          ok
2017-02-01  NaN   NaN            
2017-02-02  NaN   NaN            
2017-02-03  5.0  25.0  merge - us

1 Comment

Ah, so the "input" to the functions are iterables and I can use a function taking iterables as inputs. I can build a function taking an iterable if I need to do fancier operations. Strange, in my console I still see column C in the second place.
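A minimal sketch of that idea, assuming a custom aggregator is wanted for column C: `join_strings` is a hypothetical helper (not part of pandas) that receives the values of one resampling bin and returns a single string. The frame is the one from the question. String names ('sum', 'mean') are used for the numeric aggregators here as a stylistic choice; the values shown for empty days in A and B may differ between pandas versions, so none are listed.

```python
import pandas as pd

dates = pd.DatetimeIndex(['2017-01-31', '2017-02-03', '2017-02-03'])
d = pd.DataFrame([[1, 10, "ok"], [2, 20, "merge"], [3, 30, "us"]],
                 index=dates, columns=list('ABC'))

def join_strings(values):
    """Hypothetical helper: join the non-missing values of one bin with ' - '."""
    return ' - '.join(str(v) for v in values if pd.notna(v))

# Each column gets its own aggregator; C uses the custom function.
result = d.resample('D').agg({'A': 'sum', 'B': 'mean', 'C': join_strings})
print(result)
```

Because the helper is plain Python, fancier logic (sorting, de-duplicating, a different delimiter) can go inside it without changing the `agg` call.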
