0

I'm trying to define two variables from user input (start_date and end_date) and then use them again on line 6, in the pd.date_range call. However, I get the error below when I run the code. It only seems to happen with pandas.date_range. My end goal is to run this as a .py file to produce a chart.

start_date = raw_input('enter start date: ')
end_date = raw_input('enter end date: ')

dataPR['date'] = pd.DatetimeIndex(dataPR['intake_date']).date
grouped_dataPR = dataPR.groupby(['date']).sum()
idx = pd.date_range(start='%s', end='%s') % (start_date, end_date)
grouped_dataPR.index = pd.DatetimeIndex(grouped_dataPR.index)
grouped_dataPR = grouped_dataPR.reindex(idx, fill_value=0)
grouped_dataPR['date'] = grouped_dataPR.index
dataPR_df = pd.DataFrame([grouped_dataPR])
ts = pd.Series(grouped_dataPR['count'], index=grouped_dataPR.index)
ts.plot()
pd.rolling_mean(ts,30).plot(style='k')

Error:

ValueError                                Traceback (most recent call last)
<ipython-input-33-2ac5fe9d8951> in <module>()
      2 grouped_dataPR = dataPR.groupby(['date']).sum()
      3 #idx = pd.date_range('%s', '%s' % (start_date, end_date))
----> 4 idx = pd.date_range(start='%s', end='%s') % (start_date, end_date)
      5 grouped_dataPR.index = pd.DatetimeIndex(grouped_dataPR.index)
      6 grouped_dataPR = grouped_dataPR.reindex(idx, fill_value=0)

/Users/abc/anaconda/lib/python2.7/site-packages/pandas/tseries/index.pyc in date_range(start, end, periods, freq, tz, normalize, name, closed, **kwargs)
   1921     return DatetimeIndex(start=start, end=end, periods=periods,
   1922                          freq=freq, tz=tz, normalize=normalize, name=name,
-> 1923                          closed=closed, **kwargs)
   1924 
   1925 

/Users/abc/anaconda/lib/python2.7/site-packages/pandas/util/decorators.pyc in wrapper(*args, **kwargs)
     87                 else:
     88                     kwargs[new_arg_name] = new_arg_value
---> 89             return func(*args, **kwargs)
     90         return wrapper
     91     return _deprecate_kwarg

/Users/abc/anaconda/lib/python2.7/site-packages/pandas/tseries/index.pyc in __new__(cls, data, freq, start, end, periods, copy, name, tz, verify_integrity, normalize, closed, ambiguous, dtype, **kwargs)
    235             return cls._generate(start, end, periods, name, freq,
    236                                  tz=tz, normalize=normalize, closed=closed,
--> 237                                  ambiguous=ambiguous)
    238 
    239         if not isinstance(data, (np.ndarray, Index, ABCSeries)):

/Users/abc/anaconda/lib/python2.7/site-packages/pandas/tseries/index.pyc in _generate(cls, start, end, periods, name, offset, tz, normalize, ambiguous, closed)
    377 
    378         if start is not None:
--> 379             start = Timestamp(start)
    380 
    381         if end is not None:

pandas/tslib.pyx in pandas.tslib.Timestamp.__new__ (pandas/tslib.c:8973)()

pandas/tslib.pyx in pandas.tslib.convert_to_tsobject (pandas/tslib.c:22522)()

pandas/tslib.pyx in pandas.tslib.convert_str_to_tsobject (pandas/tslib.c:24520)()

ValueError: 
  • What inputs are you using? E.g. what is start_date and end_date? What format does Timestamp need? Commented Jan 27, 2016 at 18:14
  • % is a string operator; you can't use it with an arbitrary block of syntax. Commented Jan 27, 2016 at 18:22

2 Answers

4

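You should pass the variables directly, without wrapping anything in quotes. Your code applies the % operator to the return value of pd.date_range instead of to a string, so no substitution ever happens and date_range receives the literal string '%s' for both arguments.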

idx = pd.date_range(start=start_date, end=end_date)

If for some reason you still want to do string substitution, you would have to do it like this, substituting each string individually:

idx = pd.date_range(start='%s' % (start_date, ), end='%s' % (end_date, ))
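To make the first form concrete, here is a minimal sketch of the fix followed by the reindex step from the question. The sample dates and the tiny series of counts are made up for illustration, and it assumes the user types dates in a form pandas can parse, such as 2016-01-01.

```python
import pandas as pd

# Hypothetical stand-ins for the strings raw_input() would return.
start_date = '2016-01-01'
end_date = '2016-01-05'

# Pass the strings straight through; date_range parses them itself.
idx = pd.date_range(start=start_date, end=end_date)

# Made-up counts for two of the five days in the range.
counts = pd.Series([3, 7],
                   index=pd.DatetimeIndex(['2016-01-02', '2016-01-04']))

# Reindexing against the full range fills the missing days with 0.
filled = counts.reindex(idx, fill_value=0)
print(filled)
```

The days missing from the original data show up with a count of 0, which is what the reindex call in the question is aiming for.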

3

I think you just need to do

pd.date_range(start=start_date, end=end_date)

The reason is that pandas.date_range expects a date string or datetime-like object for both the start and end parameters. The literal string '%s' cannot be parsed as a date.

Even if '%s' were a valid argument, the code you wrote applies the % operator to the result of pd.date_range and a tuple of strings, which would more than likely raise some other error.
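The order of evaluation can be checked without pandas. In the sketch below, fake_date_range is a hypothetical stand-in for pd.date_range (not a real pandas function) that records its arguments and then fails the way a date parser would. It shows that the call runs first, with the placeholders still unsubstituted, so the % operator is never reached.

```python
calls = []

def fake_date_range(start, end):
    # Hypothetical stand-in for pd.date_range: record what was passed,
    # then fail because '%s' is not a parseable date.
    calls.append((start, end))
    raise ValueError('could not parse %r as a date' % (start,))

error = None
try:
    # Same shape as the line in the question: the call is evaluated
    # before the % operator gets a chance to run.
    idx = fake_date_range(start='%s', end='%s') % ('2016-01-01', '2016-01-31')
except ValueError as exc:
    error = exc

print(calls)   # the function saw the literal placeholders
print(error)
```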


If you did need to use string formatting for those values, I would suggest the newer str.format style, like

pd.date_range(start='{}'.format(start_date), end='{}'.format(end_date))
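As a quick check of why formatting is optional here, the sketch below shows that both formatting styles hand back the same string when the value is already a string. Formatting only earns its place when the date has to be assembled from parts; the year, month and day variables are made up for illustration.

```python
start_date = '2016-01-01'  # hypothetical user input, already a string

# Both styles are no-ops on a value that is already a string.
old_style = '%s' % (start_date,)
new_style = '{}'.format(start_date)

# Formatting is useful when building the date string from pieces.
year, month, day = 2016, 1, 5
built = '{:04d}-{:02d}-{:02d}'.format(year, month, day)

print(old_style, new_style, built)
```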

2 Comments

The ValueError occurs when something in Pandas finally tries to use the literal strings %s that date_range receives as arguments. Python doesn't even get as far as trying to evaluate the % operator.
this works! - thank you so much for your help. I can't believe I hadn't tried that.
