
I have a problem backward filling a numpy date vector using the current version of pandas. The same code works with an earlier version. The following demonstrates my problem:

The older version (0.7.3) works:

C:\WINDOWS\system32>pip show pandas
Name: pandas
Version: 0.7.3
Summary: Powerful data structures for data analysis and statistics
Home-page: http://pandas.pydata.org
Author: The PyData Development Team
Author-email: [email protected]
License: BSD
Location: c:\program files\python\python27\lib\site-packages
Requires: python-dateutil, numpy

C:\WINDOWS\system32>python
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import datetime as dt
>>> import numpy as np
>>> from pandas import DataFrame
>>> d=np.array([None, None, None, None, dt.now(), None])
>>> b = DataFrame(d)
>>> b.fillna(method='backfill')
                            0
0  2017-04-02 12:21:18.175000
1  2017-04-02 12:21:18.175000
2  2017-04-02 12:21:18.175000
3  2017-04-02 12:21:18.175000
4  2017-04-02 12:21:18.175000
5                        None
>>>

The current version (0.19.2) doesn't work:

C:\WINDOWS\system32>pip show pandas
Name: pandas
Version: 0.19.2
Summary: Powerful data structures for data analysis, time series,and statistics
Home-page: http://pandas.pydata.org
Author: The PyData Development Team
Author-email: [email protected]
License: BSD
Location: c:\program files\python\python27\lib\site-packages
Requires: pytz, python-dateutil, numpy


C:\WINDOWS\system32>python
Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:24:40) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import datetime as dt
>>> import numpy as np
>>> from pandas import DataFrame
>>> d=np.array([None, None, None, None, dt.now(), None])
>>> b = DataFrame(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\Python\Python27\lib\site-packages\pandas\core\frame.py", line 297, in __init__
    copy=copy)
  File "C:\Program Files\Python\Python27\lib\site-packages\pandas\core\frame.py", line 474, in _init_ndarray
    return create_block_manager_from_blocks([values], [columns, index])
  File "C:\Program Files\Python\Python27\lib\site-packages\pandas\core\internals.py", line 4256, in create_block_manager_from_blocks
    construction_error(tot_items, blocks[0].shape[1:], axes, e)
  File "C:\Program Files\Python\Python27\lib\site-packages\pandas\core\internals.py", line 4230, in construction_error
    if block_shape[0] == 0:
IndexError: tuple index out of range
>>>

Am I doing something wrong, or is it, as I think, a bug in pandas? If it's a bug, how do I report it?

EDIT: This was filed as a bug report with pandas and will be fixed in the next minor release (0.19.3).

2 Answers

2

DataFrame(d) fails, and I'm not sure why, but Series(d) works, so you can do this:

pd.DataFrame({0:d})

That is, explicitly tell pandas that d is a single column named 0, which is what it was implicitly doing in the ancient 0.7 version.
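A minimal sketch of this workaround, written out in full. The fixed timestamp `stamp` is an assumption used in place of `datetime.now()` so the result is reproducible, and the `Series(d).to_frame()` variant is just another spelling of the same idea, not something from the question:

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Fixed timestamp (an assumption, standing in for datetime.now()).
stamp = datetime(2017, 4, 2, 12, 21, 18)

# Object-dtype array that mixes None with a datetime, as in the question.
d = np.array([None, None, None, None, stamp, None])

# Workaround: pass the array as an explicitly named column.
from_dict = pd.DataFrame({0: d})

# Equivalent route through a Series, which the answer notes is unaffected.
from_series = pd.Series(d).to_frame()

print(from_dict.shape, list(from_dict.columns))
print(from_series.shape, list(from_series.columns))
```

Both routes produce a six-row frame with a single column labelled 0, without going through the 2-D ndarray construction path shown in the traceback.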

If you do want to report a bug, you can simply say that this works:

pd.DataFrame([None, None, datetime.datetime.now()])

But this fails:

pd.DataFrame([None, None, None, datetime.datetime.now()])
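For a bug report, it can help to attach a small script that records the pandas version together with the outcome of each case. The following is a sketch only; `try_build` is a hypothetical helper, and which cases fail depends on the installed version:

```python
import datetime

import pandas as pd


def try_build(values):
    """Return 'ok' if DataFrame(values) succeeds, else the exception name."""
    try:
        pd.DataFrame(values)
    except Exception as exc:  # broad on purpose: we only record the outcome
        return type(exc).__name__
    return "ok"


now = datetime.datetime.now()

# Key = number of leading None values before the datetime.
results = {n: try_build([None] * n + [now]) for n in (2, 3)}

print("pandas", pd.__version__, results)
```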

2 Comments

Thanks for that answer. Are you able to tell me where to report this? Can you also explain what is going on in your test? Why does one work and the other not?
@RichardB: The Pandas issue tracker is here: github.com/pandas-dev/pandas/issues
0

Try to specify (or cast) the dtype explicitly:

In [18]: d=np.array([None, None, None, None, pd.datetime.now(), None])

In [19]: b = DataFrame(d.astype('datetime64[ms]'))

In [20]: b
Out[20]:
                        0
0                     NaT
1                     NaT
2                     NaT
3                     NaT
4 2017-04-02 20:34:20.381
5                     NaT

In [21]: b.bfill()
Out[21]:
                        0
0 2017-04-02 20:34:20.381
1 2017-04-02 20:34:20.381
2 2017-04-02 20:34:20.381
3 2017-04-02 20:34:20.381
4 2017-04-02 20:34:20.381
5                     NaT
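The same idea can be sketched with `pd.to_datetime` instead of a NumPy `astype` cast. This is an alternative under stated assumptions, not the code from the session above: the fixed `stamp` replaces `now()` for reproducibility, and `bfill()` is used because recent pandas releases deprecate the `method=` argument of `fillna`.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Fixed timestamp (an assumption, standing in for now()).
stamp = datetime(2017, 4, 2, 20, 34, 20)
d = np.array([None, None, None, None, stamp, None])

# Convert explicitly to a datetime64 column; None becomes NaT.
b = pd.to_datetime(pd.Series(d)).to_frame()

# Backward fill: each NaT takes the next valid value below it.
filled = b.bfill()

print(b[0].dtype)
print(filled)
```

The last row stays NaT because there is no later value to fill from.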
