On my Debian Squeeze system, I ran into a Python problem that can be distilled to the following:
import numpy
import datetime
from matplotlib import pyplot
x = [datetime.datetime.utcfromtimestamp(i) for i in numpy.arange(100000,200000,3600)]
y = range(len(x))
# See matplotlib handle a series of datetimes just fine..
pyplot.plot(x, y)
# [<matplotlib.lines.Line2D object at 0xad10f4c>]
import pandas
# Now we try exactly what we did before..
pyplot.plot(x, y)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/pymodules/python2.6/matplotlib/pyplot.py", line 2141, in plot
ret = ax.plot(*args, **kwargs)
File "/usr/lib/pymodules/python2.6/matplotlib/axes.py", line 3432, in plot
for line in self._get_lines(*args, **kwargs):
File "/usr/lib/pymodules/python2.6/matplotlib/axes.py", line 311, in _grab_next_args
for seg in self._plot_args(remaining, kwargs):
File "/usr/lib/pymodules/python2.6/matplotlib/axes.py", line 288, in _plot_args
x, y = self._xy_from_xy(x, y)
File "/usr/lib/pymodules/python2.6/matplotlib/axes.py", line 204, in _xy_from_xy
bx = self.axes.xaxis.update_units(x)
File "/usr/lib/pymodules/python2.6/matplotlib/axis.py", line 982, in update_units
self._update_axisinfo()
File "/usr/lib/pymodules/python2.6/matplotlib/axis.py", line 994, in _update_axisinfo
info = self.converter.axisinfo(self.units, self)
File "/usr/local/lib/python2.6/dist-packages/pandas/tseries/converter.py", line 184, in axisinfo
majfmt = PandasAutoDateFormatter(majloc, tz=tz)
File "/usr/local/lib/python2.6/dist-packages/pandas/tseries/converter.py", line 195, in __init__
dates.AutoDateFormatter.__init__(self, locator, tz, defaultfmt)
TypeError: __init__() takes at most 3 arguments (4 given)
I'm not interested in the cause of this particular error. It is obvious enough that pandas expected a different version of matplotlib; that is a fair risk of getting one package from the standard Debian repository and the other through pip, and I have already 'solved' that part of the problem by letting pip upgrade matplotlib.
The real issue is a threefold question. How can merely importing pandas break matplotlib's ability to handle datetime objects, when just two lines earlier pandas was clearly not even involved in that same operation? Does pandas, upon import, silently alter other modules in the top-level namespace to force them to use pandas methods? And is this acceptable behaviour for a Python module? I need to be able to rely on the fact that importing, say, a random number module won't silently change, say, the pickle module to apply a random salt to everything it writes.
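To make the question concrete, here is a minimal stdlib-only sketch of the kind of mechanism I suspect. It is not taken from pandas or matplotlib: `plotlib`, `framelib`, the `registry` dict and `convert` are all hypothetical stand-ins, and the second "library" is built from a source string so the example fits in one file. It only shows that top-level code in one module can change how an already-imported module behaves.

```python
import sys
import types
import datetime

# Hypothetical stand-in for a plotting library that keeps a shared,
# module-level registry mapping data types to converter functions.
plotlib = types.ModuleType("plotlib")
plotlib.registry = {datetime.datetime: lambda value: "plotlib converter"}


def convert(value):
    # The converter is looked up at call time, so later changes to the
    # registry affect every subsequent call.
    return plotlib.registry[type(value)](value)


plotlib.convert = convert
sys.modules["plotlib"] = plotlib

# Hypothetical second library. Its top-level code, which runs on import,
# replaces the datetime entry in the first library's registry.
FRAMELIB_SOURCE = '''
import datetime
import plotlib
plotlib.registry[datetime.datetime] = lambda value: "framelib converter"
'''


def import_framelib():
    # Simulate "import framelib" by executing the source in a new module.
    module = types.ModuleType("framelib")
    exec(FRAMELIB_SOURCE, module.__dict__)
    sys.modules["framelib"] = module
    return module


stamp = datetime.datetime(2000, 1, 1)
before = plotlib.convert(stamp)   # uses plotlib's own converter
import_framelib()                 # the "import" with a side effect
after = plotlib.convert(stamp)    # same call, different behaviour

print(before)
print(after)
```

If something like this is what happens, the same `convert` call gives a different result before and after the import, even though the calling code never mentions the second library.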
Update with further information
Python is 2.6.6 (current Debian stable, from package 2.6.6-3+squeeze7)
matplotlib version was Debian's 0.99.3-1 (current Debian stable, from package python-matplotlib)
pandas version was 0.9.0 (installed with 'pip install pandas' a while ago, not today)
Platform is an i386 running Debian Squeeze
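For anyone comparing against their own setup, the versions above can be collected with something like the following sketch. The helper name `report_versions` is my own, and it assumes each package exposes a `__version__` attribute, falling back to "unknown" otherwise. Note that it imports pandas, so run it in a separate session from the reproduction.

```python
import sys


def report_versions(names=("numpy", "matplotlib", "pandas")):
    # Collect the interpreter version plus the version of each named package.
    versions = {"python": sys.version.split()[0]}
    for name in names:
        try:
            module = __import__(name)
        except ImportError:
            versions[name] = "not installed"
        else:
            # Assumption: the package exposes __version__.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


for name, version in sorted(report_versions().items()):
    print("%s: %s" % (name, version))
```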
Steps to replicate
- (obvious) Bootstrap a clean debian squeeze i386 installation and chroot into it.
- apt-get update
- apt-get install python python-matplotlib
- apt-get install python-pip build-essential python-dev
- pip install --upgrade numpy
- pip install pandas
Now start an interactive Python session:
import numpy
import datetime
# Next two lines added to original example to avoid hassle with DISPLAY in chroot
import matplotlib
matplotlib.use('agg')
from matplotlib import pyplot
x = [datetime.datetime.utcfromtimestamp(i) for i in numpy.arange(100000,200000,3600)]
y = range(len(x))
pyplot.plot(x, y)
import pandas
pyplot.plot(x, y)