Previously, I saved several columns of a dataset into one HDF file. The procedure can be outlined as follows:
import pandas as pd
from pandas import HDFStore

hdf = HDFStore("FILE.h5")
feature = ['var1', 'var2']
# Note: the original dataframes are huge, so a small fake dataframe is generated as an example.
for k in range(len(feature)):
    df = pd.DataFrame({'A': ['1', '2', '3', '4'], 'B': [4, 5, 6, 7]})
    hdf.put(feature[k], df, format='table', encoding='utf-8')
hdf.close()
Then, I could read the file 'FILE.h5' simply with
df = pd.read_hdf("./FILE.h5", 'var1', encoding='utf-8')
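For reference, the whole write-and-read round trip works when the file is both written and read under the same Python 3 environment. A minimal, self-contained sketch of the steps above (the temporary-file path is just for illustration):

```python
import os
import tempfile

import pandas as pd

features = ["var1", "var2"]
path = os.path.join(tempfile.mkdtemp(), "FILE.h5")

# Write one small example frame per feature key.
with pd.HDFStore(path) as hdf:
    for key in features:
        df = pd.DataFrame({"A": ["1", "2", "3", "4"], "B": [4, 5, 6, 7]})
        hdf.put(key, df, format="table", encoding="utf-8")

# Read a single key back by name.
out = pd.read_hdf(path, "var1")
# out.shape is (4, 2), with columns A and B
```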
It always worked well until I upgraded my Python environment from 2.7 to 3.7.
Now, with Python 3.7 and pandas 0.24.2, the HDF file cannot be read correctly. The error looks like this:
df = pd.read_hdf("./FILE.h5", 'var1', encoding='utf-8')
>>> ...
~/anaconda3/lib/python3.7/codecs.py in getdecoder(encoding)
961
962 """
--> 963 return lookup(encoding).decode
964
965 def getincrementalencoder(encoding):
TypeError: lookup() argument must be str, not numpy.bytes_
PS: I have read a GitHub issue that described a similar situation, but it did not fix my problem. I then turned to the h5py package to deal with HDF5-format files, but it was not as convenient as pandas.
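For what it is worth, the TypeError suggests that an `encoding` attribute stored in the file by the Python 2 writer is a byte string, which `codecs.lookup()` in Python 3 rejects. One possible workaround is to decode stale byte-valued attributes in place with PyTables before reading. This is only a sketch; the attribute layout of pandas' HDF files is an assumption on my part, so back up the file before modifying it:

```python
import tables


def fix_byte_encodings(path):
    """Decode byte-valued attributes (possibly left behind by a Python 2
    writer) into str so that Python 3 pandas can look up the codec.
    Assumption: the stale attributes are plain bytes holding ASCII text."""
    with tables.open_file(path, mode="r+") as f:
        for node in f.walk_nodes():
            attrs = node._v_attrs
            for name in list(attrs._v_attrnames):
                value = attrs[name]
                if isinstance(value, bytes):
                    # Rewrite the attribute as a proper Python 3 str.
                    attrs[name] = value.decode("utf-8")
```

After running `fix_byte_encodings("FILE.h5")`, `pd.read_hdf("./FILE.h5", 'var1')` may succeed; if it still fails, the remaining offending attribute would need to be located by inspecting the file by hand.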