hdf5 file to pandas dataframe

Question

I downloaded a dataset which is stored in .h5 files. I need to keep only certain columns and to be able to manipulate the data in it.

To do this, I tried to load it in a pandas dataframe. I've tried to use:

pd.read_hdf(path)

But I get: No dataset in HDF5 file.

I've found answers on SO (read HDF5 file to pandas DataFrame with conditions), but I don't need conditions. The answer there also depends on how the file was written, and since I'm not the creator of the file, I can't do anything about that.

I've also tried using h5py:

df = h5py.File(path)

But the result is not easy to manipulate, and I can't seem to get the columns out of it (only the column names, using df.keys()). Any idea how to do this?

I think my answer here and the provided links might be of help to you: stackoverflow.com/a/74127100/5838180
– NeStack, Commented Oct 20, 2022 at 11:53

im2527 · Accepted Answer · 2019-10-03 02:12:22Z

17

The easiest way to read them into pandas is to open the file with h5py, convert the dataset you want to an np.array, and then build a DataFrame from it. It would look something like:

df = pd.DataFrame(np.array(h5py.File(path)['variable_1']))
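Since the question asks about keeping only certain columns, here is a minimal sketch of the same idea extended to several datasets. It assumes each requested name is a 1-D dataset stored at the root of the file and that all of them have the same length; the helper name `h5_to_dataframe` and the dataset names in the usage line are hypothetical, and a file laid out differently (nested groups, 2-D arrays) would need adjusting.

```python
import h5py
import pandas as pd


def h5_to_dataframe(path, columns):
    """Load the named datasets from an HDF5 file as DataFrame columns.

    Assumes every name in `columns` is a 1-D dataset at the file root
    and that all of them have the same length.
    """
    # Open read-only; the context manager closes the file on exit.
    with h5py.File(path, "r") as f:
        # ds[()] reads the whole dataset into memory as a NumPy array,
        # so the data stays valid after the file is closed.
        data = {name: f[name][()] for name in columns}
    return pd.DataFrame(data)
```

Usage would look like `df = h5_to_dataframe(path, ["variable_1", "variable_2"])`, where the names come from whatever `keys()` reports for your file.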

answered Oct 3, 2019 at 2:12

im2527


2 Comments

Patrick_Chong Over a year ago

What does 'variable_1' represent in this case? - I am having the same issue opening .h5 files

kSureja Over a year ago

variable_1 would be a single dataset. A file in h5py is supposed to behave like a dictionary of labeled datasets. So "h5py.File(path)" returns a dictionary-like object, and accessing ["variable_1"] on it returns the value for the key "variable_1", which is a single dataset.
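If you don't know which names a file contains, a short sketch like the following can list them. It walks the file with h5py's `visititems` and records the shape and dtype of every dataset, including ones nested inside groups; the helper name `list_datasets` is hypothetical.

```python
import h5py


def list_datasets(path):
    """Return {name: (shape, dtype)} for every dataset in the file."""
    found = {}

    def visit(name, obj):
        # visititems calls this for each group and dataset below the root;
        # `name` is the path relative to the root, e.g. "group/dataset".
        if isinstance(obj, h5py.Dataset):
            found[name] = (obj.shape, obj.dtype)

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found
```

Any name it reports can then be used in place of 'variable_1' above.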

Community · Accepted Answer · 2017-05-23 11:52:53Z

6

Pandas HDF support needs the HDF file to be formatted very specifically. See https://stackoverflow.com/a/33644128/4128030 for more info.

edited May 23, 2017 at 11:52

CommunityBot


answered Jan 11, 2017 at 18:33

drj


1 Comment

Jérôme Over a year ago

Yes. More about this here as well.
