There may be a simpler way, but this is how you'd go about doing it, as far as I know:
import numpy as np
import tables
# Generate some data
x = np.random.random((100,100,100))
# Store "x" in a chunked array...
f = tables.open_file('test.hdf', 'w')
atom = tables.Atom.from_dtype(x.dtype)
ds = f.create_carray(f.root, 'somename', atom, x.shape)
ds[:] = x
f.close()
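For completeness, reading it back works the same way: reopen the file and slice the node. (This part isn't in the recipe above, but it's the standard PyTables pattern.)
# Read "x" back in, either all at once or just a slice
f = tables.open_file('test.hdf', 'r')
y = f.root.somename[:]        # loads the whole array into memory
z = f.root.somename[0, :, :]  # or read only one 100x100 slab from disk
f.close()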
If you want to specify the compression to use, have a look at tables.Filters. E.g.
import numpy as np
import tables
# Generate some data
x = np.random.random((100,100,100))
# Store "x" in a chunked array with level 5 BLOSC compression...
f = tables.open_file('test.hdf', 'w')
atom = tables.Atom.from_dtype(x.dtype)
filters = tables.Filters(complib='blosc', complevel=5)
ds = f.create_carray(f.root, 'somename', atom, x.shape, filters=filters)
ds[:] = x
f.close()
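Compression is transparent on the way back out, too; slicing the node only decompresses the chunks it touches, so reading looks exactly like the uncompressed case. tables.Filters has a few other knobs as well. For instance (the values here are just illustrative, pick whatever suits your data):
# e.g. zlib at max level, with byte-shuffling and a fletcher32 checksum
filters = tables.Filters(complib='zlib', complevel=9, shuffle=True, fletcher32=True)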
There's probably a simpler way for a lot of this... I haven't used pytables for anything other than table-like data in a long while.
Note: the examples above use the PyTables 3.0 names; in earlier versions, f.create_carray was called f.createCArray. As of 3.0, it can also accept the array directly, without specifying the atom:
f.create_carray('/', 'somename', obj=x, filters=filters)
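So, with 3.0 or later, the whole compressed example above collapses to something like this:
import numpy as np
import tables

x = np.random.random((100, 100, 100))
f = tables.open_file('test.hdf', 'w')
# "obj=x" infers the atom and shape from the array itself
f.create_carray('/', 'somename', obj=x,
                filters=tables.Filters(complib='blosc', complevel=5))
f.close()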
You might also consider using h5py instead of pytables. It's as simple as f.create_dataset('name', data=x), where x is your numpy array and f is the open hdf file. Doing the same thing in pytables is possible, but considerably more difficult. On the other hand, h5py doesn't do the table-like things that pytables does (or at least not that I know of, anyway); pytables will also give you lots of nice querying abilities. h5py is better suited to straight-up storage and slicing of on-disk arrays (and is more pythonic, i.m.o., too). Not to plug my own answer too much, but my thoughts on the tradeoff between the two are here: stackoverflow.com/questions/7883646/…
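For reference, here's the h5py version of the example (a minimal sketch; the dataset name is chosen to match the examples above, and the compression keywords are optional):
import h5py

f = h5py.File('test.hdf', 'w')
f.create_dataset('somename', data=x)  # or add compression='gzip', compression_opts=5
f.close()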