Assume that I have a big float numpy array:
How can I save this float numpy array to a binary file that takes less storage, using numpy.save?
np.save(nucleosomenpyOutputFilePath,averageSignalArray)
Thanks @hpaulj. You opened my eyes.
By playing with dtype=np.float32 or dtype=np.float16 in the following statements
averageSignalArray=np.divide(accumulatedSignalArray,accumulatedCountArray,dtype=np.float32)
averageSignalArray=np.divide(accumulatedSignalArray,accumulatedCountArray,dtype=np.float16)
I got arrays with different dtypes and saved them in the following step:
np.save(nucleosomenpyOutputFilePath,averageSignalArray)
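To see how much the dtype matters on disk, here is a minimal sketch that saves the same quotient as float64, float32 and float16 and compares the file sizes. The array names, the random data and the temporary paths are made up for illustration; they are not the arrays from the question.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-ins for the accumulated arrays in the question.
rng = np.random.default_rng(0)
accumulated_signal = rng.random(100_000)
accumulated_count = rng.integers(1, 10, size=100_000).astype(np.float64)

sizes = {}
with tempfile.TemporaryDirectory() as tmp:
    for dtype in (np.float64, np.float32, np.float16):
        # The dtype argument selects the precision of the computation and result.
        average = np.divide(accumulated_signal, accumulated_count, dtype=dtype)
        name = np.dtype(dtype).name
        path = os.path.join(tmp, "average_" + name + ".npy")
        np.save(path, average)
        sizes[name] = os.path.getsize(path)

for name, size in sizes.items():
    print(name, size, "bytes")
```

Each step down halves the payload, at the cost of precision: float16 keeps only about three significant decimal digits and overflows above 65504, so it is worth checking that the values still fit.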
If your goal is just to reduce the size of the resulting files and you can install additional Python packages, you can use compressed arrays: https://github.com/Blosc/bcolz
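If installing an extra package is not an option, NumPy itself can write a zip-compressed archive with np.savez_compressed. This is a different technique from bcolz, shown only as a sketch; the data and file names below are hypothetical, and how much you gain depends entirely on how repetitive the array is.

```python
import os
import tempfile

import numpy as np

# Hypothetical data: a repetitive signal compresses well; random noise would not.
signal = np.repeat(np.arange(1000, dtype=np.float32), 100)

with tempfile.TemporaryDirectory() as tmp:
    raw_path = os.path.join(tmp, "signal.npy")
    packed_path = os.path.join(tmp, "signal.npz")

    np.save(raw_path, signal)                        # uncompressed .npy
    np.savez_compressed(packed_path, signal=signal)  # compressed .npz archive

    raw_size = os.path.getsize(raw_path)
    packed_size = os.path.getsize(packed_path)

    # Arrays in an .npz archive are looked up by the keyword used when saving.
    with np.load(packed_path) as archive:
        restored = archive["signal"]

round_trip_ok = np.array_equal(signal, restored)
print(raw_size, packed_size, round_trip_ok)
```

The compression is lossless, so unlike switching to float16 the values come back unchanged.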
Probably one of the fastest and most space-efficient ways of doing this is by using Bloscpack:
https://github.com/blosc/bloscpack
You can read about using the Python API here:
https://github.com/blosc/bloscpack#python-api
And lastly, here is an example:
>>> import numpy as np
>>> import bloscpack as bp
>>> a = np.linspace(0, 1, int(3e8))  # num must be an integer
>>> print(a.size, a.dtype)
300000000 float64
>>> bp.pack_ndarray_to_file(a, 'a.blp')
>>> b = bp.unpack_ndarray_from_file('a.blp')
>>> (a == b).all()
True
dtype?
averageSignalArray=np.divide(accumulatedSignalArray,accumulatedCountArray) I guess its dtype=float.
dtype? float32 and float64, 4 and 8 bytes per element.
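The per-element sizes can be checked directly from the dtype. A small sketch (the array here is a hypothetical placeholder):

```python
import numpy as np

# Bytes per element for each floating-point dtype.
item_sizes = {name: np.dtype(name).itemsize for name in ("float16", "float32", "float64")}
print(item_sizes)

# dtype=float is Python's float, which NumPy maps to float64.
a = np.zeros(1000, dtype=float)
print(a.dtype, a.nbytes)
```

nbytes is the size of the raw data in memory; a file written by np.save is that plus a small header.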