As NumPy is implemented in C, I don't think a pure Python solution is possible, but you can avoid the file system by using StringIO. Using the built-in function np.savez_compressed, we can then compare the resulting size against np.savez:
import StringIO
import numpy as np

def get_compression_ratio(a):
    # Write the array to two in-memory buffers, one compressed, one not
    uncompressed = StringIO.StringIO()
    compressed = StringIO.StringIO()
    np.savez_compressed(compressed, a)
    np.savez(uncompressed, a)
    # Ratio of uncompressed to compressed size
    # (Python 2 StringIO objects expose the number of bytes written as .len)
    return uncompressed.len / float(compressed.len)

a = np.zeros([1000, 1000])
a[23, 60] = 1.
b = np.random.random([1000, 1000])

print("one number = ", get_compression_ratio(a),
      "random = ", get_compression_ratio(b))
with the result (under Python 2 the print above outputs a tuple):
('one number = ', 1001.0255255255255, 'random = ', 1.0604228730260878)
As the random numbers are essentially incompressible, a ratio close to 1 makes sense, although you might expect the array with a single non-zero value to compress even better than ~1000x. The result relies on the compression algorithm used by savez_compressed being efficient/correct.
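Note that the snippet above is Python 2 (the StringIO module and its .len attribute). A minimal sketch of the same measurement under Python 3, assuming io.BytesIO buffers and their getbuffer().nbytes sizes, would be:

import io
import numpy as np

def get_compression_ratio(a):
    # np.savez writes bytes, so in-memory binary buffers stand in for files
    uncompressed = io.BytesIO()
    compressed = io.BytesIO()
    np.savez(uncompressed, a)
    np.savez_compressed(compressed, a)
    # getbuffer().nbytes reports how many bytes were written to each buffer
    return uncompressed.getbuffer().nbytes / compressed.getbuffer().nbytes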
Regarding a = np.array([1, 2, 0, 3, 4, 0, 5]): does it even make sense to ask for the "compression ratio" of a? If you have something more specific in mind, please update the question. np.array([1, 1, 1, 1, 1, 1, 1]) is more compressible than np.array([1, 2, 0, 3, 4, 0, 5]), for example.
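To make that comparison concrete, here is a hedged sketch reusing get_compression_ratio from above on those two small arrays. For only 7 elements the fixed overhead of the npz container dominates the measurement, so repeating the arrays (e.g. with np.tile, an illustrative choice here) gives the compressor something meaningful to work with:

c = np.array([1, 1, 1, 1, 1, 1, 1])
d = np.array([1, 2, 0, 3, 4, 0, 5])

# At this size the npz header overhead swamps any difference in the data
print("constant (tiny) = ", get_compression_ratio(c),
      "mixed (tiny) = ", get_compression_ratio(d))

# Repeating each pattern many times; the constant array should compress
# at least as well as the repeated mixed pattern
print("constant (tiled) = ", get_compression_ratio(np.tile(c, 100000)),
      "mixed (tiled) = ", get_compression_ratio(np.tile(d, 100000)))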