I am using numpy.asarray in my project to handle arrays because it is far more efficient than default Python lists. I also need to watch memory usage when allocating the array, because my program can receive gigabytes of data. While checking numpy.asarray, I found that the data type is inferred from the input unless it is stated explicitly. I build the following array:

np.asarray([list(map(int, list(x))) for x in X])

When I print X.dtype, I get int64. Since the array X always contains binary values, 0 or 1, I thought of using dtype=np.int8 to reduce the memory needed when allocating it. But I am not sure whether this is a good idea. Should I stick with the default int64? Could int8 lose precision in some way I have not thought of?

Thank you.

1 Answer

From the NumPy manual:

Array types and conversions between types

Data type    Description

...
int8         Byte (-128 to 127)
...

If you are only going to put binary values in the array, then int8 will be just fine: 0 and 1 are well within its range of -128 to 127, so you won't lose any precision.
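To make the saving concrete, here is a minimal sketch that builds the same array with both dtypes and compares their sizes. The sample strings in X are made up for illustration; the construction mirrors the one in the question.

```python
import numpy as np

# Hypothetical sample input: each string is one row of binary digits.
X = ["0101", "1100", "0011"]

rows = [list(map(int, x)) for x in X]

a64 = np.asarray(rows, dtype=np.int64)  # 8 bytes per element
a8 = np.asarray(rows, dtype=np.int8)    # 1 byte per element

print(a64.nbytes, a8.nbytes)            # total bytes used by each array's data
print(np.array_equal(a64, a8))          # the stored values are identical
```

One thing to keep in mind: element-wise arithmetic between int8 arrays stays in int8, so results outside -128 to 127 wrap around. Storing 0s and 1s is safe; accumulating large values in the same dtype may not be.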


You could even set the data type to bool_. It is also stored as one byte per element, so it uses the same memory as int8 rather than less, but it states the intent more clearly and still behaves like an integer in arithmetic.

>>> import numpy as np
>>> x = np.asarray([1,0,1,0], dtype=np.bool_)
>>> x
array([ True, False,  True, False], dtype=bool)
>>> x + 2
array([3, 2, 3, 2])
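Applied to the construction from the question, using bool_ is just a matter of passing it as the dtype. The following is a sketch with made-up sample strings in X, not the asker's real data.

```python
import numpy as np

# Hypothetical sample input: each string is one row of binary digits.
X = ["0101", "1100"]

flags = np.asarray([list(map(int, x)) for x in X], dtype=np.bool_)

print(flags.dtype)                 # boolean dtype
print(flags.itemsize)              # bytes per element
print(np.dtype(np.int8).itemsize)  # same per-element size as bool_

# Summing a boolean array counts the True entries.
ones = int(flags.sum())
print(ones)
```

Since bool_ and int8 both take one byte per element, the choice between them is mostly about readability and which operations you need afterwards.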

1 Comment

Thank you very much. Can you explain how to implement bool_ ?
