numpy array metadata change

Question

I know that numpy stores numbers in contiguous memory. So is it possible to take

a = np.array([127,127,127,127,127,127,127,127], dtype=np.uint8)

the binary representation of 'a' is all ones

to this:

b = np.array([72057594037927935], dtype=np.uint64)

and to go back again from b to a?

The binary representation is all ones; the only difference is that the eight elements are combined into one single 64-bit int. The underlying representation in NumPy should stay the same, and only the metadata should change.

This sounds like a job for stride tricks, and my best guess is:

np.lib.stride_tricks.as_strided(a, shape=(1,), strides=(8,8))

and

np.lib.stride_tricks.as_strided(b, shape=(8,), strides=(1,8))

but both calls fail with ValueError: mismatch in length of strides and shape

This only needs to be read-only, so I am under no illusion that I need to change the data.

See also a.tobytes(), which contains \x7f rather than \xff blocks.
– Andras Deak -- Слава Україні, Commented Feb 27, 2020 at 10:40
Copyist mistake: 0b1111111 (seven 1s, times 8). I was missing a leading zero to start with. But there it is ... I wonder if I will change this.
– Back2Basics, Commented Feb 27, 2020 at 10:46

Andras Deak -- Слава Україні · Accepted Answer · 2020-02-27 10:43:53Z

If you want to reinterpret the existing data in an array you need numpy.ndarray.view. That's the main difference between .astype and .view (i.e. the former converts to a new type with the values being preserved, while the latter maintains the same memory and changes how it's interpreted):

import numpy as np 

a = np.array([127,127,127,127,127,127,127,127], dtype=np.uint8)
b = a.view(np.uint64) 
print(a) 
print(b) 
print(b.view(np.uint8))

This outputs

[127 127 127 127 127 127 127 127]
[9187201950435737471]
[127 127 127 127 127 127 127 127]
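To make the contrast with .astype concrete, here is a minimal sketch (the names converted and reinterpreted are only illustrative). It compares shapes, buffer sizes and memory sharing for the two calls:

```python
import numpy as np

a = np.array([127] * 8, dtype=np.uint8)

converted = a.astype(np.uint64)    # new buffer, each value widened to 64 bits
reinterpreted = a.view(np.uint64)  # same 8 bytes, read as one 64-bit integer

print(converted.shape, converted.nbytes)          # (8,) 64
print(reinterpreted.shape, reinterpreted.nbytes)  # (1,) 8
print(np.shares_memory(a, converted))             # False
print(np.shares_memory(a, reinterpreted))         # True
```

Changing the item size with .view generally needs a contiguous array whose last axis holds a whole number of the new items, which is the case here (8 bytes for one uint64).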

Note that 127 stored as a uint8 has a leading zero in its binary pattern (0b01111111), so it's not all ones, which is why the value we get in b is different from what you expect:

>>> bin(b[0])
'0b111111101111111011111110111111101111111011111110111111101111111'

>>> bin(72057594037927935)
'0b11111111111111111111111111111111111111111111111111111111'

What you seem to have assumed is a set of "uint7" values made up entirely of one bits...
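If it helps to see the leading zero directly, the following sketch inspects the raw bytes and bits of the array. The all_ones array is a hypothetical counterexample added for illustration; it is not part of the question:

```python
import numpy as np

a = np.array([127] * 8, dtype=np.uint8)

# Raw bytes: every byte is 0x7f, not 0xff.
print(a.tobytes())  # b'\x7f\x7f\x7f\x7f\x7f\x7f\x7f\x7f'

# One row of 8 bits per element, most significant bit first.
bits = np.unpackbits(a).reshape(8, 8)
print(bits[0])      # [0 1 1 1 1 1 1 1]

# Only 255 is all ones in 8 bits; eight of them fill a uint64 completely.
all_ones = np.full(8, 255, dtype=np.uint8)
print(int(all_ones.view(np.uint64)[0]) == np.iinfo(np.uint64).max)  # True
```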

Anyway, the best part about .view is that the exact same block of memory will be used unless you explicitly copy:

>>> b.base is a
True

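The corollary, of course, is that mutating b will affect a (the output below is from a little-endian machine, where the first byte of a is the least significant byte of b):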

>>> b += 3

>>> a
array([130, 127, 127, 127, 127, 127, 127, 127], dtype=uint8)
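Since the question only asks for read-only access, one option is to lock the view so that accidental writes through it are rejected. This is a sketch of that idea rather than something the question requires; it only sets the writeable flag on the view, and a itself stays writable:

```python
import numpy as np

a = np.array([127] * 8, dtype=np.uint8)
b = a.view(np.uint64)
b.flags.writeable = False  # lock the view only; `a` is not affected

try:
    b += 3                 # in-place write through the locked view
except ValueError as exc:
    print("write rejected:", exc)

print(a)  # still all 127, nothing was modified
```

Writes made directly to a would still show up in b, because both arrays keep sharing the same buffer.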

To control endianness you'd want to use string-valued dtype specifications, i.e. a.view('<u8') (little endian) or a.view('>u8') (big endian). We can use this to reproduce the faulty number in your question:

>>> a2 = np.array([0] + [255] * 7, dtype=np.uint8)
>>> a2.view('>u8')
array([72057594037927935], dtype=uint64)
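As a cross-check on the byte order, this sketch reads the same eight bytes both ways and compares the results against Python's int.from_bytes. The variable names big and little are just illustrative:

```python
import sys
import numpy as np

a2 = np.array([0] + [255] * 7, dtype=np.uint8)

big = int(a2.view('>u8')[0])     # first byte is the most significant
little = int(a2.view('<u8')[0])  # first byte is the least significant

print(sys.byteorder)  # native order of this machine: 'little' or 'big'
print(big)            # 72057594037927935
print(little)         # 18446744073709551360
print(int.from_bytes(a2.tobytes(), 'big') == big)        # True
print(int.from_bytes(a2.tobytes(), 'little') == little)  # True

# Going back from the 64-bit view to bytes recovers the original array.
print(a2.view('>u8').view(np.uint8))  # [  0 255 255 255 255 255 255 255]
```

A plain a2.view(np.uint64) uses the native byte order, so it matches whichever of the two explicit forms corresponds to sys.byteorder.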
