
I created this NumPy array and stored it on disk as follows:

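import numpy as np

s = (b'foo', b'bar', b'baz', b'buzz')

def build_numpy_array():
    # 200 identical records, each with four 40-byte string fields
    return np.fromiter(
        (s for _ in range(200)),
        dtype=[('foo', 'S40'), ('bar', 'S40'), ('baz', 'S40'), ('buzz', 'S40')],
    )

# np.save appends '.npy', so this writes data.dat.npy
np.save('data.dat', {'data': build_numpy_array()})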

Loading it with np.load('data.dat.npy') works fine.

But I want to use it in memmap mode, and this fails:

np.load('data.dat.npy',mmap_mode='r') 

ValueError: Array can't be memory-mapped: Python objects in dtype.

And this gives oddly shifted, garbled fields:

np.memmap('data.dat.npy',  mode='r',dtype=[('foo','S40'), 
  ('bar', 'S40'), ('baz', 'S40'), 
  ('buzz', 'S40')
  ])
 (b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00bar', b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00baz', b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00buz', b'z\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00foo')

1 Answer

data = build_numpy_array()

is a (200,) structured array.

Saving it the way the question does:

In [152]: np.save('data.dat', {'data': data})

To load that file back, I have to use allow_pickle:

In [157]: x=np.load('data.dat.npy',allow_pickle=True)     

x is a ()-shaped object array. That is, x.item() is a dictionary that holds the array as one of its values.
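A minimal sketch of what that wrapping looks like. The temporary path and the variable names are assumptions for illustration, and np.array stands in for the question's build_numpy_array():

```python
import os
import tempfile

import numpy as np

dtype = [('foo', 'S40'), ('bar', 'S40'), ('baz', 'S40'), ('buzz', 'S40')]
data = np.array([(b'foo', b'bar', b'baz', b'buzz')] * 200, dtype=dtype)

# Hypothetical scratch location; np.save appends '.npy' to this name.
path = os.path.join(tempfile.mkdtemp(), 'data.dat')
np.save(path, {'data': data})

# The dict was pickled, so loading needs allow_pickle=True.
x = np.load(path + '.npy', allow_pickle=True)
wrapped_shape = x.shape    # 0-d array wrapping the dict
wrapped_dtype = x.dtype    # object dtype
inner = x.item()['data']   # the original structured array
```

The object dtype of the wrapper is what mmap_mode rejects; the structured array inside it is fine.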

The problem lies with the save: it's saving a dictionary rather than the array. Saving the array itself instead:

In [161]: np.save('data.dat', data)                                             
In [162]: x=np.load('data.dat.npy')                                             
In [163]: x.shape                                                               
Out[163]: (200,)

Now loading with mmap_mode works:

In [165]: r = np.load('data.dat.npy',mmap_mode='r')                             
In [166]: r                                                                     
Out[166]: 
memmap([(b'foo', b'bar', b'baz', b'buzz'),
        (b'foo', b'bar', b'baz', b'buzz'),
        (b'foo', b'bar', b'baz', b'buzz'),
    ...
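
To confirm that the load really is memory-mapped, something like the following sketch can be used. The temporary path and variable names are assumptions, and np.array stands in for the question's build_numpy_array():

```python
import os
import tempfile

import numpy as np

dtype = [('foo', 'S40'), ('bar', 'S40'), ('baz', 'S40'), ('buzz', 'S40')]
data = np.array([(b'foo', b'bar', b'baz', b'buzz')] * 200, dtype=dtype)

# Hypothetical scratch location for the demo file.
path = os.path.join(tempfile.mkdtemp(), 'data.npy')
np.save(path, data)  # save the array itself, not a dict

r = np.load(path, mmap_mode='r')
is_memmap = isinstance(r, np.memmap)  # memmap subclass, not a plain ndarray
first_record = r[0].tolist()          # fields read back from the mapped file
header_bytes = r.offset               # the data starts after the .npy header
```

The nonzero offset is probably also why the raw np.memmap call in the question gave shifted fields: it maps the file from byte 0, so the .npy header is read as if it were record data.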

2 Comments

Thanks for the response. I see the issue. But is the load actually happening in memory-map mode?
Apparently memory-map mode cannot handle the object dtype, i.e. the pickled dictionary. But it works if we simply save data rather than a dictionary containing data. The dictionary layout works with np.savez('data.dat', **{'data': data}), but that produces a zip archive.
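
A rough sketch of the np.savez route, assuming a temporary path and illustrative variable names. It keeps the 'data' key without pickling, but the entry comes back as an ordinary in-memory array:

```python
import os
import tempfile

import numpy as np

dtype = [('foo', 'S40'), ('bar', 'S40'), ('baz', 'S40'), ('buzz', 'S40')]
data = np.array([(b'foo', b'bar', b'baz', b'buzz')] * 200, dtype=dtype)

# Hypothetical scratch location for the demo archive.
path = os.path.join(tempfile.mkdtemp(), 'data.npz')
np.savez(path, **{'data': data})  # one .npy entry per keyword, zipped

with np.load(path) as archive:
    keys = list(archive.keys())
    arr = archive['data']         # read out of the zip into memory

is_memmap = isinstance(arr, np.memmap)
```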
