Because a plain dict is awkward to subclass, I built the following MyUserDict on top of UserDict from the Python standard library:

import numpy as np
from collections import UserDict

class MyUserDict(UserDict):
    pass

m_dict = {'a': np.arange(2),
          'b': np.arange(2)}
m_mydict = MyUserDict(m_dict)

Then I ran into a problem. When I turn a list of MyUserDict objects into a NumPy array, the values are lost and only the keys remain:

m_dict_array = np.array([m_dict, m_dict])
m_mydict_array = np.array([m_mydict, m_mydict])
print(m_dict_array)
print(m_mydict_array)

""" The output is
[{'a': array([0, 1]), 'b': array([0, 1])}
 {'a': array([0, 1]), 'b': array([0, 1])}]
[['a' 'b']
 ['a' 'b']]
"""

The solution I found is to add an __array__() method to MyUserDict:

class MyUserDict(UserDict):
    def __array__(self):
        # Prevent NumPy from converting to an array of keys
        return np.array(self.data, dtype=object)

m_mydict = MyUserDict(m_dict)
m_mydict_array = np.array([m_mydict, m_mydict])
print(m_mydict_array)

This does solve my problem. But I quickly realized another potential problem: self.data is a plain dict, so the elements of m_mydict_array might also be plain dicts. So I checked the type as follows:

print(type(m_mydict_array[0]))

The type is the MyUserDict class I defined. This is certainly the result I hoped for, but I have no idea why it turned out this way.

How does __array__() work? Why is the type of m_mydict_array[0] not default 'dict'?

  • dtype=object means (roughly) to use plain Python objects without any conversion. In fact, m_mydict_array[0] is m_mydict is true. Commented Jun 10 at 8:18
  • The interesting point to this question is that with np.array([m_mydict, m_mydict]) the __array__ method is actually invoked, however the resulting array does not contain the return value of __array__ but rather the object itself. Changing the return to a dummy array (e.g. np.arange) does in fact change the result completely. Commented Jun 10 at 9:56
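
The two comments above can be checked directly. The following is only a minimal sketch, not an explanation of NumPy internals: Wrapper is a hypothetical stand-in for MyUserDict, and its calls counter is added purely for illustration. It shows that a dict passed to np.array with dtype=object becomes a zero-dimensional array holding that same dict, and it lets you observe on your own NumPy version whether __array__() is invoked and what ends up in the outer array.

```python
import numpy as np
from collections import UserDict


class Wrapper(UserDict):  # hypothetical stand-in for MyUserDict
    calls = 0  # illustration only: counts __array__ invocations

    def __array__(self, dtype=None, copy=None):
        type(self).calls += 1
        # A dict is not a sequence, so this is a 0-d object array
        return np.array(self.data, dtype=object)


w = Wrapper({'a': np.arange(2)})

inner = w.__array__()          # call the method directly
print(inner.shape)             # ()
print(inner.item() is w.data)  # True: the dict is stored, not copied

outer = np.array([w, w])
print(Wrapper.calls)           # how often __array__ has been invoked so far
print(type(outer[0]), outer[0] is w)  # compare with the comments above
```

The last two lines carry no expected output on purpose, since what they print is exactly the behavior under discussion.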

1 Answer

I am not entirely sure whether this is a bug, but it is unexpected. I opened an issue on GitHub for it.

I reproduce the issue, and work around it, in a different way so that it is more obvious to readers. Replace None with self.data in your case.

The bug can be reproduced with:

import numpy as np

class MyUserDict:  # UserDict not necessary.
    def __array__(self, dtype=None, copy=None):
        return np.array(None, dtype=object)

arr = np.array([MyUserDict()])
print(arr)
print(type(arr[0]))

Output:
[<__main__.MyUserDict object at 0xXX>]
<class '__main__.MyUserDict'>

Expected:
[None]
<class 'NoneType'>

You get the expected value when __array__() returns an array with at least one dimension rather than a zero-dimensional array holding a single item, for example with either of:

    def __array__(self):
        return np.array(None, dtype=object, ndmin=1)

    def __array__(self):
        return np.array([None], dtype=object)
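
To make the difference between "a single item" and "at least one dimension" concrete, here is a small sketch of the shapes involved. It inspects only the arrays that __array__() would return, not the result of the outer np.array call; the variable names are made up for illustration. Note that the np.array keyword for a minimum number of dimensions is ndmin.

```python
import numpy as np

zero_d = np.array(None, dtype=object)            # what the repro returns
one_d_a = np.array(None, dtype=object, ndmin=1)  # forced to one dimension
one_d_b = np.array([None], dtype=object)         # built from a list

print(zero_d.shape, zero_d.ndim)     # () 0
print(one_d_a.shape, one_d_b.shape)  # (1,) (1,)
print(one_d_a[0] is None)            # True
```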

1 Comment

Thanks for your enthusiastic reply. I have read the explanation from 'seberg' on GitHub, but I still do not get the point😂. I don't understand how that explanation relates to the behavior of the __array__() method.
