I have a numpy array-like (as stated here, simply an object that can be used to create a numpy array), and want to create a pandas.Series from it. According to the pandas.Series documentation, the constructor supports array-likes. Now consider the following MWE.

import numpy as np
import pandas as pd

class ArrayLike:
    def __array__(self, dtype = None):
        return np.asarray([0, 1])

a = ArrayLike()
print(pd.Series(a))
print(pd.Series(np.asarray(a)))

This results in

0    <__main__.ArrayLike object at [...]>
dtype: object
0    0
1    1
dtype: int64

This is not what I would expect, since the whole point of the array-like is the ability to convert to a numpy array, so the behaviour when creating the series directly from my ArrayLike seems weird to me.

Is this intentional on the part of pandas, and if so, what is the reasoning behind it? And is there any way to achieve the behaviour of the second statement when calling pd.Series directly on my object?

  • what is your goal? Commented Jul 20, 2022 at 11:44
  • I haven't followed this closely, but I believe the test for an __array__ method is a relatively recent addition to numpy. Even in your linked SO, the original 2016 answers don't mention it, and newer ones talk about it in the context of typing. A Series does have an __array__ method (used by np.asarray(aSeries)). But Series.__init__ is much more complex, creating or using an index as well as the data. Looking for the __array__ method doesn't have the same priority as with np.asarray. Commented Jul 20, 2022 at 16:14
  • The Series docs do not define array like in the same way as np.array. There's no mention of the __array__ method. Commented Jul 20, 2022 at 16:20
  • This may be relevant: github.com/pandas-dev/pandas/issues/41807 Commented Jul 20, 2022 at 17:24
  • @DaniMesejo yes, that is basically the answer I was looking for. So (sadly), pandas does not support array-likes the way I need, and has a different understanding of the term than numpy. Commented Jul 20, 2022 at 17:40

1 Answer

The problem seems to be that pandas first checks whether the passed object is list-like, and if it is not, wraps the object in a list (see source code):

if index is None:
    if not is_list_like(data):
        data = [data]

The later check for the __array__ attribute (see source code) then fails, because at this point data refers to that list rather than to the original object:

if hasattr(data, "__array__"):
    # e.g. dask array GH#38645
    data = np.asarray(data)
else:
    data = list(data)
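
The first of these two checks can be reproduced from the outside. The following is a minimal sketch, assuming that the public helper pandas.api.types.is_list_like behaves like the is_list_like used inside Series.__init__. The class names PlainArrayLike and IterableArrayLike are made up for illustration, and the copy parameter on __array__ is an addition (not in the question) intended to keep newer NumPy versions from warning about the signature.

```python
import numpy as np
from pandas.api.types import is_list_like


class PlainArrayLike:
    """Hypothetical array-like that only defines __array__."""

    def __array__(self, dtype=None, copy=None):
        # copy is accepted but ignored; it is an assumption for NumPy 2 compatibility
        return np.asarray([0, 1], dtype=dtype)


class IterableArrayLike(PlainArrayLike):
    """Same object, but also iterable."""

    def __iter__(self):
        # Delegate to __array__ so the data is defined in one place only
        return iter(np.asarray(self))


# Without __iter__ the object is not list-like, so pandas would wrap it in a list
plain_is_list_like = is_list_like(PlainArrayLike())
# With __iter__ the object counts as list-like and is passed on unchanged
iterable_is_list_like = is_list_like(IterableArrayLike())

print(plain_is_list_like)
print(iterable_is_list_like)
```

If the first value is falsy and the second truthy, that matches the explanation above: only the iterable variant reaches the __array__ check as itself.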

One solution is to define __iter__:

import numpy as np
import pandas as pd

class ArrayLike:
    def __array__(self, dtype = None):
        return np.asarray([0, 1])

    def __iter__(self):
        return iter(np.asarray([0, 1]))

a = ArrayLike()
print(pd.Series(a))

Output

0    0
1    1
dtype: int64

1 Comment

The __iter__ solution sounds quite good, should work in most cases. However, I have some checks inside __array__ (raises an Error otherwise), which I do not want to have inside __iter__. So sadly this does not solve it for me.
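
For a case like the one in this comment, where the checks should stay in __array__ only, the explicit conversion from the question's second statement can be wrapped in a small helper. This is only a sketch under stated assumptions: CheckedArrayLike, its emptiness check, and series_from_array_like are hypothetical names invented here, not part of pandas or of the original code.

```python
import numpy as np
import pandas as pd


class CheckedArrayLike:
    """Hypothetical array-like whose checks live only in __array__."""

    def __init__(self, values):
        self._values = values

    def __array__(self, dtype=None, copy=None):
        # Placeholder for the checks mentioned in the comment (assumed for illustration)
        if len(self._values) == 0:
            raise ValueError("cannot convert an empty object")
        return np.asarray(self._values, dtype=dtype)


def series_from_array_like(obj, **kwargs):
    """Convert via __array__ first, then hand the ndarray to pd.Series."""
    return pd.Series(np.asarray(obj), **kwargs)


s = series_from_array_like(CheckedArrayLike([0, 1]))
print(s)
```

The helper does not change how pd.Series treats the object itself; it only makes the np.asarray step explicit, so no __iter__ method is needed on the class.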
