5

I have an array of dtype=object, where the values are either Python lists, or np.nan.

I'd like to replace the values that are np.nan with [None] (not None).

For a pure Python list, I can already do this with [ x if (x is not np.nan) else [None] for x in s ], and converting the array to a list is fine for my purpose, but out of curiosity, I wonder how this can be done with a numpy array. The difficulty is that, when using indexing, numpy tries to interpret any list as a list of values, rather than as the actual value I want to assign.

If I wanted to replace the values with 2, for example, that would be easy. (Assume the usual np and pd imports. As an aside, np.isnan does not work in this instance, a weakness of choosing float NaN for generic missing values in pandas, so I use pd.isnull instead; this is for an issue with pandas internals anyway.)

In [53]: s
Out[53]:
array([['asdf', 'asdf'], ['asdf'], nan, ['asdf', 'asdf', 'asdf'],
       ['asdf', 'asdf', 'asdf']], dtype=object)

In [55]: s[pd.isnull(s)] = 2

In [56]: s
Out[56]:
array([['asdf', 'asdf'], ['asdf'], 2, ['asdf', 'asdf', 'asdf'],
       ['asdf', 'asdf', 'asdf']], dtype=object)

Yet trying to replace them with [None] instead replaces them with None:

In [58]: s
Out[58]:
array([['asdf', 'asdf'], ['asdf'], nan, ['asdf', 'asdf', 'asdf'],
       ['asdf', 'asdf', 'asdf']], dtype=object)

In [59]: s[pd.isnull(s)] = [None]

In [60]: s
Out[60]:
array([['asdf', 'asdf'], ['asdf'], None, ['asdf', 'asdf', 'asdf'],
       ['asdf', 'asdf', 'asdf']], dtype=object)

This is, obviously, the behavior that one wants 99% of the time. It just so happens that this time, I want to assign the list as an object. Is there any way to do so?

2
  • You could always explicitly wrap the list up as a scalar array of one object that happens to be a list, the same way you wrapped up s itself. But that's horribly ugly; hopefully someone has a better answer… Commented May 24, 2015 at 22:08
  • If all of your elements were lists, you could just mutate the list in place (with [:] = …), but sadly that's not going to help here, because you obviously can't mutate nan in place into [None]. Commented May 24, 2015 at 22:09

2 Answers

4

The first problem is that s[…] = [None] attempts to replace the array slice with the sequence of one value, None. What you actually want is to replace the slice with the sequence of one value, [None], which you'd write as [[None]].

However, that won't actually solve your problem; that just gets you to the problem you were trying to ask in the first place.

What you need to have is explicitly an array of 1 object element that happens to be the list [None]. For example:

>>> n = np.array([[None], 0], dtype=object)[:1]
>>> s[pd.isnull(s)] = n

Or, of course:

>>> n = np.empty((1,), dtype=object)
>>> n[0] = [None]
>>> s[pd.isnull(s)] = n

I'm 90% sure there's a more concise and readable way to create a 1-element array that's guaranteed to have the value [None], and 80% sure there's a simpler way to do the whole thing in the first place, so hopefully someone will come up with a better answer… but if not, this will work.
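Here is the second approach as a self-contained sketch. The helper names object_array and nan_mask are hypothetical, introduced only for illustration. The mask is built with a plain Python check instead of pd.isnull so that the snippet depends on numpy alone; that substitution is an assumption, not part of the original answer.

```python
import numpy as np


def object_array(items):
    """Hypothetical helper: build a 1-D object array one slot at a time,
    so numpy never tries to interpret the lists as extra dimensions."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item
    return out


def nan_mask(arr):
    """Hypothetical stand-in for pd.isnull: True where the slot is a float NaN."""
    return np.array([isinstance(x, float) and x != x for x in arr], dtype=bool)


s = object_array([['asdf', 'asdf'], ['asdf'], np.nan,
                  ['asdf', 'asdf', 'asdf'], np.nan])

# A 1-element object array whose single element is the list [None].
n = np.empty((1,), dtype=object)
n[0] = [None]

# The length-1 array is broadcast across every masked slot.
s[nan_mask(s)] = n

print(s)
```

One thing to watch for: because a single 1-element array is broadcast, the replaced slots should all end up referring to the same list object, so mutating one of them in place would show up in the others. If independent lists are needed, assigning a fresh [None] per index avoids that.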

0

I would suggest using numpy.argmin(), since it returns the position of a nan, and then replacing each one with [None], like this:

import numpy as np
import pandas as pd

def to_none(array_):
    for i in range(array_[pd.isnull(array_)].size):
        array_[np.argmin(array_)] = [None]
    return array_


a = np.array([['asdf', 'asdf'], ['asdf'], np.nan, ['asdf', 'asdf', 'asdf'],np.nan,
       ['asdf', 'asdf', 'asdf']], dtype=object)
a = to_none(a)

print a

>>
[['asdf', 'asdf'] ['asdf'] [None] ['asdf', 'asdf', 'asdf'] [None]
 ['asdf', 'asdf', 'asdf']]

print a.dtype

>>
object
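The code above is written for Python 2 (note the print statements). Under Python 3, ordering comparisons between a list and a float raise TypeError, so an argmin-based search over an array like this is likely to fail there. As a hedged alternative that keeps the same element-by-element idea, the sketch below locates the NaN slots with a boolean mask and assigns by integer index, which stores the list as a single object. The function name replace_nan_with_list is hypothetical, and the plain Python NaN check stands in for pd.isnull as an assumption to keep the snippet dependent on numpy only.

```python
import numpy as np


def replace_nan_with_list(arr):
    """Hypothetical helper: replace each float NaN slot of a 1-D object
    array with its own fresh [None] list, in place."""
    # Plain Python NaN test (NaN is the only float not equal to itself).
    mask = np.array([isinstance(x, float) and x != x for x in arr], dtype=bool)
    for i in np.flatnonzero(mask):
        # Integer indexing assigns the list as one object, not as a sequence.
        arr[int(i)] = [None]
    return arr


# Fill the array slot by slot so the lists are stored as objects.
values = [['asdf', 'asdf'], ['asdf'], np.nan,
          ['asdf', 'asdf', 'asdf'], np.nan, ['asdf', 'asdf', 'asdf']]
a = np.empty(len(values), dtype=object)
for i, v in enumerate(values):
    a[i] = v

a = replace_nan_with_list(a)
print(a)
print(a.dtype)
```

Because a new list is created on each iteration, the replaced slots do not share a single list object.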
