Delete all non-string values in an array of numpy arrays

Question

I have an array of arrays with only str and nan values, like:

import numpy
from numpy import nan

x = numpy.rec.array(
    [('A', 'B', nan, nan),
     ('B', nan, nan, nan),
     ('A', 'B', 'H', 'Z')],
    dtype=[('D1', 'O'), ('D2', 'O'),
           ('D3', 'O'), ('D4', 'O')])

and I'm looking for an efficient way to drop all the nan values and end up with arrays that have a variable number of elements. The nan values are of type float.

type(x[0][3])
out: float

Thank you in advance.

Just to confirm: lists or numpy arrays? x here is a list. Also, you lose a lot of the advantages of numpy if you go for variable-length lists inside them, because numpy has to store them as native objects.
– Paritosh Singh, Commented Jun 4, 2019 at 19:37

hpaulj · Accepted Answer · 2019-06-04 20:00:06Z

You have a recarray of shape (3,) with 4 fields:

In [85]: x = np.array(
    ...:     [('A', 'B', np.nan, np.nan),
    ...:      ('B', np.nan, np.nan, np.nan),
    ...:      ('A', 'B', 'H', 'Z')],
    ...:      dtype=[('D1', 'O'), ('D2', 'O'),
    ...:             ('D3', 'O'), ('D4', 'O')])
In [86]: x
Out[86]:
array([('A', 'B', nan, nan), ('B', nan, nan, nan), ('A', 'B', 'H', 'Z')],
      dtype=[('D1', 'O'), ('D2', 'O'), ('D3', 'O'), ('D4', 'O')])
In [87]: x.shape
Out[87]: (3,)
In [88]: x['D1']
Out[88]: array(['A', 'B', 'A'], dtype=object)
In [89]: x['D3']
Out[89]: array([nan, nan, 'H'], dtype=object)

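You can't make that ragged: every record in a structured array has the same fields.

But you can build a 2d array from it (note that the conversion turns each nan into the string 'nan') and then filter with a list comprehension: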

In [93]: xx = np.array(x.tolist())
In [94]: xx
Out[94]:
array([['A', 'B', 'nan', 'nan'],
       ['B', 'nan', 'nan', 'nan'],
       ['A', 'B', 'H', 'Z']], dtype='<U3')
In [95]: [[i for i in row if i!='nan'] for row in xx]
Out[95]: [['A', 'B'], ['B'], ['A', 'B', 'H', 'Z']]
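
One caveat with the 2d conversion: since every cell is cast to a string, a genuine string 'nan' in the data would be filtered out together with the missing values. A minimal sketch of a variant, assuming the same x as above, that passes dtype=object so the nan entries stay floats and can be dropped by type instead of by string comparison:

```python
import numpy as np

x = np.array(
    [('A', 'B', np.nan, np.nan),
     ('B', np.nan, np.nan, np.nan),
     ('A', 'B', 'H', 'Z')],
    dtype=[('D1', 'O'), ('D2', 'O'), ('D3', 'O'), ('D4', 'O')])

# dtype=object keeps each cell as the original Python object (str or float)
xx = np.array(x.tolist(), dtype=object)   # 2d object array, shape (3, 4)

# keep only the cells that are real strings
rows = [[i for i in row if isinstance(i, str)] for row in xx]
print(rows)   # [['A', 'B'], ['B'], ['A', 'B', 'H', 'Z']]
```

The result is still a plain list of lists, because the rows end up with different lengths.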

We could also do the comprehension on elements of the structured array:

In [101]: [[i for i in row if i is not np.nan] for row in x]
Out[101]: [['A', 'B'], ['B'], ['A', 'B', 'H', 'Z']]

An element of x is tuple-like. Technically it is an np.void (a compound dtype record), but it iterates like a tuple.
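
The `is not np.nan` test relies on object identity, so it only works when every missing value is the np.nan object itself. A nan produced elsewhere (for example float('nan')) is a different object and would slip through. The sketch below filters by type instead. The helper name drop_non_strings is hypothetical, and packing the result into a 1d object array is optional, shown only in case an ndarray container is wanted:

```python
import numpy as np

def drop_non_strings(records):
    """Hypothetical helper: keep only the str values of each record."""
    return [[v for v in rec if isinstance(v, str)] for rec in records]

x = np.array(
    [('A', 'B', float('nan'), np.nan),   # float('nan') is not the np.nan object
     ('B', np.nan, np.nan, np.nan),
     ('A', 'B', 'H', 'Z')],
    dtype=[('D1', 'O'), ('D2', 'O'), ('D3', 'O'), ('D4', 'O')])

cleaned = drop_non_strings(x)
print(cleaned)   # [['A', 'B'], ['B'], ['A', 'B', 'H', 'Z']]

# Optional: a 1d object array whose elements are the variable-length lists.
ragged = np.empty(len(cleaned), dtype=object)
for k, row in enumerate(cleaned):
    ragged[k] = row
print(ragged.shape)   # (3,)
```

As the comment on the question points out, such an object array just holds references to Python lists, so it gains little over the list of lists itself.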
