Why does np.array convert both a list and its elements to an array?

Question

I'm new to Python and there is something that I don't understand in this code:

import numpy as np

a_list = []
sub_list = ["apple", "banana", "cherry"]

a_list.append(sub_list)

print(type(a_list))
print(type(a_list[0]))
print(type(sub_list))

array = np.array(a_list)

print(type(array))
print(type(array[0]))
print(type(sub_list))
print(array[0])

When I run it, I get this output:

<class 'list'>
<class 'list'>
<class 'list'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'list'>
['apple' 'banana' 'cherry']

Why is type(array[0]) also numpy.ndarray? Shouldn't it be a list?

An array of lists is pretty much pointless in numpy. It makes sense that a nested structure will simply become a nested array. Why do you want to keep them as lists?
– roganjosh, Commented Dec 18, 2019 at 18:07
@roganjosh I don't want to keep it as a list. I thought that np.array would convert only the variable to a numpy array, not the variable and its contents.
– VansFannel, Commented Dec 18, 2019 at 18:10
I'd wager that this is purely for convenience, because I can't think of a case where I wouldn't want this behaviour. That said, you should be more concerned about the dtype that it gets converted to, because that can have a significant impact on how the resulting array behaves (e.g. object dtype is not "good news": numpy operations on it are much less efficient).
– roganjosh, Commented Dec 18, 2019 at 18:13

hpaulj · Accepted Answer · 2019-12-18 22:47:57Z

In [36]: import numpy as np
    ...: a_list = []
    ...: sub_list = ["apple", "banana", "cherry"]
    ...:
    ...: a_list.append(sub_list)
In [37]: arr = np.array(a_list)
In [38]: a_list
Out[38]: [['apple', 'banana', 'cherry']]
In [39]: arr
Out[39]: array([['apple', 'banana', 'cherry']], dtype='<U6')
In [40]: arr[0]
Out[40]: array(['apple', 'banana', 'cherry'], dtype='<U6')
In [41]: arr.shape
Out[41]: (1, 3)

np.array tries to make a multidimensional array from its inputs. a_list is a nested list, from which it can make a 2d array. arr[0] is a 1d array, selected from arr.

arr is not an array of lists. It's an array of string elements.
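As a rough illustration of that point, the sketch below inspects the array at each level of indexing. The variable names (first_row, element) are only for this example. Indexing once gives a 1d array, and indexing down to a single cell gives a NumPy string scalar, never a Python list.

```python
import numpy as np

a_list = [["apple", "banana", "cherry"]]
arr = np.array(a_list)

# One index selects a row: still an ndarray, one dimension lower
first_row = arr[0]

# Two indices select a single cell: a NumPy string scalar
element = arr[0, 1]

print(arr.shape, arr.ndim)            # 2d array with one row
print(type(first_row), first_row.shape)
print(type(element), element)
print(arr.dtype)                      # fixed-width unicode string dtype
```

The dtype shown in the session above ('<U6') reflects that every cell holds a unicode string of at most 6 characters, which is the length of the longest entry.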

It is possible to make an array that contains lists:

In [42]: arr1 = np.empty(1, object)                                             
In [43]: arr1                                                                   
Out[43]: array([None], dtype=object)
In [44]: arr1[0]=sub_list                                                       
In [45]: arr1                                                                   
Out[45]: array([list(['apple', 'banana', 'cherry'])], dtype=object)
In [46]: arr1[0]                                                                
Out[46]: ['apple', 'banana', 'cherry']

but for most purposes this is little better than a list, a_list, and in some ways worse (you can't for example .append to it).
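A minimal sketch of why the np.empty route is used here: as far as I know, passing dtype=object to np.array still lets it descend into the nested list, so you get a 2d object array rather than an array holding one list. The names as_2d and holder are just for this illustration.

```python
import numpy as np

sub_list = ["apple", "banana", "cherry"]

# dtype=object alone does not keep the inner list intact:
# np.array still builds a (1, 3) array, now of Python str objects
as_2d = np.array([sub_list], dtype=object)

# Preallocating an object array and assigning into it stores the list itself
holder = np.empty(1, dtype=object)
holder[0] = sub_list

print(as_2d.shape)                 # two dimensions
print(holder.shape)                # one dimension, one slot
print(type(holder[0]))             # the original Python list
print(hasattr(holder, "append"))   # ndarray has no append method
```

Note that holder[0] is the same list object as sub_list, so mutating one is visible through the other.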

A classic case of making a 2d array from nested lists:

In [47]: np.array([[1,2,3],[4,5,6]])                                            
Out[47]: 
array([[1, 2, 3],
       [4, 5, 6]])
In [48]: _.shape                                                                
Out[48]: (2, 3)

Math operations on this pure numeric array are considerably faster than if it is an object dtype array containing lists. Python already has nestable lists.
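Speed is not the only difference. The sketch below (variable names are illustrative) shows that the same operator also means something different on the two kinds of array: on a numeric array, * is element-wise arithmetic, while on an object array of lists NumPy hands the operation to each list, which repeats it.

```python
import numpy as np

# Pure numeric 2d array: operations are vectorized arithmetic
numeric = np.array([[1, 2, 3], [4, 5, 6]])
doubled = numeric * 2
column_sums = numeric.sum(axis=0)

# Object array holding two Python lists
obj = np.empty(2, dtype=object)
obj[0] = [1, 2, 3]
obj[1] = [4, 5, 6]

# Delegates to list.__mul__, i.e. list repetition, not arithmetic
repeated = obj * 2

print(doubled)
print(column_sums)
print(repeated)
```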
