I'm new to Python, and there is something I don't understand in this code:

import numpy as np

a_list = []
sub_list = ["apple", "banana", "cherry"]

a_list.append(sub_list)

print(type(a_list))
print(type(a_list[0]))
print(type(sub_list))

array = np.array(a_list)

print(type(array))
print(type(array[0]))
print(type(sub_list))
print(array[0])

When I run it, I get this output:

<class 'list'>
<class 'list'>
<class 'list'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
<class 'list'>
['apple' 'banana' 'cherry']

Why is type(array[0]) also numpy.ndarray? Shouldn't it be a list?

  • An array of lists is pretty much pointless in numpy. It makes sense that a nested structure will simply become a nested array. Why do you want to keep them as lists? Commented Dec 18, 2019 at 18:07
  • @roganjosh I don't want to keep it as a list. I thought that np.array would convert only the variable to a numpy array, not the variable and its contents. Commented Dec 18, 2019 at 18:10
  • I'd wager that this is purely for convenience, because I can't think of a case where I wouldn't want this behaviour. That said, you should be more concerned about the dtype that it gets converted to, because that can significantly affect how the resulting array behaves (e.g. an object dtype is not "good news" and will be inefficient if you try to apply numpy operations). Commented Dec 18, 2019 at 18:13

1 Answer

In [36]: import numpy as np
    ...: a_list = [] 
    ...: sub_list = ["apple", "banana", "cherry"] 
    ...:  
    ...: a_list.append(sub_list)                                                
In [37]: arr = np.array(a_list)                                                 
In [38]: a_list                                                                 
Out[38]: [['apple', 'banana', 'cherry']]
In [39]: arr                                                                    
Out[39]: array([['apple', 'banana', 'cherry']], dtype='<U6')
In [40]: arr[0]                                                                 
Out[40]: array(['apple', 'banana', 'cherry'], dtype='<U6')
In [41]: arr.shape                                                              
Out[41]: (1, 3)

np.array tries to make a multidimensional array from its inputs. a_list is a nested list, from which it can make a 2d array. arr[0] is a 1d array, selected from arr.

arr is not an array of lists. It's an array of string elements.
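
One way to check this is to inspect the array's attributes directly. The following is a minimal sketch that reuses the names from the question; `row` and `element` are names introduced here only for illustration.

```python
import numpy as np

sub_list = ["apple", "banana", "cherry"]
a_list = [sub_list]

arr = np.array(a_list)

# The nested list became one 2d array, not an array holding a list.
print(arr.ndim)    # number of dimensions
print(arr.shape)   # (rows, columns)
print(arr.dtype)   # fixed-width unicode string dtype

# Indexing the first axis yields a 1d array (one row of arr).
row = arr[0]
print(type(row), row.shape)

# Indexing both axes reaches a single string element.
element = arr[0, 0]
print(type(element), element)

# The original Python list is untouched by the conversion.
print(type(sub_list))
```

So `type(array[0])` reports `numpy.ndarray` because indexing a 2d array along its first axis selects a row, and that row is itself an array.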

It is possible to make an array that contains lists:

In [42]: arr1 = np.empty(1, object)                                             
In [43]: arr1                                                                   
Out[43]: array([None], dtype=object)
In [44]: arr1[0]=sub_list                                                       
In [45]: arr1                                                                   
Out[45]: array([list(['apple', 'banana', 'cherry'])], dtype=object)
In [46]: arr1[0]                                                                
Out[46]: ['apple', 'banana', 'cherry']

but for most purposes this is little better than the plain list a_list, and in some ways worse (for example, you can't .append to it).
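
The point about .append can be illustrated with a short sketch. It assumes the same `sub_list` as above; `arr2` is a name made up for this example.

```python
import numpy as np

sub_list = ["apple", "banana", "cherry"]

# Object-dtype array with one slot that holds a reference to the list.
arr1 = np.empty(1, dtype=object)
arr1[0] = sub_list

# ndarray has no in-place append method, unlike list.
print(hasattr(arr1, "append"))   # False
print(hasattr([], "append"))     # True

# The stored element is the very same list object, so it can still grow.
arr1[0].append("date")
print(arr1[0] is sub_list)
print(sub_list)

# np.append returns a new, larger array and leaves arr1 unchanged.
arr2 = np.append(arr1, None)
print(arr1.shape, arr2.shape)
```

Growing the array itself means building a new array each time, whereas a list grows in place.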

A classic case of making a 2d array from nested lists:

In [47]: np.array([[1,2,3],[4,5,6]])                                            
Out[47]: 
array([[1, 2, 3],
       [4, 5, 6]])
In [48]: _.shape                                                                
Out[48]: (2, 3)

Math operations on this purely numeric array are considerably faster than on an object dtype array containing lists. If you want nested lists, Python already has them.
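
Beyond speed, the two kinds of array also behave differently under arithmetic. The sketch below does not measure timing; it only shows, with the hypothetical names `num` and `obj`, what `*` does in each case.

```python
import numpy as np

# Numeric 2d array: arithmetic is applied element by element.
num = np.array([[1, 2, 3], [4, 5, 6]])
print(num.dtype.kind)   # 'i' (integer)
print(num * 2)
print(num.sum())

# Object array whose two elements are Python lists.
obj = np.empty(2, dtype=object)
obj[0] = [1, 2, 3]
obj[1] = [4, 5, 6]

# NumPy falls back to each list's own `*`, which repeats the list
# instead of doubling its numbers.
print(obj * 2)
```

With the object array, every operation is delegated to the Python objects one at a time, so the result follows list semantics rather than numeric ones.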
