Keep original data types when converting a list of lists into a NumPy array

Question

How do I keep the original data types when converting a list of lists into a NumPy array?

I used np.array and np.matrix to convert a list of lists into a NumPy array, but every int turns into a string. Python version is 3.7.x.

import numpy as np

X = [[3, 'aa', 10],
     [1, 'bb', 22],
     [2, 'cc', 28],
     [5, 'bb', 32],
     [4, 'cc', 32]]
# X is a list of lists
X = np.array(X)
return X

# X becomes
[['3' 'aa' '10']
 ['1' 'bb' '22']
 ['2' 'cc' '28']
 ['5' 'bb' '32']
 ['4' 'cc' '32']]

What are you going to do with this array?

– hpaulj, Commented Mar 25, 2019 at 3:52

Michael Butscher · Accepted Answer · 2019-03-25 03:04:04Z

4

Use X = np.array(X, dtype="O") instead. Every item is then stored as a Python object.
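To make the difference concrete, here is a minimal sketch that builds the array both ways and inspects the element types. It uses a shortened copy of the list from the question; the variable names default and as_objects are only for illustration.

```python
import numpy as np

X = [[3, 'aa', 10],
     [1, 'bb', 22],
     [2, 'cc', 28]]

# Without a dtype, NumPy picks one common type for all cells: a Unicode string type.
default = np.array(X)

# With dtype="O", each cell keeps a reference to the original Python object.
as_objects = np.array(X, dtype="O")

print(default.dtype.kind)        # 'U' means Unicode string
print(as_objects.dtype)          # object
print(type(as_objects[0, 0]))    # <class 'int'>
print(type(as_objects[0, 1]))    # <class 'str'>
```

The ints survive as Python int objects, so the array behaves much like a 2D view over the original list.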


U13-Forward · Accepted Answer · 2019-03-25 03:07:45Z

3

You can use any of these:

X = np.array(X,dtype='object')
X = np.array(X,dtype=object)
X = np.array(X, dtype='O')

They all work, so the whole code is:

import numpy as np

X = [[3, 'aa', 10],
     [1, 'bb', 22],
     [2, 'cc', 28],
     [5, 'bb', 32],
     [4, 'cc', 32]]
# X is a list of lists
X = np.array(X, dtype=object)  # or either of the other two spellings above
print(X)

P.S. return only works inside a function; outside a function, use print.


hpaulj · Accepted Answer · 2019-03-25 04:39:05Z

Another option is to make a structured array, with a mix of integer and string fields.

In [252]: import numpy.lib.recfunctions as rf 

In [258]: X = [[3, 'aa', 10],                  
     ...:      [1, 'bb', 22],                       
     ...:      [2, 'cc', 28],                       
     ...:      [5, 'bb', 32],                       
     ...:      [4, 'cc', 32]]                                                   
In [259]: dt = np.dtype('i,U10,i')                                              
In [260]: dt                                                                    
Out[260]: dtype([('f0', '<i4'), ('f1', '<U10'), ('f2', '<i4')])

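Recent NumPy (1.16 and later) has a function, unstructured_to_structured in numpy.lib.recfunctions, that converts unstructured arrays (e.g. the string-dtype array that np.array(X) produces) to structured ones: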

In [261]: Y = rf.unstructured_to_structured(np.array(X), dt)                    
In [262]: Y                                                                     
Out[262]: 
array([(3, 'aa', 10), (1, 'bb', 22), (2, 'cc', 28), (5, 'bb', 32),
       (4, 'cc', 32)],
      dtype=[('f0', '<i4'), ('f1', '<U10'), ('f2', '<i4')])

Fields are accessed by name:

In [264]: Y['f0']                                                               
Out[264]: array([3, 1, 2, 5, 4], dtype=int32)
In [265]: Y['f1']                                                               
Out[265]: array(['aa', 'bb', 'cc', 'bb', 'cc'], dtype='<U10')

Converting X to a list of tuples will work just as well:

In [266]: np.array([tuple(row) for row in X], dtype=dt)                         
Out[266]: 
array([(3, 'aa', 10), (1, 'bb', 22), (2, 'cc', 28), (5, 'bb', 32),
       (4, 'cc', 32)],
      dtype=[('f0', '<i4'), ('f1', '<U10'), ('f2', '<i4')])
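As a possible follow-up, the fields can be given descriptive names instead of the defaults f0, f1, f2. The sketch below assumes hypothetical names id, label and value (they are not from the question) and shows sorting and filtering by field.

```python
import numpy as np

X = [[3, 'aa', 10],
     [1, 'bb', 22],
     [2, 'cc', 28],
     [5, 'bb', 32],
     [4, 'cc', 32]]

# Hypothetical field names, chosen only for illustration.
dt = np.dtype([('id', 'i4'), ('label', 'U10'), ('value', 'i4')])

# Each row must be a tuple for NumPy to treat it as one structured record.
Y = np.array([tuple(row) for row in X], dtype=dt)

by_id = np.sort(Y, order='id')       # sort records by the integer 'id' field
bb_rows = Y[Y['label'] == 'bb']      # boolean mask built from the string field

print(by_id['id'])
print(bb_rows['value'])
```

Note that Y is one-dimensional here: it has five records with three fields each, not a 5x3 grid, so column access is by field name rather than by Y[:, 0].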

The object array and the structured array each have their advantages and disadvantages, so which is better depends on what you intend to do with the array. For that matter, the original list may be just as good for many purposes. None of them matches the speed of a 2D numeric array for math operations.
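One way to see the trade-off is to pull the last column out of each kind of array and check its dtype. This is only a sketch on a shortened copy of the data; the names obj, rec, col_obj and col_rec are made up for the example.

```python
import numpy as np

X = [[3, 'aa', 10],
     [1, 'bb', 22],
     [2, 'cc', 28]]

obj = np.array(X, dtype=object)                           # 2D array of Python objects
rec = np.array([tuple(r) for r in X], dtype='i,U10,i')    # 1D structured array

col_obj = obj[:, 2]     # a column slice, still dtype=object
col_rec = rec['f2']     # a field, stored as native integers

print(col_obj.dtype, col_rec.dtype)
print(col_obj.sum(), col_rec.sum())   # both add up to 60

# For heavier math on an object column, convert it to a numeric dtype first.
numeric = col_obj.astype(np.int64)
print(numeric.dtype)
```

The object array keeps ordinary 2D indexing, but its math goes through Python objects element by element. The structured array gives native integer fields, at the cost of indexing by field name.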
