I assume I'm asking a newbie question but have spent too much time today searching for an answer. I get an IndexError: too many indices for array error when naively attempting to perform the same slice operation on a numpy array after saving and reloading with np.genfromtxt.

Note: I see that the dimension has changed from (3,6) to (3,) upon reloading but was unable to convert the result back to dimensions (3,6)- this is the part I assume must be obvious (or maybe I need to specify type differently)

yo = np.arange(18)
yo = yo.reshape(3,6)

print(yo)
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]]

print(yo[:,:2])
[[ 0  1]
 [ 6  7]
 [12 13]]

np.savetxt("test_data.csv", yo, delimiter=",",  fmt='%1.4e')
yo_reloaded = np.genfromtxt("test_data.csv", dtype=(float, float, float, float, float, float), delimiter = ",")

#same as above but doesn't work
print(yo_reloaded[:,:2])
IndexError: too many indices for array

print(yo_reloaded)
[(  0.,   1.,   2.,   3.,   4.,   5.) (  6.,   7.,   8.,   9.,  10.,  11.)
 ( 12.,  13.,  14.,  15.,  16.,  17.)]

# shape changed
print(yo_reloaded.shape)
(3,)
  • Omit the dtype for genfromtxt. float is the default. Giving multiple dtypes tells it to load it as a structured array. Look at the dtype of the reload. Commented Apr 1, 2018 at 20:20
  • It works- thank you @hpaulj Commented Apr 1, 2018 at 20:28

2 Answers

Use dtype=None to tell genfromtxt to attempt to intelligently guess the dtype. In this case, since all values are floats, genfromtxt will assign a floating-point dtype to the array:

In [19]: yo_reloaded = np.genfromtxt("test_data.csv", dtype=None, delimiter = ",")
In [21]: yo_reloaded.dtype
Out[21]: dtype('float64')

and yo_reloaded will have shape (3,6).

In contrast, if you set dtype=(float, float, float, float, float, float):

yo_reloaded = np.genfromtxt("test_data.csv", dtype=(float, float, float, float, float, float), delimiter = ",")

then yo_reloaded.dtype becomes:

In [18]: yo_reloaded.dtype
Out[18]: dtype([('f0', '<f8'), ('f1', '<f8'), ('f2', '<f8'), ('f3', '<f8'), ('f4', '<f8'), ('f5', '<f8')])

which is the dtype of a structured array. The shape of the structured array is (3,) because NumPy views this array as consisting of 3 elements (one per row of the file), each of which is a single record with 6 floating-point fields. The fields are not a second axis, so yo_reloaded[:,:2] raises an IndexError. That's simply not what you want, but it is what you get when you set dtype to a tuple of types.
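If you already have a structured array like this and want a plain (3,6) array back, here is a minimal sketch of two ways to convert it. The array is built by hand here to stand in for the genfromtxt result, the variable names are only illustrative, and structured_to_unstructured assumes NumPy 1.16 or later:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Stand-in for the structured result: 3 records, each with 6 float fields f0..f5.
dt = np.dtype([(f"f{i}", float) for i in range(6)])
rows = np.arange(18.0).reshape(3, 6)
structured = np.array([tuple(row) for row in rows], dtype=dt)

print(structured.shape)        # one axis only: the fields are not a dimension

# A single field is selected by name, not by column index.
first_column = structured["f0"]

# Option 1: go through a list of tuples and let NumPy build a 2-D array.
as_2d = np.array(structured.tolist())

# Option 2: helper from numpy.lib.recfunctions (assumes NumPy >= 1.16).
as_2d_rfn = rfn.structured_to_unstructured(structured)

print(as_2d.shape)
print(as_2d[:, :2])            # the original 2-D slice works again
```

Both options copy the data into a regular float array. If you control the loading step, fixing the dtype argument is simpler than converting afterwards.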

Note you could also obtain the desired array using dtype=float:

In [24]: yo_reloaded = np.genfromtxt("test_data.csv", dtype=float, delimiter = ",")
In [25]: yo_reloaded.shape
Out[25]: (3, 6)
In [26]: yo_reloaded.dtype
Out[26]: dtype('float64')

Or, as hpaulj points out, you could simply omit the dtype parameter altogether, in which case it defaults to dtype=float.
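Putting it together, here is a small self-contained sketch of the round trip from the question, comparing the default dtype with the tuple dtype. It writes to a temporary directory rather than the working directory; that choice, and the names plain and structured, are assumptions made for the example:

```python
import os
import tempfile

import numpy as np

yo = np.arange(18).reshape(3, 6)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test_data.csv")
    np.savetxt(path, yo, delimiter=",", fmt="%1.4e")

    # No dtype given: genfromtxt defaults to float and returns a 2-D array.
    plain = np.genfromtxt(path, delimiter=",")

    # Tuple of types: genfromtxt returns a 1-D structured array instead.
    structured = np.genfromtxt(
        path,
        dtype=(float, float, float, float, float, float),
        delimiter=",",
    )

print(plain.shape)        # 2-D, same shape as yo
print(structured.shape)   # 1-D array of records
print(plain[:, :2])       # the slice from the question works on the plain array
```

Note that the reloaded values are floats, not the integers that were saved, because fmt='%1.4e' writes them in scientific notation.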



If you run print(yo_reloaded.shape) before print(yo_reloaded[:,:2]), you can see that your np.genfromtxt() call returns an array of shape (3,). That is a one-dimensional array with 3 elements, not an array with 3 rows and 6 columns.

When you use dtype=(float, float, float, float, float, float), you are mapping every row in "test_data.csv" to a 6-tuple, so np.genfromtxt() returns each row as a single element with 6 fields.

To get the same result as before saving, change dtype=(float, float, float, float, float, float) to dtype=float.
