I have a problem reading the first column of a csv file with numpy. All the values in the first column are returned as nan instead of values such as [ 2. 4. 1120.].

import genfromtxt from numpy 
my_data = genfromtxt('input.csv', delimiter=',')
first_column = len(my_data[:,0]) - 1 

Inside the csv file:

[   2.    4. 1120.],67.8,63.7,-676.1,-365.2,0.0,0.0,0.0,0.0,0.0,-608.3000000000001,-301.5
[  2.    4.5 100. ],0.0,0.0,-0.30000000000000004,-0.7,0.0,0.0,99.7002,0.0,0.0,-0.30000000000000004,-0.7
[   2.    4. 1130.],70.8,52.2,-672.7,-346.5,0.0,0.0,0.0,0.0,0.0,-601.9000000000001,-294.3
[  2.    4.5 110. ],0.0,0.2,-0.7,-0.1,0.0,0.0,99.3010995,0.0,0.0,-0.7,0.1

1 Answer

First, your import statement is inverted. It should be:

from numpy import genfromtxt

Second, genfromtxt() cannot convert a string such as '[ 2. 4. 1120.]' to float the way it does with the other values in the file, so it returns nan for that column. numpy.loadtxt() cannot parse it either; it raises a ValueError instead.
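
To make this concrete, here is a minimal sketch that reproduces the behaviour without a file on disk. The rows are shortened versions of the ones in the question, and the names csv_text and loadtxt_failed are only illustrative. It assumes a reasonably recent NumPy; the exact error text from loadtxt may differ between versions.

```python
import io

import numpy as np

# Two shortened rows in the same format as the CSV in the question.
csv_text = (
    "[   2.    4. 1120.],67.8,63.7,-676.1\n"
    "[  2.    4.5 100. ],0.0,0.0,-0.3\n"
)

# genfromtxt splits on commas and tries to convert every field to float.
# The bracketed first field is not a valid float, so it becomes nan.
data = np.genfromtxt(io.StringIO(csv_text), delimiter=',')
print(data[:, 0])  # the bracketed column
print(data[:, 1])  # an ordinary numeric column

# loadtxt is stricter: it raises instead of silently inserting nan.
try:
    np.loadtxt(io.StringIO(csv_text), delimiter=',')
    loadtxt_failed = False
except ValueError as exc:
    loadtxt_failed = True
    print("loadtxt raised:", exc)
```

The remaining columns are parsed normally, so only the bracketed column needs special handling.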

One way to avoid "losing" those values is to read the csv file with pandas:

import numpy as np
import pandas as pd

my_data = pd.read_csv('data.csv').to_numpy()

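my_data then contains the following. Note that read_csv treats the first line of the file as a header by default, so the first data row ([ 2. 4. 1120.], ...) does not appear here; pass header=None to keep it: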

array([['[  2.    4.5 100. ]', 0.0, 0.0, -0.30000000000000004, -0.7, 0.0,
        0.0, 99.7002, 0.0, 0.0, -0.30000000000000004, -0.7],
       ['[   2.    4. 1130.]', 70.8, 52.2, -672.7, -346.5, 0.0, 0.0, 0.0,
        0.0, 0.0, -601.9000000000002, -294.3],
       ['[  2.    4.5 110. ]', 0.0, 0.2, -0.7, -0.1, 0.0, 0.0,
        99.3010995, 0.0, 0.0, -0.7, 0.1]], dtype=object)

You still need to parse every value in the first column to convert it to a numpy array. You can use np.fromstring for that, but you have to strip the bracket characters first for it to work as expected.

If you leave the brackets in, you get a warning:

np.fromstring(my_data[:, 0], sep=' ')
<ipython-input-65-7d75c8d121f5>:1: DeprecationWarning: string or file could not be read to its end due to unmatched data; this will raise a ValueError in the future.
  np.fromstring(my_data[:, 0], sep=' ')

Unfortunately, to strip the brackets you need to loop over the array:

for i, row in enumerate(my_data[:, 0]):
    # row is a string such as '[  2.    4.5 100. ]'; [1:-1] drops the brackets
    my_data[i, 0] = np.fromstring(row[1:-1], sep=' ').astype(np.float32)

Indexing with [1:-1] "removes" the bracket characters before the string is passed to np.fromstring.
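
As a small illustration of that step on a single value, the sketch below wraps it in a helper. The name parse_triple is hypothetical, and it uses str.split plus np.array instead of np.fromstring, which is a swapped-in alternative rather than the approach shown above.

```python
import numpy as np


def parse_triple(text):
    """Turn a string like '[  2.    4.5 100. ]' into a float32 array.

    Hypothetical helper: assumes the string starts with '[' and ends with ']'.
    """
    inner = text[1:-1]      # drop the leading '[' and the trailing ']'
    fields = inner.split()  # split on any run of whitespace
    return np.array(fields, dtype=np.float32)


first = parse_triple('[   2.    4. 1120.]')
second = parse_triple('[  2.    4.5 100. ]')
print(first, second)
```

Splitting on whitespace also copes with the uneven padding inside the brackets, so no extra cleanup should be needed.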

After that, my_data will contain numpy arrays in the first column:

array([[array([  2. ,   4.5, 100. ], dtype=float32), 0.0, 0.0,
        -0.30000000000000004, -0.7, 0.0, 0.0, 99.7002, 0.0, 0.0,
        -0.30000000000000004, -0.7],
       [array([   2.,    4., 1130.], dtype=float32), 70.8, 52.2, -672.7,
        -346.5, 0.0, 0.0, 0.0, 0.0, 0.0, -601.9000000000002, -294.3],
       [array([  2. ,   4.5, 110. ], dtype=float32), 0.0, 0.2, -0.7,
        -0.1, 0.0, 0.0, 99.3010995, 0.0, 0.0, -0.7, 0.1]], dtype=object)

So the first column would have:

print(my_data[:, 0])
array([array([  2. ,   4.5, 100. ], dtype=float32),
       array([   2.,    4., 1130.], dtype=float32),
       array([  2. ,   4.5, 110. ], dtype=float32)], dtype=object)

Although it is an elaborate solution, it works. There may be a better or simpler way that does the conversion without looping over the array.
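
One possible loop-free variant, offered as a sketch rather than a definitive answer, is to let the pandas string methods do the parsing. It assumes every bracketed value holds the same number of numbers, reads from an in-memory string (csv_text, with shortened rows) instead of the real file, and produces a separate 2-D float array named triples instead of storing arrays inside an object column.

```python
import io

import pandas as pd

# Shortened rows in the same format as the CSV in the question.
csv_text = (
    "[   2.    4. 1120.],67.8,63.7\n"
    "[  2.    4.5 100. ],0.0,0.0\n"
    "[   2.    4. 1130.],70.8,52.2\n"
    "[  2.    4.5 110. ],0.0,0.2\n"
)

# header=None keeps the first line as data instead of using it as column names.
df = pd.read_csv(io.StringIO(csv_text), header=None)

triples = (
    df[0]
    .str.strip("[]")         # remove the surrounding brackets
    .str.split(expand=True)  # split on whitespace into one column per number
    .astype(float)           # '4.5' -> 4.5, '1120.' -> 1120.0
    .to_numpy()              # plain 2-D float array, one row per CSV line
)
print(triples.shape)
print(triples[0])
```

The other columns are still available as ordinary floats through df.iloc[:, 1:].to_numpy().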

2 Comments

@tonyselcuk, Please update your question including the code you tried to implement and the error message. Otherwise no one can know what happened except you.
I did a different function thanks anyways
