
Code:

import csv
import numpy
raw_data = open('C:\\Users\\train.csv', 'rt')
data = numpy.loadtxt(raw_data, delimiter=",")
print(data.shape)

Below is the sample data used

Time    Freq
8:00    91.1
8:03    91.1
8:06    91.1
8:09    91.1
8:12    91.1
8:15    91.1
8:18    91.1
8:21    91.1
8:24    91.1
8:27    91.1
8:30    91.1

Error:
ValueError: could not convert string to float: b'Time'
  • What is the question? The error/exception is pretty unambiguous. Does numpy.loadtxt have an optional parameter telling it to skip a header line? It isn't clear from your sample data that the first two words are on their own line. Please copy and paste the sample data and format it as code (select it and press ctrl-k). Commented Apr 26, 2018 at 20:11
  • By default, loadtxt loads the data as floats and raises an error when it can't. genfromtxt puts nan where it can't create the float. What do you want the result to look like? Commented Apr 26, 2018 at 20:22

2 Answers

In [350]: txt ='''Time    Freq
     ...: 8:00    91.1
     ...: 8:03    91.1
     ...: 8:06    91.1
     ...: 8:09    91.1
     ...: 8:12    91.1
     ...: 8:15    91.1
     ...: 8:18    91.1
     ...: 8:21    91.1
     ...: 8:24    91.1
     ...: 8:27    91.1
     ...: 8:30    91.1
     ...: '''

Loading as a structured array, using the first line as field names.

In [351]: data = np.genfromtxt(txt.splitlines(), names=True, dtype=None, encoding=None)
In [352]: data
Out[352]: 
array([('8:00', 91.1), ('8:03', 91.1), ('8:06', 91.1), ('8:09', 91.1),
       ('8:12', 91.1), ('8:15', 91.1), ('8:18', 91.1), ('8:21', 91.1),
       ('8:24', 91.1), ('8:27', 91.1), ('8:30', 91.1)],
      dtype=[('Time', '<U4'), ('Freq', '<f8')])
In [353]: data['Freq']
Out[353]: array([91.1, 91.1, 91.1, 91.1, 91.1, 91.1, 91.1, 91.1, 91.1, 91.1, 91.1])

Note that the 2nd column has been loaded as numbers, but the first as strings.
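
Since the Time field comes back as strings, it has to be converted before it can be used numerically. A minimal sketch, assuming the times are always in H:MM form and that "minutes since midnight" is an acceptable numeric representation (that choice is an illustration, not something the question specifies):

```python
import numpy as np

# Shortened copy of the sample data from the question.
txt = """Time    Freq
8:00    91.1
8:03    91.1
8:06    91.1
"""

# Same call as above: first line becomes the field names, dtypes are inferred.
data = np.genfromtxt(txt.splitlines(), names=True, dtype=None, encoding=None)

# Assumption: every time is H:MM. Convert each string to minutes since midnight.
minutes = np.array([int(h) * 60 + int(m)
                    for h, m in (t.split(":") for t in data["Time"])])

freq = data["Freq"]   # already a float array
print(minutes)
print(freq.mean())
```

The two resulting arrays are plain numeric arrays, so they can be stacked with np.column_stack if a single 2-D array is needed.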


By default numpy.loadtxt expects everything in the file to be a number. Time is not a number. 8:00 is not a number either. If you want to perform numerical operations on your data, you're going to need to skip the Time Freq header and convert your times to numbers.
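
One way to do both steps inside loadtxt itself is to combine its skiprows and converters parameters. This is only a sketch: the helper to_minutes is hypothetical, the data is assumed to be whitespace-separated as it appears in the question, and the times are assumed to be in H:MM form.

```python
import io
import numpy as np

# Shortened, whitespace-separated copy of the sample data (assumed layout).
sample = io.StringIO("""Time    Freq
8:00    91.1
8:03    91.1
8:06    91.1
""")

def to_minutes(value):
    """Hypothetical helper: turn 'H:MM' into minutes since midnight."""
    # Depending on the NumPy version, converters may receive bytes or str.
    if isinstance(value, bytes):
        value = value.decode()
    hours, minutes = value.split(":")
    return int(hours) * 60 + int(minutes)

# skiprows=1 drops the header line; converters handles column 0.
data = np.loadtxt(sample, skiprows=1, converters={0: to_minutes})
print(data.shape)
print(data[0])
```

If the real file is comma-separated, delimiter="," would need to be passed as well.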

If you don't need to do any type of numerical analysis, you can import the data as strings: numpy.loadtxt(raw_data, delimiter=",", dtype='str'). The header row is then loaded as strings along with everything else. See the docs for more info.
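
A small sketch of the string-loading route, assuming a comma-separated file to match the delimiter="," in the call above (the sample as pasted in the question looks whitespace-separated, in which case the delimiter argument would be dropped):

```python
import io
import numpy as np

# Assumed comma-separated version of the sample data.
sample = io.StringIO("Time,Freq\n8:00,91.1\n8:03,91.1\n")

# Everything, including the header row, is read as text.
data = np.loadtxt(sample, delimiter=",", dtype=str)
print(data.shape)

header = data[0]                  # the 'Time' and 'Freq' labels
freq = data[1:, 1].astype(float)  # individual columns can still be converted later
print(header, freq)
```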


Alternatively, you can use genfromtxt.
