0

I am new to Python and am using NumPy to read a CSV file into an array. I tried two methods:

Approach 1

train = np.asarray(np.genfromtxt(open("/Users/mac/train.csv","rb"),delimiter=","))

Approach 2

with open('/Users/mac/train.csv') as csvfile:
        rows = csv.reader(csvfile)
        for row in rows:
            newrow = np.array(row).astype(np.int)
            train.append(newrow)

I am not sure what the difference between these two approaches is. Which one is recommended?

I am not concerned about which is faster, since my data is small. I am more concerned about differences in the resulting data type.

  • Why not pandas? It's simple: pd.read_csv('path/to/file') Commented Sep 10, 2018 at 6:55
  • Aside from @Lucas's great suggestion, the use case also depends on whether your data contains a mixture of different data types or is more homogeneous. Commented Sep 10, 2018 at 6:56
  • The file has just a single data type: integer. Commented Sep 10, 2018 at 6:56
  • Possible duplicate of The fastest way to read input in Python Commented Sep 10, 2018 at 7:30
  • "What is recommended to use?" This is a broad question. What specifically are you concerned about? If it's not performance, is it readability, or something else? Commented Sep 10, 2018 at 8:58

2 Answers

2

You can also use pandas, which is simple to use.

import pandas as pd
import numpy as np

dataset = pd.read_csv('file.csv')
# get all headers in csv
values = list(dataset.columns.values)

# get the labels, assuming the last column holds the labels in the csv
y = dataset[values[-1:]]
y = np.array(y, dtype='float32')
X = dataset[values[0:-1]]
X = np.array(X, dtype='float32')
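For the case described in the question (a file containing only integers), a shorter variant may be enough. This is a minimal sketch, not a definitive recipe: it assumes the file has no header row, and it uses a small in-memory string as a hypothetical stand-in for the real CSV.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file: integers only, no header row.
csv_text = "1,2,3\n4,5,6\n"

# header=None tells pandas that the first line is data, not column names.
df = pd.read_csv(io.StringIO(csv_text), header=None)

# to_numpy() converts the DataFrame to a 2-D NumPy array.
# Because every column is integer, the resulting dtype is an integer type.
train = df.to_numpy()
print(train.dtype, train.shape)
```

If the real file does have a header row, dropping `header=None` would let pandas use that row as column names instead.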

1

So what is the difference in the result?

genfromtxt is NumPy's CSV reader. It already returns an array, so the extra asarray is unnecessary.
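A minimal sketch of that point, using a small in-memory string as a hypothetical stand-in for train.csv (comma-separated integers, no header row assumed). One thing worth noting about the resulting data type: genfromtxt produces floats by default, so integer output needs an explicit dtype.

```python
import io

import numpy as np

# Hypothetical stand-in for train.csv: two rows of integers.
csv_text = "1,2,3\n4,5,6\n"

# genfromtxt returns a 2-D ndarray directly; the default dtype is float64.
train = np.genfromtxt(io.StringIO(csv_text), delimiter=",")
print(type(train), train.dtype, train.shape)

# Passing dtype=int yields an integer array instead.
train_int = np.genfromtxt(io.StringIO(csv_text), delimiter=",", dtype=int)
print(train_int.dtype)

# asarray on something that is already an ndarray hands back the same object,
# which is why wrapping genfromtxt in asarray adds nothing.
same_object = np.asarray(train) is train
print(same_object)
```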

The second snippet is incomplete, but it looks like it would produce a list of arrays, one for each line of the file. It uses the generic Python csv reader, which does little more than read a line and split it into strings.
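One way the second approach might be completed is sketched below, again with a hypothetical in-memory stand-in for the file. The list name `rows_as_arrays` is an assumption introduced for illustration. The sketch uses the built-in `int` rather than `np.int`, since recent NumPy releases no longer provide that alias.

```python
import csv
import io

import numpy as np

# Hypothetical stand-in for train.csv: two rows of integers.
csv_text = "1,2,3\n4,5,6\n"

rows_as_arrays = []  # assumed name; the original snippet never defines its list
first_row_raw = None
for row in csv.reader(io.StringIO(csv_text)):
    if first_row_raw is None:
        first_row_raw = row  # csv.reader yields each line as a list of strings
    # Convert the strings to integers, giving one 1-D array per line.
    rows_as_arrays.append(np.array(row).astype(int))

# The result of the loop is a plain Python list of 1-D arrays.
print(type(rows_as_arrays), len(rows_as_arrays))

# Stacking the list into a single 2-D array takes one extra step.
train = np.asarray(rows_as_arrays)
print(type(train), train.dtype, train.shape)
```

So the loop yields a list of integer arrays, while genfromtxt yields a single 2-D array (float by default); the extra `np.asarray` call is what turns the list into one array.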

3 Comments

But when I return it, it is a list instead of an array.
That's what I was trying to point out. The two methods return different things.
So to convert the list to an array I used np.asarray.
