
I want to create a numpy array by parsing a .txt file. The file contains features of iris flowers separated by commas. Every line holds one flower example with 5 values separated by 4 commas: the first 4 values are features and the last one is the name. I parse the file in a loop and want to append (probably using numpy.append) each line's parsed data to a numpy array called feature_table.

Here's the code:

import numpy as np
iris_data = open("iris_data.txt", "r")
for line in iris_data:
    currentline = line.split(",")
    #iris_data_parsed = (currentline[0] + " , " + currentline[3] + " , " + currentline[4])
    #sepal_length = numpy.array(currentline[0])
    #petal_width = numpy.array(currentline[3])
    #iris_names = numpy.array(currentline[4])
    feature_table = np.array([currentline[0]],[currentline[3]],[currentline[4]])
    print (feature_table)
    print(feature_table.shape)

So I want to create a numpy array using only the first, fourth, and fifth values in every line, but I can't make it work the way I want. I tried reading the numpy docs but couldn't understand them.

  • Possible duplicate of creating numpy arrays in a for loop Commented Feb 19, 2019 at 12:10
  • You're continuously overwriting the same variable, so you get an array with just 3 elements. Commented Feb 19, 2019 at 12:11
  • Depending on your ultimate goal, you may be better off using numpy.loadtxt, pandas (which has a read_csv() function that reads the whole file into a table, a.k.a. a DataFrame), or even scikit-learn, which uses the iris data set in lots of examples. Commented Feb 19, 2019 at 12:13
  • @9769953 You are right, but I don't even get an array with 3 elements; I get ValueError: only 2 non-keyword arguments accepted. Commented Feb 19, 2019 at 12:13
  • The answer by @Alexander Rossa fixed that. Now I just have to create a numpy array outside of the loop and update it on every line. Thanks. Commented Feb 19, 2019 at 12:19
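
For reference, the numpy.loadtxt route mentioned in the comments might look roughly like the sketch below. It assumes the file has no header row and that every line has exactly five comma-separated fields; the two sample rows are made-up stand-ins for iris_data.txt, fed through StringIO so the snippet runs on its own.

```python
import numpy as np
from io import StringIO

# Stand-in for open("iris_data.txt"); the two rows are illustrative only.
sample = StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
)

# usecols picks the first, fourth and fifth columns;
# dtype=str keeps the flower name alongside the numbers as text.
feature_table = np.loadtxt(sample, delimiter=",", usecols=(0, 3, 4), dtype=str)

print(feature_table.shape)
```

Because the name column is text, the whole array holds strings; the numeric columns can be converted later with astype(float).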

1 Answer


While the commenters are right that you are not persisting your data anywhere, your immediate problem, I assume, is the incorrect np.array construction. You should enclose all of the values in a single list, like this:

feature_table = np.array([currentline[0],currentline[3],currentline[4]])

And get rid of the redundant [ and ] around each argument.

See the official documentation for more examples. Basically, all of the input data needs to be grouped into a single argument; otherwise Python passes the extra lists as separate positional arguments, and np.array accepts at most two of those (the second one is dtype).
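
As a quick check of the difference, here is a minimal sketch using one made-up sample line in place of a line read from the file. The exact exception type for the three-argument call depends on the NumPy version, so the sketch catches both TypeError and ValueError rather than assuming one.

```python
import numpy as np

# One illustrative line, standing in for a line read from iris_data.txt.
line = "5.1,3.5,1.4,0.2,Iris-setosa\n"
currentline = line.strip().split(",")

# Three separate lists are three positional arguments, which np.array rejects.
try:
    np.array([currentline[0]], [currentline[3]], [currentline[4]])
    failed = False
except (TypeError, ValueError):
    failed = True

# One list holding all three values is a single argument.
feature_table = np.array([currentline[0], currentline[3], currentline[4]])

print(failed)
print(feature_table.shape)
```

Note that the values are still strings at this point, since they come straight from split().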


1 Comment

Thanks for the swift answer. Now I just have to create a numpy array outside of the loop and update it every line.
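
One possible way to do that follow-up step, sketched under a few assumptions: instead of calling numpy.append on every iteration (which copies the whole array each time), the rows are collected in a plain Python list and converted once after the loop. The name rows and the sample data are illustrative, and the StringIO object stands in for the opened iris_data.txt.

```python
import numpy as np
from io import StringIO

# Stand-in for open("iris_data.txt", "r"); sample rows are illustrative only.
iris_data = StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "\n"
)

rows = []  # hypothetical helper list, filled inside the loop
for line in iris_data:
    currentline = line.strip().split(",")
    if len(currentline) < 5:
        continue  # skip blank or incomplete lines
    rows.append([currentline[0], currentline[3], currentline[4]])

# Build the array once, after the loop.
feature_table = np.array(rows)

# The first two columns are numeric text and can be converted separately.
numeric = feature_table[:, :2].astype(float)

print(feature_table.shape)
print(numeric)
```

With a real file, wrapping the open() call in a with statement would also make sure the file gets closed.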
