I have a problem.

I want to fetch the content of a CSV file from a URL and then parse it into an array. This is the code I have now:

import requests
import pandas as pd
import io

url="https://www.test.com/csv.php"
dataset = requests.get(url, verify=False).content
df = pd.read_csv(io.StringIO(dataset.decode('utf-8')))

data = []
for row in df: # each row is a list
    data.append(row)

But when I execute this code, I only get the first row of the CSV, and the values are wrapped in single quotes ('):

['1', '4', '0']

The CSV file looks like this:

1,4,0
0,1,1
1,1,0
0,1,1
1,1,0
0,3,1
1,1,0
0,3,1
1,1,0

And I am hoping to get an array like this:

[[1,4,0],
 [0,1,1],
 [1,1,0],
 [0,1,1],
 [1,1,0],
 [0,3,1],
 [1,1,0],
 [0,3,1],
 [1,1,0]]

What am I doing wrong?

EDIT:

Using df.values gives me this:

[[0. 1. 1.]
 [1. 1. 0.]
 [0. 1. 1.]
 ...
 [1. 1. 0.]
 [0. 1. 1.]
 [1. 3. 0.]]

But that does not seem to be correct, because the first row has to be [1,4,0]. Also, I need a comma (,) as the separator.

3 Answers

According to the pandas documentation, to iterate over rows you should use:

df.iterrows()

as indicated in http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iterrows.html#pandas.DataFrame.iterrows
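A minimal sketch of how iterrows() could be used to build the nested list from the question. The inline csv_text is a stand-in for the downloaded content (an assumption, so the example runs without a network call), and header=None is assumed so that the first line is read as data:

```python
import io
import pandas as pd

# Assumption: inline text standing in for the CSV returned by the URL.
csv_text = "1,4,0\n0,1,1\n1,1,0\n"

# header=None so the first line is treated as data, not as column labels.
df = pd.read_csv(io.StringIO(csv_text), header=None)

data = []
for index, row in df.iterrows():  # each row is a pandas Series
    data.append(row.tolist())     # convert the Series to a plain Python list

print(data)
```

Iterating over df directly (for row in df) yields the column labels rather than the rows, which would explain why the loop in the question only produced a single list of strings.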

2 Comments

That's not what I am looking for.
I need it to be an array with [[]]
0

No need to loop: df.values returns a 2-D NumPy array.

import io
import pandas as pd
import requests

url = "https://www.test.com/csv.php"
dataset = requests.get(url, verify=False).content
df = pd.read_csv(io.StringIO(dataset.decode('utf-8')), header=None, sep=',')
data = df.values

3 Comments

@Vreesie You probably need to disable the header. The default delimiter is a comma.
I added your code and the values seem to be right, but they are still separated with a dot and not with a comma!?
@Vreesie Please print dataset and df.
0

When you read a CSV file with read_csv, the first row is treated as the header row by default. You need to specify that it is not, so add header=None to read_csv, like this:

df = pd.read_csv(io.StringIO(dataset.decode('utf-8')), header=None)
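
To illustrate the difference, here is a small sketch that reads the same text with and without header=None. The inline csv_text is an assumption standing in for the downloaded data, and the names with_header and no_header are just for illustration:

```python
import io
import pandas as pd

# Assumption: inline text standing in for the CSV returned by the URL.
csv_text = "1,4,0\n0,1,1\n1,1,0\n"

# Default behaviour: the first line becomes the column labels (as strings).
with_header = pd.read_csv(io.StringIO(csv_text))

# header=None: every line is data and the columns are numbered 0, 1, 2.
no_header = pd.read_csv(io.StringIO(csv_text), header=None)

print(list(with_header.columns))       # labels taken from the first line
print(len(with_header), len(no_header))  # row counts differ by one
print(no_header.iloc[0].tolist())      # first data row when header=None
```

With the default settings the first line ends up as string column labels, which matches the ['1', '4', '0'] seen in the question.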

Also, the following is one way of getting your desired output:

data=[]
for r1, r2, r3 in df.values:
    data.append([r1,r2,r3])
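
As a possible alternative to the loop above, the array can be converted in one step with tolist(). This is a sketch under the assumption that every column contains only integers; the inline csv_text again stands in for the downloaded content:

```python
import io
import pandas as pd

# Assumption: inline text standing in for the CSV returned by the URL.
csv_text = "1,4,0\n0,1,1\n1,1,0\n"
df = pd.read_csv(io.StringIO(csv_text), header=None)

# df.values is a NumPy array; tolist() turns it into nested Python lists
# whose elements are plain Python ints.
data = df.values.tolist()

print(df.values)  # NumPy prints arrays without commas between elements
print(data)       # a Python list prints with commas between elements
```

The missing commas in the printed array are only how NumPy displays it; the values themselves are the same, and the list form prints with commas.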

1 Comment

Could you please confirm if the above answer worked for you?
