I am trying to read a CSV file with pandas, but it does not seem to be read correctly.

Code:

pd.read_csv(data_file_path, sep=",", index_col=0, header=0, dtype=object)

For example, my data (in the CSV file) is:

12 1.43E+19 This is first line  101010  
23 1.43E+19 This is the second line 202020  
34 1.43E+19 This is the third line  303030  

I am trying to read it with the first column as the index.

Output:

     1.43E+19 This is first line    101010  
12  
23 1.43E+19 This is the second line 202020  
34 1.43E+19 This is the third line 303030  

Output without using the first column as the index:

  12 1.43E+19 This is first line 101010  
0 23 1.43E+19 This is the second line 202020  
1 34 1.43E+19 This is the third line 303030  

Because of this, any further processing of the data ignores the first row.

1 Answer

I think you're confusing header=0, which means "use the 0-th row as the header", with header=None, which means "don't read a header from the file".

Compare:

>>> pd.read_csv("h.csv", header=0, index_col=0)
        1.43E+19       This is first line  101010  
12                                                 
23  1.430000e+19  This is the second line    202020
34  1.430000e+19   This is the third line    303030
>>> pd.read_csv("h.csv", header=None, index_col=0)
               1                        2       3
0                                                
12  1.430000e+19       This is first line  101010
23  1.430000e+19  This is the second line  202020
34  1.430000e+19   This is the third line  303030

You can also specify column names using names:

>>> pd.read_csv("h.csv", names=["Number", "Line", "Code"], index_col=0)
          Number                     Line    Code
12  1.430000e+19       This is first line  101010
23  1.430000e+19  This is the second line  202020
34  1.430000e+19   This is the third line  303030
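To try the three variants without a file on disk, here is a minimal sketch that feeds the same rows through io.StringIO. The comma-separated text is my guess at what the file looks like, and the label "Id" for the first column is made up for illustration; here names gets one label per column, so the index column is labelled too.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file in the question (commas restored by assumption).
csv_text = (
    "12,1.43E+19,This is first line,101010\n"
    "23,1.43E+19,This is the second line,202020\n"
    "34,1.43E+19,This is the third line,303030\n"
)

# header=0: the first line is consumed as column labels, so only two data rows remain.
with_header = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)

# header=None: no line is treated as a header, so all three rows are kept.
no_header = pd.read_csv(io.StringIO(csv_text), header=None, index_col=0)

# names=...: one label per column; pandas then reads every line as data.
# "Id" is an illustrative label for the first column, not something from the file.
named = pd.read_csv(
    io.StringIO(csv_text),
    names=["Id", "Number", "Line", "Code"],
    index_col="Id",
)

print(len(with_header), len(no_header), len(named))  # prints: 2 3 3
```

The row counts make the difference visible: only the header=0 call loses the first row.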

PS: Since you're using sep="," but the file you showed doesn't have any commas, I'm assuming that you removed them for some reason when asking the question. If that's right, please don't: no one's afraid of commas, and it simply means that other people have to guess where to put them back in if they want to test your code.

1 Comment

Ah, I see. I thought header=0 meant "no header". My bad. Thanks for pointing it out; the code is working now. I am reading the data from a CSV file, so I copied it directly from the file to show here. Sorry for the confusion.
