python pandas not reading first column from csv file

Question

I have a simple 2 column csv file called st1.csv:

GRID    St1  
1457    614  
1458    657  
1459    679  
1460    732  
1461    754  
1462    811  
1463    748

However, when I try to read the csv file, the first column is not loaded:

a = pandas.DataFrame.from_csv('st1.csv')  
a.columns

outputs:

 Index([u'ST1'], dtype=object)

Why is the first column not being read?

It's assuming that the first column is the index. Try a = pandas.DataFrame.from_csv('st1.csv', index_col=False)
– EdChum, Commented Feb 20, 2014 at 8:29
I am facing the exact opposite issue when I read a CSV that was compressed (using Python and pandas). Is there any explanation for why it doesn't follow this behaviour?
– Shravya Boggarapu, Commented Aug 28, 2020 at 6:05

zabop · Accepted Answer · 2021-07-24 08:03:03Z

59

Judging by your data, it looks like the delimiter you're using is a space.

Try the following:

a = pandas.DataFrame.from_csv('st1.csv', sep=' ')

The other issue is that it's assuming your first column is an index, which we can also disable:

a = pandas.DataFrame.from_csv('st1.csv', index_col=None)

UPDATE:

In newer pandas versions, do:

a = pandas.DataFrame.from_csv('st1.csv', index_col=False)
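
To see both issues on the question's data, here is a minimal sketch using pd.read_csv, since DataFrame.from_csv no longer exists in recent pandas versions. It assumes the file is whitespace-separated as rendered in the question (the real st1.csv may well be comma-separated), inlines the rows through io.StringIO instead of reading from disk, and uses sep=r"\s+" rather than sep=' ' so that a run of spaces counts as a single delimiter. The names RAW and load are illustrative only.

```python
import io

import pandas as pd

# The question's data, inlined; assumed whitespace-separated as displayed.
RAW = """\
GRID    St1
1457    614
1458    657
1459    679
1460    732
1461    754
1462    811
1463    748
"""


def load(index_col=None):
    """Parse RAW, treating any run of whitespace as one delimiter."""
    return pd.read_csv(io.StringIO(RAW), sep=r"\s+", index_col=index_col)


# index_col=0 mimics the old from_csv default: GRID becomes the index.
with_index = load(index_col=0)
print(list(with_index.columns))

# read_csv's own default (index_col=None) keeps GRID as a regular column.
without_index = load()
print(list(without_index.columns))
```

The first print should list only St1, which matches what the question observed, and the second should list both GRID and St1.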

edited Jul 24, 2021 at 8:03

zabop


answered Feb 20, 2014 at 8:30

Ewan


5 Comments

ryantuck Over a year ago

interesting that in the docs there is no mention of setting index_col=False, but that's definitely part of the solution: pandas.pydata.org/pandas-docs/stable/generated/…

Grant Over a year ago

In Python 3: index_col=False throws an error, I used index_col=None and it works fine...

Tom Cornebize Over a year ago

I agree with @Grant, you have to use index_col=None (even in Python 2).

Ewan Over a year ago

@Grant & Tom - I have updated my answer to reflect this. Thank you for informing me.

Mr. T Over a year ago

Python 3.5 and pandas 0.21.1: index_col = False worked fine, but index_col = None was ignored. Strange.

Matt Messersmith · Accepted Answer · 2019-10-12 13:33:35Z

For newer versions of pandas, pd.DataFrame.from_csv doesn't exist anymore, and index_col=None no longer does the trick with pd.read_csv. You'll want to use pd.read_csv with index_col=False instead:

pd.read_csv('st1.csv', index_col=False)

Example:

(so) URSA-MattM-MacBook:stackoverflow mmessersmith$ cat input.csv 
Date                        Employee        Operation        Order

2001-01-01 08:32:17         User1           Approved         #00045
2001-01-01 08:36:23         User1           Edited           #00045
2001-01-01 08:41:04         User1           Rejected         #00046
2001-01-01 08:42:56         User1           Deleted          #00046
2001-01-02 09:01:11         User1           Created          #00047
2019-10-03 17:23:45         User1           Approved         #72681

(so) URSA-MattM-MacBook:stackoverflow mmessersmith$ python
Python 3.7.4 (default, Aug 13 2019, 15:17:50) 
[Clang 4.0.1 (tags/RELEASE_401/final)] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> pd.__version__
'0.25.1'              
>>> df_bad_index = pd.read_csv('input.csv', delim_whitespace=True)
>>> df_bad_index
                Date Employee Operation   Order
2001-01-01  08:32:17    User1  Approved  #00045
2001-01-01  08:36:23    User1    Edited  #00045
2001-01-01  08:41:04    User1  Rejected  #00046
2001-01-01  08:42:56    User1   Deleted  #00046
2001-01-02  09:01:11    User1   Created  #00047
2019-10-03  17:23:45    User1  Approved  #72681
>>> df_bad_index.index
Index(['2001-01-01', '2001-01-01', '2001-01-01', '2001-01-01', '2001-01-02',
       '2019-10-03'],
      dtype='object')
>>> df_still_bad_index = pd.read_csv('input.csv', delim_whitespace=True, index_col=None)
>>> df_still_bad_index
                Date Employee Operation   Order
2001-01-01  08:32:17    User1  Approved  #00045
2001-01-01  08:36:23    User1    Edited  #00045
2001-01-01  08:41:04    User1  Rejected  #00046
2001-01-01  08:42:56    User1   Deleted  #00046
2001-01-02  09:01:11    User1   Created  #00047
2019-10-03  17:23:45    User1  Approved  #72681
>>> df_still_bad_index.index
Index(['2001-01-01', '2001-01-01', '2001-01-01', '2001-01-01', '2001-01-02',
       '2019-10-03'],
      dtype='object')
>>> df_good_index = pd.read_csv('input.csv', delim_whitespace=True, index_col=False)
>>> df_good_index
         Date  Employee Operation     Order
0  2001-01-01  08:32:17     User1  Approved
1  2001-01-01  08:36:23     User1    Edited
2  2001-01-01  08:41:04     User1  Rejected
3  2001-01-01  08:42:56     User1   Deleted
4  2001-01-02  09:01:11     User1   Created
5  2019-10-03  17:23:45     User1  Approved
>>> df_good_index.index
RangeIndex(start=0, stop=6, step=1)
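
The session above behaves this way because the header line has four whitespace-separated names while each data row splits into five fields (the timestamp contains a space), so pandas treats the extra leading field as an index. Note that index_col=False avoids the index but shifts the values one column over and drops the order number. As an alternative that is not taken from the answer, the sketch below splits only on runs of two or more spaces so the timestamp stays in one field. It assumes columns are always separated by at least two spaces and inlines a few rows via io.StringIO; the names RAW, inferred and fixed are illustrative.

```python
import io

import pandas as pd

# A few rows from input.csv, inlined; columns are assumed to be separated
# by two or more spaces, with a single space inside the timestamp.
RAW = """\
Date                        Employee        Operation        Order
2001-01-01 08:32:17         User1           Approved         #00045
2001-01-01 08:36:23         User1           Edited           #00045
2019-10-03 17:23:45         User1           Approved         #72681
"""

# Any-whitespace split: 4 header names vs. 5 fields per row, so the
# first field (the date) is inferred as the index.
inferred = pd.read_csv(io.StringIO(RAW), sep=r"\s+")
print(inferred.index.tolist())

# Split only on 2+ spaces; a regex separator like this one needs the
# python engine. Each row now has 4 fields, matching the header.
fixed = pd.read_csv(io.StringIO(RAW), sep=r"\s{2,}", engine="python")
print(fixed.columns.tolist())
print(fixed.loc[0, "Date"], fixed.loc[0, "Order"])
```

With this separator the Date column should hold the full timestamp string and Order should keep values such as #00045. If some columns in the real file are separated by a single space, this approach would not apply as written.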

jmd_dk · Accepted Answer · 2017-01-02 15:24:33Z

6

According to the documentation, which compares read_csv and from_csv, you can pass index_col=None. I tried the following and it worked:

DataFrame.from_csv('st1.csv', index_col=None)

This assumes that the data is comma-separated.

Please check the below link

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.from_csv.html

edited Jan 2, 2017 at 15:24

jmd_dk


answered Jan 2, 2017 at 14:28

Muzaffar Omer
