
I am trying to create a dataframe in Pandas from the AB column in my csv file. (AB is the 27th column).

I am using this line:

df = pd.read_csv(filename, error_bad_lines = False, usecols = [27])

... which is resulting in this error:

ValueError: Usecols do not match names.

I'm very new to pandas. Could someone point out what I'm doing wrong?

  • A is the first column (index = 0) and Z is the 26th, so after AA, AB should be the 28th (index = 27). Commented Sep 7, 2016 at 16:25
  • You can also write usecols=['AB'] to avoid all that confusion. Commented Sep 7, 2016 at 16:26
  • @user2539738 I wasn't sure whether pandas started at 0 for usecols. Anyway, the error persists. Commented Sep 7, 2016 at 16:26
  • @NickilMaveli When I switch my line to df = pd.read_csv(filename, error_bad_lines = False, usecols = ['AB']) the error is still the same. Commented Sep 7, 2016 at 16:27
  • Can you provide us with the first five or so lines of your file? Commented Sep 7, 2016 at 16:28
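
To make the letter-to-index arithmetic from the first comment concrete, here is a minimal sketch. The helper name column_letters_to_index is hypothetical (it is not part of pandas); it only assumes spreadsheet-style lettering, where A is the first column and AA follows Z.

```python
def column_letters_to_index(letters: str) -> int:
    """Convert spreadsheet-style column letters (A..Z, AA, AB, ...) to a zero-based index.

    Hypothetical helper for illustration; pandas does not provide it.
    """
    if not letters:
        raise ValueError("expected at least one letter")
    position = 0  # one-based position, built up like a base-26 number
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"not a column letter: {ch!r}")
        position = position * 26 + (ord(ch) - ord("A") + 1)
    return position - 1  # shift to zero-based, which is what usecols expects


print(column_letters_to_index("A"))   # 0
print(column_letters_to_index("Z"))   # 25
print(column_letters_to_index("AB"))  # 27
```

So AB is the 28th column, and usecols=[27] is the right positional index only if the file really has at least 28 columns.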

2 Answers

2

Here is a small demo:

CSV file (without a header, i.e. there are no column names):

1,2,3,4,5,6,7,8,9,10
11,12,13,14,15,16,17,18,19,20

We are going to read only the 8th column:

In [1]: fn = r'D:\temp\.data\1.csv'

In [2]: df = pd.read_csv(fn, header=None, usecols=[7], names=['col8'])

In [3]: df
Out[3]:
   col8
0     8
1    18

PS: pay attention to header=None, usecols=[7], names=['col8']

If you don't pass the header=None and names parameters, the first row will be used as the header:

In [6]: df = pd.read_csv(fn, usecols=[7])

In [7]: df
Out[7]:
    8
0  18

In [8]: df.columns
Out[8]: Index(['8'], dtype='object')

And if we try to read only the 10th (last) column with usecols=[10]:

In [9]: df = pd.read_csv(fn, usecols=[10])
... skipped ...
ValueError: Usecols do not match names.

This fails because pandas counts columns starting from 0, so the 10th column has index 9:

In [12]: df = pd.read_csv(fn, usecols=[9], names=['col10'])

In [13]: df
Out[13]:
   col10
0     10
1     20
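
For anyone who wants to try this without creating a file on disk, here is a rough, self-contained sketch of the same idea. It assumes the same two rows as the demo file, held in an in-memory io.StringIO buffer; the variable names are made up for illustration, and the column is renamed after reading rather than through the names parameter.

```python
import io

import pandas as pd

# Same two rows as the demo file, kept in memory instead of on disk.
csv_text = "1,2,3,4,5,6,7,8,9,10\n11,12,13,14,15,16,17,18,19,20\n"

# Default header handling: the first row is consumed as column labels,
# so only the second row is left as data.
with_header = pd.read_csv(io.StringIO(csv_text), usecols=[9])

# header=None: every row is data, and the column is labelled with its
# zero-based position (9) until we rename it.
no_header = pd.read_csv(io.StringIO(csv_text), header=None, usecols=[9])
no_header.columns = ["col10"]

print(with_header)
print(no_header)
```

Position 9 is the last valid index for a 10-column file, which is why usecols=[10] cannot match anything.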

-1

usecols uses the column name in your CSV file rather than the column number. In your case it should be usecols=['AB'] rather than usecols=[28]; that is why the error says that usecols do not match names.

1 Comment

usecols supports either positional column indexes or column names. From the docs: "All elements in this array must either be positional (i.e. integer indices into the document columns) or strings that correspond to column names provided either by the user in names or inferred from the document header row(s)."
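
A small sketch of what that comment describes, assuming a made-up three-column CSV with a header row (the column names A, B, C and the variable names are only for illustration): selecting by zero-based position and selecting by header name should return the same column.

```python
import io

import pandas as pd

# Hypothetical sample data with a header row.
csv_text = "A,B,C\n1,2,3\n4,5,6\n"

# Positional: 2 is the zero-based index of the third column.
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[2])

# By name: the string must match a label from the header row.
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["C"])

print(by_position.equals(by_name))
```

Name-based selection only works if the file's header row actually contains that label, so usecols=['AB'] fails when the header cells are not literally named AB.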
