
My CSV file looks like this:

INDEX, VAL
04016170,22
04206261,11
0420677,11

df = pd.read_csv('data.csv', index_col='INDEX')

How can I force pandas to read the index as a string rather than an integer (to preserve the leading 0)?

2 Answers


You can pass dtype as a parameter, which maps the named column to the given dtype:

In [130]:
import io
import pandas as pd
t="""INDEX,VAL
04016170,22
04206261,11
0420677,11"""
df = pd.read_csv(io.StringIO(t), index_col='VAL', dtype={'INDEX':str})
df

Out[130]:
        INDEX
VAL          
22   04016170
11   04206261
11    0420677

In [131]:    
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3 entries, 22 to 11
Data columns (total 1 columns):
INDEX    3 non-null object
dtypes: object(1)
memory usage: 48.0+ bytes

EDIT

OK, you can do it this way. There is a bug when you explicitly set index_col in read_csv, so you have to load the CSV first and then call set_index:

In [134]:
df = pd.read_csv(io.StringIO(t), dtype={'INDEX':str})
df = df.set_index('INDEX')
df

Out[134]:
          VAL
INDEX        
04016170   22
04206261   11
0420677    11
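
The load-then-set_index workaround above can be wrapped in a small helper. This is only a minimal sketch: the name read_csv_str_index is hypothetical (it is not part of pandas), and the inline CSV_TEXT stands in for data.csv.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained.
CSV_TEXT = """INDEX,VAL
04016170,22
04206261,11
0420677,11"""


def read_csv_str_index(source, index_name):
    """Read a CSV and use `index_name` as a string index (hypothetical helper)."""
    # Parse the future index column as text so leading zeros survive.
    df = pd.read_csv(source, dtype={index_name: str})
    # Promote the column to the index only after it has been parsed.
    return df.set_index(index_name)


df = read_csv_str_index(io.StringIO(CSV_TEXT), "INDEX")
print(df.index.tolist())
```

Passing a file path such as 'data.csv' instead of the StringIO object should work the same way, since both are accepted by read_csv.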

1 Comment

I'm sorry, I made a mistake: index_col is INDEX (not VAL). I corrected it as soon as I could, but you were too fast! I want the INDEX column as the index, and I want to keep the leading 0 in the index.

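Another solution in two lines (note that this converts the index to strings after it has been parsed, so it does not restore leading zeros that were already dropped):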

df = pd.read_csv('data.csv',index_col=0)
df.index = [str(x) for x in df.index]

or

df.index = df.index.astype(str)
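
To see how converting after the read behaves on the sample data, here is a short comparison. It is a sketch under the assumption that the question's data is inlined via io.StringIO; the variable names late and early are illustrative only.

```python
import io

import pandas as pd

# Sample data from the question, inlined so the sketch is self-contained.
CSV_TEXT = """INDEX,VAL
04016170,22
04206261,11
0420677,11"""

# Convert after reading: the index is parsed as integers first,
# so the leading zeros are gone before astype(str) runs.
late = pd.read_csv(io.StringIO(CSV_TEXT), index_col=0)
late.index = late.index.astype(str)
late_labels = late.index.tolist()

# Read the column as text, then make it the index: zeros are kept.
early = pd.read_csv(io.StringIO(CSV_TEXT), dtype={"INDEX": str}).set_index("INDEX")
early_labels = early.index.tolist()

print(late_labels)
print(early_labels)
```

Because the sample labels do not all have the same width, padding the converted strings back to a fixed length would not reliably recover the original values.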
