2
import pandas as pd

data_xls = pd.read_excel('/users/adam/abc.xlsx', index=False)
data_xls.to_csv('def.csv', encoding='utf-8')

Also tried:

data_xls=pd.read_excel('/users/adam/abc.xlsx',index_col=False)
data_xls=pd.read_excel('/users/adam/abc.xlsx',index=None)
data_xls=pd.read_excel('/users/adam/abc.xlsx',index_col=None)

Actual Output:

     Name    Age
0    Adam    24
1    Steve   25
2    Jhon    23

Expected Output:

Name    Age
Adam    24
Steve   25
Jhon    23

Is there a way to drop the index column before inserting the data into a Hive table?

5
  • Use Code Segments for code. It is really difficult to the accustomed eye to read non-monospaced code. Commented Aug 9, 2018 at 19:39
  • Use Dataframe.drop() method to drop any row or column. Check more here. Commented Aug 9, 2018 at 19:44
  • @pault I am doing this within pyspark and this file is going to be used to load the data to a Hive table. Commented Aug 9, 2018 at 20:28
  • 1
    @saketk21, I went ahead and ignored the index while converting my file from xlsx to csv: data_xls.to_csv('def.csv', encoding='utf-8', index=False) Commented Aug 9, 2018 at 20:34
  • That works too. Commented Aug 9, 2018 at 21:44

1 Answer

8

When writing the file, you can use the following code if you don't want pandas to write the index column to the CSV file (to_csv is a DataFrame method, so call it on your DataFrame, here df):

df.to_csv('your.csv', index=False)
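
To see the difference, here is a minimal sketch. It assumes a small made-up DataFrame that mirrors the table in the question; the data and variable names are illustrative and nothing is read from abc.xlsx. Calling to_csv without a path returns the CSV text, so the two variants can be compared directly:

```python
import pandas as pd

# Illustrative data mirroring the question's table (assumed, not read from abc.xlsx)
df = pd.DataFrame({"Name": ["Adam", "Steve", "Jhon"], "Age": [24, 25, 23]})

# With no path argument, to_csv returns the CSV text instead of writing a file
with_index = df.to_csv()                # default: the row index is written as the first column
without_index = df.to_csv(index=False)  # index=False: only the data columns are written

print(with_index.splitlines()[0])     # header starts with an empty field for the index
print(without_index.splitlines()[0])  # header holds only the data columns
```

The index shown in the question's "Actual Output" is the default row index that every DataFrame carries, so it seems to be the write step, not read_excel, that decides whether it ends up in the file.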

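Also, if the CSV was already written with the index and you want to drop that column when reading the file, you should be able to do it with the following (pandas names the unnamed index column 'Unnamed: 0', including the colon):

df = pd.read_csv('some.csv').drop(['Unnamed: 0'], axis=1)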
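
A rough sketch of the read-side cleanup follows, again on made-up data and using an in-memory buffer in place of a real some.csv (both are assumptions for illustration). It also shows index_col=0 as an alternative that reuses the stored column as the index instead of dropping it:

```python
import io
import pandas as pd

# Illustrative data; the CSV text stands in for a file that was saved with its index
df = pd.DataFrame({"Name": ["Adam", "Steve", "Jhon"], "Age": [24, 25, 23]})
csv_with_index = df.to_csv()

# Reading it back naively yields an extra column named "Unnamed: 0"
reread = pd.read_csv(io.StringIO(csv_with_index))
print(list(reread.columns))

# Option 1: drop the extra column after reading
cleaned = reread.drop(columns=["Unnamed: 0"])

# Option 2: tell read_csv to use the first column as the index
as_index = pd.read_csv(io.StringIO(csv_with_index), index_col=0)

print(list(cleaned.columns))
print(list(as_index.columns))
```

Either way the resulting frame has only Name and Age as data columns; writing it out again still needs index=False to keep the index out of the file.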

1 Comment

thanks much inder, I got rid of the index while converting the file to csv.
