I have an Excel sheet (.xlsx file) with the following data:

Date 1 Date 2
03/26/2010 3/31/2011
NULL NULL
03/26/2010 3/31/2011
NULL NULL
03/26/2010 3/31/2011
NULL NULL
01/01/2010 6/30/2010
01/01/2010 6/30/2010
01/12/2011 4/15/2012

When I convert it to a DataFrame using

pd.read_excel("file.xlsx",header=0,dtype=str,engine='openpyxl')

All the data is read properly except for rows 3, 4, 5 and 6, which are read as shown below:

Date 1 Date 2
03/26/2010 3/31/2011
NULL NULL
01/01/2010 6/30/2010
01/01/2010 6/30/2010
01/12/2011 4/15/2012
NULL NULL

This causes an unnecessary data shift and affects my further steps. Is there any reason why it happens only at this place and nowhere else in the data?

  • What happens when you run only pd.read_excel("file.xlsx")? Commented May 24, 2021 at 5:23
  • Did you try: xl = pd.ExcelFile("file.xlsx", engine='openpyxl'); df = xl.parse("file") Commented May 24, 2021 at 5:33

2 Answers

2

I was facing the same problem, and this fixed it for me. When reading the data, you can try:

pd.read_excel(r'file.xlsx', sheet_name="Yourfilesheetname")

1 Comment

Actually, I have already done it this way, with sheet_name specified, but the issue is still the same.
0

The problem is now resolved. The issue was with the index that pandas assigns to the DataFrame. My table has a header row, but the pandas index starts from 0 at the first data row. So the index labels did not line up with the row numbers I expected, and I was looking at the next row's data, which misled me into thinking that read_excel had a bug.
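To illustrate the offset, here is a minimal sketch rather than the exact code from my project. It assumes a single header row at the top of the sheet. The in-memory DataFrame, HEADER_ROWS and excel_row_to_index are hypothetical names made up for this example and are not part of pandas.

```python
import pandas as pd

# Hypothetical stand-in for the first rows of the sheet, built in memory
# so the sketch runs without the actual file.
df = pd.DataFrame(
    {
        "Date 1": ["03/26/2010", "NULL", "03/26/2010", "NULL"],
        "Date 2": ["3/31/2011", "NULL", "3/31/2011", "NULL"],
    }
)

# pandas labels the first data row 0, but in Excel the header occupies row 1
# and the first data row is row 2, so the two numberings differ by 2.
HEADER_ROWS = 1  # assumption: a single header row at the top of the sheet


def excel_row_to_index(excel_row: int) -> int:
    """Convert a 1-based Excel row number to the 0-based DataFrame position."""
    return excel_row - HEADER_ROWS - 1


# Excel row 3 is the second data row, i.e. position 1 in the DataFrame.
print(df.iloc[excel_row_to_index(3)].tolist())

# Optional: relabel the index so it matches the Excel row numbers directly.
df_excel = df.copy()
df_excel.index = df_excel.index + HEADER_ROWS + 1
print(df_excel.loc[3].tolist())
```

With the index relabelled this way, df_excel.loc[n] should return the same row that Excel shows as row n, which makes side-by-side comparison with the spreadsheet less error-prone.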

Thanks for your support.
