choose row numbers to read from excel into pandas dataframe

Question

I have a shared spreadsheet that gets rows added to it every day. I am creating a script that reads the spreadsheet into a dataframe with pd.read_excel(infile, sheet_name=0) and checks for duplicate rows using df.drop_duplicates(keep='first'). The script is going to be an installed package on multiple people's computers, for them to use at any time, and different people will want to check different rows. Is there a way to let whoever uses the script choose the range of rows they want to check? For example, if the spreadsheet has 100 rows and someone wants to check for duplicate rows in rows 40-60, is it possible to do this?

Yes, you want .iloc. For example, my_df = df.iloc[40:60, :]
– rahlf23, Commented Dec 17, 2018 at 16:17

rahlf23 · Accepted Answer · 2018-12-17 16:22:21Z

You can accept user inputs for the row bounds and then pass them to iloc:

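import pandas as pd

# infile is the spreadsheet path, as in the question
df = pd.read_excel(infile, sheet_name=0)

# iloc positions are zero-based and the stop position is excluded
start = int(input('Enter your starting row: '))
stop = int(input('Enter your ending row: '))

# Drop repeated rows within the selected range, keeping the first occurrence
df_limited = df.iloc[start:stop].drop_duplicates(keep='first')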
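
Because iloc counts from zero and leaves out the stop position, a user who types 40 and 60 gets positions 40 through 59 rather than the 40th through 60th data rows. If users think in 1-based, inclusive row numbers, a small wrapper can do the conversion. The sketch below is only an illustration: the function name duplicates_in_rows and the sample data are hypothetical, and it assumes "row n" means the n-th data row after the header (so it would be row n + 1 in Excel itself when the header is on row 1).

```python
import pandas as pd


def duplicates_in_rows(df, first_row, last_row):
    """Return the repeated rows among data rows first_row..last_row (1-based, inclusive).

    Hypothetical helper for illustration; not part of pandas.
    """
    if first_row < 1 or last_row < first_row:
        raise ValueError("expected 1 <= first_row <= last_row")
    # iloc is zero-based and excludes the stop position, so data rows 40-60
    # correspond to iloc[39:60].
    window = df.iloc[first_row - 1:last_row]
    # duplicated(keep="first") flags every repeat after the first occurrence.
    return window[window.duplicated(keep="first")]


# Small stand-in for the spreadsheet; in the real script df would come from pd.read_excel.
df = pd.DataFrame(
    {"name": ["a", "b", "a", "c", "b", "a"], "value": [1, 2, 1, 3, 2, 1]}
)

dupes = duplicates_in_rows(df, 2, 5)
print(dupes)
```

This returns the offending rows with their original index labels, which makes them easy to locate in the spreadsheet. To get the de-duplicated range instead, as in the answer, call drop_duplicates(keep='first') on the same slice.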
