3

I am reading a csv file in pandas, and I am skipping some bad lines / rows with:

df2 = pd.read_csv("Test.csv", sep=';', engine='python', error_bad_lines=False)

How can I count the total number of skipped rows in Python?

Right now, I only get: [screenshot not available]

How can I count this?

2 Answers

4

You could calculate the row difference:

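with open("Test.csv") as f:
    # Counts every line in the file, including the header line if there is one
    len_csv = sum(1 for line in f)

# df2 is the DataFrame returned by pd.read_csv(..., error_bad_lines=False)
number_of_skipped_rows = len_csv - len(df2)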

1 Comment

Check to make sure your data doesn't have a header, or this could introduce an off-by-one error
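The header caveat can be folded into a small helper. The following is only a sketch: the function name `count_skipped_rows` and its `has_header` parameter are made up for illustration, and it assumes no quoted field in the file spans several lines.

```python
def count_skipped_rows(path, parsed_rows, has_header=True):
    """Return how many data lines in `path` are missing from the parsed result.

    `parsed_rows` is the row count of the DataFrame (for example len(df2)).
    `has_header` is an assumption the caller supplies about the file.
    """
    with open(path) as f:
        # Ignore blank lines, which pandas also skips by default.
        total_lines = sum(1 for line in f if line.strip())
    if has_header:
        total_lines -= 1  # the header line is not a data row
    return total_lines - parsed_rows
```

With the DataFrame from the question, this would be called as `count_skipped_rows("Test.csv", len(df2))`; pass `has_header=False` if the file was read with `header=None`.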
2
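import pandas as pd

# Count every line in the file; the with block closes the file afterwards
with open("Test.csv") as f:
    row_count = len(f.readlines())
df2 = pd.read_csv("Test.csv", sep=';', engine='python', error_bad_lines=False)

Count of skipped rows:

# row_count still includes the header line, if the file has one
skipped_rows = row_count - df2.shape[0]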
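Both answers above infer the count from a line difference. On newer pandas versions there is another option: `error_bad_lines` was deprecated in pandas 1.3 and removed in 2.0, and since pandas 1.4 `on_bad_lines` accepts a callable when `engine='python'` is used. A minimal sketch under that assumption follows; `BadLineCounter` and `read_and_count` are hypothetical names, not part of pandas.

```python
class BadLineCounter:
    """Callable for on_bad_lines that counts and drops malformed rows."""

    def __init__(self):
        self.count = 0
        self.rows = []

    def __call__(self, bad_line):
        # pandas passes the offending row as a list of strings.
        self.count += 1
        self.rows.append(bad_line)
        return None  # returning None tells pandas to skip the row


def read_and_count(path, sep=";"):
    # Imported here so the counter class itself has no pandas dependency.
    import pandas as pd

    counter = BadLineCounter()
    # Requires pandas >= 1.4 and the python engine for a callable handler.
    df = pd.read_csv(path, sep=sep, engine="python", on_bad_lines=counter)
    return df, counter.count
```

Usage would look like `df2, skipped = read_and_count("Test.csv")`. The counter also keeps the rejected rows in `counter.rows`, which can help when checking why they were skipped. As far as I know, the callable is only invoked for rows with too many fields, so this counts the same kind of rows that `error_bad_lines=False` dropped.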

4 Comments

This is not working: the command df1 = pd.read_csv("Test.csv", sep=';', engine='python') will give an error.
name 'csv' is not defined
Now it is working, but I must say, I do not need to import another library, so right now I prefer Carsten's solution.
Now it's quite similar; the code takes the same approach.
