My dataframe looks like this. I know that I lost some rows during data cleaning because len(df) was previously 500 and is now 489. I can see, for example, that row 496 is missing.


    all       month day year
0   03/25/93    03  25  93
...
480     2013    1   1   2013
481     1974    1   1   1974
482     1990    1   1   1990
483     1995    1   1   1995
484     2004    1   1   2004
485     1987    1   1   1987
486     1973    1   1   1973
487     1992    1   1   1992
488     1977    1   1   1977
489     1985    1   1   1985
490     2007    1   1   2007
491     2009    1   1   2009
492     1986    1   1   1986
493     1978    1   1   1978
494     2002    1   1   2002
495     1979    1   1   1979
497     2008    1   1   2008
498     2005    1   1   2005
499     1980    1   1   1980

How can I find out which rows are missing? If my question is a duplicate, please point me to the solution. Thanks!

1 Answer

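The easiest approach, provided your index values are unique, is probably to take the difference between the index of the original dataframe and the index of the cleaned one. The result contains the labels that are present before cleaning but absent afterwards: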

df_original.index.difference(df_cleaned.index)
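
To make the one-liner above concrete, here is a minimal sketch on made-up data. The frames, the `year` column values, and the dropped labels (3, 250, 496) are assumptions for illustration only, not the asker's actual data. It also shows a fallback for the case where the original frame is no longer available, which only works if the index was the default 0..499 range before cleaning (again an assumption).

```python
import pandas as pd

# Hypothetical stand-in for the 500-row frame before cleaning
df_original = pd.DataFrame({"year": range(1500, 2000)})

# Simulate a cleaning step that drops a few rows (labels chosen arbitrarily)
df_cleaned = df_original.drop(index=[3, 250, 496])

# Labels present in the original index but absent from the cleaned one
missing = df_original.index.difference(df_cleaned.index)
print(missing.tolist())

# Fallback if the original frame was not kept: assumes the index was the
# default integer range 0..499 before cleaning
expected = pd.RangeIndex(500)
missing_from_range = expected.difference(df_cleaned.index)
print(missing_from_range.tolist())

# With the original frame at hand, the dropped rows can be inspected directly
dropped_rows = df_original.loc[missing]
print(dropped_rows)
```

Inspecting `dropped_rows` is often the quickest way to see why the cleaning step discarded those records.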