
If I have a dataframe and want to drop any rows where the value in one column is not an integer, how would I do this?

The alternative would be to drop rows where the value is not within the range 0-2, but since I am not sure how to do either of them I was hoping someone else might know.

Here is what I tried, but it didn't work and I am not sure why:

df = df[(df['entrytype'] != 0) | (df['entrytype'] !=1) | (df['entrytype'] != 2)].all(1)
  • Well, that won't work because of operator precedence, so you need parentheses; it should be: df = df[(df['entrytype'] != 0) | (df['entrytype'] != 1) | (df['entrytype'] != 2)].all(1). However, if you have any rows in the column that are not numeric then the dtype will be object, so could you not just test for this? (See the sketch after these comments.) Commented Feb 13, 2015 at 13:05
  • Yes, I did test this, so I was looking for an alternative due to the dtype issue. What are the alternatives? Commented Feb 13, 2015 at 13:28
  • You could do df[~df['entrytype'].isin([0,1,2])]; this selects the rows that are not 0, 1 or 2, if you are expecting the values to only be those. Commented Feb 13, 2015 at 13:34
  • Another way could be: df['entrytype'].apply(lambda x: str(x).isdigit()) Commented Feb 13, 2015 at 13:36
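
A minimal sketch of the precedence and logic point from the first comment, assuming a hypothetical entrytype column like the one in the question: chaining != tests with | keeps every row, because any value differs from at least two of 0, 1 and 2.

import pandas as pd
import numpy as np

df = pd.DataFrame({'entrytype': [0, 1, np.nan, 'asdas', 2]})

# OR-ing three "not equal" tests is always True, so this filter drops nothing
always_true = (df['entrytype'] != 0) | (df['entrytype'] != 1) | (df['entrytype'] != 2)
print(always_true.all())    # True

# "keep only 0, 1 or 2" needs AND of the negations (or simply isin, as in the answers)
keep = ~((df['entrytype'] != 0) & (df['entrytype'] != 1) & (df['entrytype'] != 2))
print(df[keep])             # rows 0, 1 and 4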

3 Answers


There are 2 approaches I propose:

In [212]:

import pandas as pd
import numpy as np
df = pd.DataFrame({'entrytype':[0, 1, np.nan, 'asdas', 2]})
df
Out[212]:
  entrytype
0         0
1         1
2       NaN
3     asdas
4         2

If the range of values is as restricted as you say then using isin will be the fastest method:

In [216]:

df[df['entrytype'].isin([0,1,2])]
Out[216]:
  entrytype
0         0
1         1
4         2

Otherwise we could cast to str and then call .isdigit():

In [215]:

df[df['entrytype'].apply(lambda x: str(x).isdigit())]
Out[215]:
  entrytype
0         0
1         1
4         2

4 Comments

Hi, both methods are good, but unfortunately only the second, slower method works for me. It must be because the values come in as strings when imported from CSV.
When loading from CSV, if you don't specify or coerce the dtype then pandas tries to guess, and if you have non-numeric values it probably reads them in as str types. What are the errant values in your rows? It may be quicker to do df.convert_objects(convert_numeric=True) and then call df.dropna() (see the sketch after these comments for a current equivalent).
OK, I did this and it worked as well: df2 = df[df['entrytype'].isin(['0','1','2'])], but your way is cleaner I think.
Ideally the dtypes should be set to the correct type; I would change to int if possible. However, if you have missing values then this can't be done, as NaN cannot be represented by ints but can be represented by floats.
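
A sketch of the coerce-then-drop approach from the comment above; convert_objects has since been removed from pandas, so this assumes the current equivalent pd.to_numeric and the entrytype column from the answer:

import pandas as pd
import numpy as np

df = pd.DataFrame({'entrytype': [0, 1, np.nan, 'asdas', 2]})

# Anything that cannot be parsed as a number becomes NaN, then those rows are dropped
df['entrytype'] = pd.to_numeric(df['entrytype'], errors='coerce')
df = df.dropna(subset=['entrytype'])
print(df)    # rows 0, 1 and 4 remain, as float64 because NaN forces a float column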

There are multiple ways to do the same thing, but I found these methods easy and efficient.

Quick Examples

# Using drop() to delete rows based on a column value
df.drop(df[df['Fee'] >= 24000].index, inplace=True)

# Keep only the rows that match a condition (boolean indexing)
df2 = df[df.Fee >= 24000]

# If the column name contains a space,
# specify it with bracket notation and quotes
df2 = df[df['column name'] >= 24000]

# Using loc
df2 = df.loc[df["Fee"] >= 24000]

# Select rows based on multiple column values
df2 = df[(df['Fee'] >= 22000) & (df['Discount'] == 2300)]

# Drop rows with None/NaN in a column
df2 = df[df.Discount.notnull()]
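
Applied to the original question, the same drop() pattern would look something like this (a sketch, assuming the entrytype column from the question and the values 0-2 the asker wants to keep):

import pandas as pd
import numpy as np

df = pd.DataFrame({'entrytype': [0, 1, np.nan, 'asdas', 2]})

# The boolean mask selects the unwanted rows; .index feeds their labels to drop()
df.drop(df[~df['entrytype'].isin([0, 1, 2])].index, inplace=True)
print(df)    # rows 0, 1 and 4 remain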

Comments


str("-1").isdigit() is False

str("-1").lstrip("-").isdigit() works but is not nice.


A regex that also allows a leading sign works instead:

df.loc[df['Feature'].str.match(r'^[+-]?\d+$')]

and for your question, the reverse set (the rows that are not integers):

df.loc[~(df['Feature'].str.match(r'^[+-]?\d+$'))]
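
A short usage sketch, assuming (as in the comments on the accepted answer) that the column arrived from CSV as strings; na=False fills missing values with False so the mask can be used for indexing:

import pandas as pd

df = pd.DataFrame({'Feature': ['0', '1', '-1', 'asdas', '2']})

# isdigit() misses the negative integer
print(df['Feature'].str.isdigit())    # '-1' -> False

# The signed-integer regex keeps it
keep = df['Feature'].str.match(r'^[+-]?\d+$', na=False)
print(df.loc[keep])                   # rows 0, 1, 2 and 4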

Comments
