Python Pandas Dataframe filter not working

Question

I have a Pandas dataframe called df with the following content:

    Symbol                     Cat         Beta Eps MktCap
2350    FBP  Foreign Regional Banks            0   0      0
2351   FNBC  Foreign Regional Banks            0   0      0
2353   BSBR  Foreign Regional Banks            0   0      0
2354    BBD  Foreign Regional Banks            0   0      0
2355    HDB  Foreign Regional Banks            0   0      0
2356    BCH  Foreign Regional Banks            0   0      0
2358     WF  Foreign Regional Banks            0   0   None
2359   SMFG  Foreign Regional Banks            0   0   None
2360    BFR  Foreign Regional Banks            0   0      0
2361    BCA  Foreign Regional Banks            0   0      0
2362   BPOP  Foreign Regional Banks            0   0   None
2363    CIB  Foreign Regional Banks            0   0      0
2364   ITUB  Foreign Regional Banks            0   0      0
2365    BMA  Foreign Regional Banks            0   0      0
2366     KB  Foreign Regional Banks            0   0   None
2367   BBDO  Foreign Regional Banks            0   0      0
2368   BSMX  Foreign Regional Banks            0   0   None
2369   BBVA  Foreign Regional Banks            0   0   None
2370    SHG  Foreign Regional Banks            0   0      0
2352     DB  Foreign Regional Banks         1.08   0      0
2357    MFG  Foreign Regional Banks  6.101694915   0   None

I use the following Python code:

df2 = df[df.Beta > 0]

The resulting df2 still contains the rows where Beta is 0, so it is identical to df. How do I fix this? Thanks

It does work for me. What's the data type of Beta? When I read your dataframe using read_clipboard(), the column is assigned a float dtype and the filter just works.
– Khris, Commented Dec 2, 2016 at 6:21
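A quick way to answer the dtype question from the comment above is to inspect the column directly. This is only a sketch: the three-row frame is a hypothetical stand-in for the asker's df, and Beta is deliberately built as an object column of strings to mimic the suspected problem.

```python
import pandas as pd

# Hypothetical sample, not the asker's real data.
# Beta is stored as strings in an object column on purpose.
df = pd.DataFrame({
    "Symbol": ["FBP", "DB", "MFG"],
    "Beta": pd.Series(["0", "1.08", "6.101694915"], dtype=object),
})

print(df["Beta"].dtype)                              # object
print(pd.api.types.is_numeric_dtype(df["Beta"]))     # False for an object column
print(df["Beta"].map(type).unique())                 # Python types actually stored
```

If the dtype comes back as float64 or int64 instead, the column is already numeric and the filter should behave as expected.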

jezrael · Accepted Answer · 2016-12-02 06:21:31Z

I think you can try casting to float with astype. It seems the dtype of column Beta is object, in which case the values are most likely stored as strings:

df2 = df[df.Beta.astype(float) > 0]
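Here is a minimal sketch of that cast on a small frame. The sample rows are hypothetical, and I am assuming the None entries in MktCap are the literal string "None"; for such a column pd.to_numeric with errors="coerce" is one option, since it turns unparseable values into NaN instead of raising.

```python
import pandas as pd

# Hypothetical sample; both columns hold strings in object dtype.
df = pd.DataFrame({
    "Symbol": ["FBP", "DB", "MFG"],
    "Beta": pd.Series(["0", "1.08", "6.101694915"], dtype=object),
    "MktCap": pd.Series(["0", "0", "None"], dtype=object),
})

# Convert once and store the result, so later filters use numeric data.
df["Beta"] = df["Beta"].astype(float)

# Assumption: "None" here is a string; coerce makes it NaN rather than failing.
df["MktCap"] = pd.to_numeric(df["MktCap"], errors="coerce")

df2 = df[df["Beta"] > 0]
print(df2)
```

Converting the column in place, rather than inside the filter expression, avoids repeating the cast in every later comparison.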

edited Dec 2, 2016 at 6:21

answered Dec 2, 2016 at 6:17

7 Comments

Christian Sauer Over a year ago

I would probably cast to int - comparing floats will be iffy at best, especially if some values are close to zero or are the result of computations.

jezrael Over a year ago

@ChristianSauer - and if you change the condition to >= or <=, you get incorrect output, because print(int(6.9)) returns 6.

jezrael Over a year ago

@BryanDowning - Glad I could help!

Christian Sauer Over a year ago

@jezrael Rounding problems are the reason for my suggestion. Alternatively, you could use a function which takes the acceptable difference into account, numpy.isclose if I recall correctly.

jezrael Over a year ago

@BryanDowning Sure, but I think numpy.isclose is better for comparing with float scalar values, or not? I think casting to int can return wrong output - df[df.Beta.astype(int) == 1] returns the row with Beta = 1.08. Can you explain more?
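The two points raised in these comments can be checked on a small series. This is a sketch with hypothetical values taken from the question's Beta column; numpy.isclose is used with its default tolerances.

```python
import numpy as np
import pandas as pd

# Hypothetical Beta values, already numeric.
beta = pd.Series([0.0, 1.08, 6.101694915])

# Casting to int truncates toward zero, so 1.08 becomes 1 and matches "== 1".
as_int_match = beta[beta.astype(int) == 1]

# isclose compares within a tolerance, which suits equality tests on floats.
close_to_108 = beta[np.isclose(beta, 1.08)]
close_to_one = beta[np.isclose(beta, 1.0)]

print(as_int_match.tolist())
print(close_to_108.tolist())
print(close_to_one.tolist())
```

For the original filter, Beta > 0, neither trick is needed once the column is float; the tolerance question only matters for equality tests.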

timrockx · Accepted Answer · 2021-06-15 03:04:49Z

I have a pandas df that looks like this: [image: pandas df]

My issue is that when I try to find a certain state using a filter or a conditional, the result comes back empty, even though I can see the state is there.

Ex.

state_df.loc[state_df['state'] == 'AK']

results in a df with no rows, meaning it cannot find AK.

I think the issue may be related to the dtype of the columns, but it also looks fine to me:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 44 entries, 0 to 43
Data columns (total 6 columns):
state                     44 non-null object
high_risk_per_ICU_bed     44 non-null float64
high_risk_per_hospital    44 non-null float64
icu_beds                  44 non-null float64
hospitals                 44 non-null float64
total_at_risk             44 non-null float64
dtypes: float64(5), object(1)
memory usage: 2.2+ KB

If it helps, I created the column state by using a groupby function and aggregating with sum, though I don't think that would cause this error.
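The post does not say what the cause turned out to be. One thing that can produce exactly this symptom with an object column is stray whitespace around the values, so the sketch below shows how to check for it. The frame and its padded values are hypothetical, made up to reproduce the empty result.

```python
import pandas as pd

# Hypothetical data: some state codes carry leading or trailing spaces.
state_df = pd.DataFrame({
    "state": pd.Series([" AK", "AL ", "AZ"], dtype=object),
    "hospitals": [1.0, 2.0, 3.0],
})

# repr() makes hidden whitespace visible.
print(state_df["state"].map(repr).tolist())

before = state_df.loc[state_df["state"] == "AK"]   # no rows: " AK" != "AK"

# Strip the padding, then filter again.
state_df["state"] = state_df["state"].str.strip()
after = state_df.loc[state_df["state"] == "AK"]

print(len(before), len(after))
```

If the repr output shows clean values, whitespace is not the problem and the cause lies elsewhere.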
