5

I have a Pandas dataframe called df with the following content:

    Symbol                     Cat         Beta Eps MktCap
2350    FBP  Foreign Regional Banks            0   0      0
2351   FNBC  Foreign Regional Banks            0   0      0
2353   BSBR  Foreign Regional Banks            0   0      0
2354    BBD  Foreign Regional Banks            0   0      0
2355    HDB  Foreign Regional Banks            0   0      0
2356    BCH  Foreign Regional Banks            0   0      0
2358     WF  Foreign Regional Banks            0   0   None
2359   SMFG  Foreign Regional Banks            0   0   None
2360    BFR  Foreign Regional Banks            0   0      0
2361    BCA  Foreign Regional Banks            0   0      0
2362   BPOP  Foreign Regional Banks            0   0   None
2363    CIB  Foreign Regional Banks            0   0      0
2364   ITUB  Foreign Regional Banks            0   0      0
2365    BMA  Foreign Regional Banks            0   0      0
2366     KB  Foreign Regional Banks            0   0   None
2367   BBDO  Foreign Regional Banks            0   0      0
2368   BSMX  Foreign Regional Banks            0   0   None
2369   BBVA  Foreign Regional Banks            0   0   None
2370    SHG  Foreign Regional Banks            0   0      0
2352     DB  Foreign Regional Banks         1.08   0      0
2357    MFG  Foreign Regional Banks  6.101694915   0   None

I use the following Python code:

df2 = df[df.Beta > 0]

The resulting df2 still contains the rows where Beta is 0, so it is identical to df. How do I fix this? Thanks

1
  • It does work for me. What's the data type of Beta? When I read your dataframe using read_clipboard() the column gets assigned float format and it just works. Commented Dec 2, 2016 at 6:21
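A minimal sketch of how the data type of Beta could be checked. The three-row frame is a hypothetical sample, and Beta is deliberately built from strings to mimic a column that was not parsed as numeric:

```python
import pandas as pd

# Hypothetical sample; Beta is stored as strings in an object-dtype column (assumption)
df = pd.DataFrame({
    "Symbol": ["FBP", "DB", "MFG"],
    "Beta": pd.Series(["0", "1.08", "6.101694915"], dtype=object),
})

beta_dtype = df["Beta"].dtype                      # dtype of the column as pandas sees it
value_types = df["Beta"].map(type).value_counts()  # Python type of each individual value

print(beta_dtype)
print(value_types)
```

If the dtype is object and the individual values are str, a comparison such as df.Beta > 0 is not a numeric comparison.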

2 Answers

3

You can cast the column to float with astype. It seems the dtype of column Beta is object, which means the values are most likely strings:

df2 = df[df.Beta.astype(float) > 0]
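As a slightly fuller sketch of the same idea, the column can be converted once and stored, so that every later comparison is numeric. The sample frame and the names messy and beta_clean are hypothetical. pd.to_numeric with errors="coerce" is shown as an alternative for input with unparseable entries, where astype(float) would raise:

```python
import pandas as pd

# Hypothetical sample: Beta holds strings, as an object-dtype column would
df = pd.DataFrame({
    "Symbol": ["FBP", "DB", "MFG"],
    "Beta": pd.Series(["0", "1.08", "6.101694915"], dtype=object),
})

# Convert once and store the result, so later comparisons are numeric
df["Beta"] = df["Beta"].astype(float)
df2 = df[df["Beta"] > 0]
print(df2["Symbol"].tolist())  # ['DB', 'MFG']

# Alternative for messy input: unparseable strings become NaN instead of raising
messy = pd.Series(["0", "1.08", "n/a"], dtype=object)
beta_clean = pd.to_numeric(messy, errors="coerce")
print(beta_clean[beta_clean > 0].tolist())  # [1.08]
```

Because NaN > 0 evaluates to False, the coerced rows drop out of the filter rather than causing an error.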

7 Comments

I would probably cast to int. Comparing floats will be iffy at best, especially if some values are close to zero or are the result of computations.
@ChristianSauer - and if you change the condition to >= or <=, you get incorrect output, because int(6.9) returns 6.
@BryanDowning - Glad I could help!
@jezrael Rounding problems are the reason for my suggestion. Alternatively, you could use a function that takes the acceptable difference into account, numpy.isclose if I recall correctly.
@BryanDowning Sure, but I think numpy.isclose is better suited to comparing against float scalar values, isn't it? I think casting to int can return wrong output: df[df.Beta.astype(int) == 1] returns the row with Beta = 1.08. Can you explain more?
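The disagreement in these comments can be illustrated with a small sketch. The Series values and the tolerance atol=1e-9 are assumptions chosen for the example, not values from the question:

```python
import numpy as np
import pandas as pd

beta = pd.Series([0.0, 1.0, 1.08, 6.101694915])  # hypothetical values

# Truncating to int makes 1.08 look equal to 1
int_match = beta[beta.astype(int) == 1].tolist()

# Tolerance-based comparison keeps only values within a small distance of 1.0
close_match = beta[np.isclose(beta, 1.0, atol=1e-9)].tolist()

print(int_match)    # [1.0, 1.08]
print(close_match)  # [1.0]
```

For the original filter Beta > 0, a plain float comparison already behaves as expected. A tolerance matters mainly for equality tests against computed values.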
0

I have a pandas df that looks like this: pandas df.

My issue is that when I try to find a certain state using a filter or conditionals, I get that the state doesn't exist, even though I can see it does.

Ex.

state_df.loc[state_df['state'] == 'AK']

results in a df with no rows, meaning it cannot find AK.

I think the issue may be related to the dtype of the columns, but it also looks fine to me:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 44 entries, 0 to 43
Data columns (total 6 columns):
state                     44 non-null object
high_risk_per_ICU_bed     44 non-null float64
high_risk_per_hospital    44 non-null float64
icu_beds                  44 non-null float64
hospitals                 44 non-null float64
total_at_risk             44 non-null float64
dtypes: float64(5), object(1)
memory usage: 2.2+ KB

If it helps, I created the column state by using a groupby function and aggregating with sum, though I don't think that would cause this error.
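One possible cause, offered only as an assumption since the actual data is not shown, is stray whitespace in the state labels. Such whitespace is invisible in printed output but breaks an exact comparison. A minimal sketch for checking this, using a hypothetical stand-in for state_df:

```python
import pandas as pd

# Hypothetical frame: one label carries a trailing space (assumption, not the poster's data)
state_df = pd.DataFrame({
    "state": pd.Series(["AK ", "AL", "AR"], dtype=object),
    "hospitals": [10.0, 20.0, 30.0],
})

# Exact comparison finds nothing because "AK " != "AK"
exact = state_df.loc[state_df["state"] == "AK"]

# repr() exposes hidden characters such as trailing spaces
raw_labels = [repr(s) for s in state_df["state"]]

# Comparing against stripped labels finds the row
stripped = state_df.loc[state_df["state"].str.strip() == "AK"]

print(len(exact), raw_labels, len(stripped))
```

If the repr output shows clean labels, the cause lies elsewhere and this check rules whitespace out.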
