I wrote a lambda function that I expected to be fast, but it is taking a very long time to run. Is there a better way to write this?
fn = lambda x: shape(df[df.CustomerCard_Num == x.CustomerCard_Num])[0]
df['tottrans'] = df.apply(fn, axis = 1)
Basically, I have a big database of transactions (rows). Different sets of rows correspond to different customers: the customer card number is a column in df, and multiple rows might have the same df.CustomerCard_Num.
I am trying to count the number of rows for each customer with this lambda function. But it does not seem to work quickly. Should I be using groupby?
df.CustomerCard_Num.value_counts()
Why lambda instead of def (it's not anonymous, it's not being used in the middle of an expression, it's not transient…)? And, given that you tagged the question as lambda, it seems like you think it might even be relevant to your problem that you used lambda here. (It's not, but if you think it might be, why not write it the more idiomatic way and see?)
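For what it's worth, here is a minimal sketch of the two vectorized options mentioned here (groupby and value_counts). The apply version re-filters the whole frame once per row, so its cost grows roughly with the square of the row count, while these count each customer once. The sample data and the Amount and tottrans_vc columns are made up for illustration; only CustomerCard_Num and tottrans come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name CustomerCard_Num is from the question.
df = pd.DataFrame({
    "CustomerCard_Num": [101, 102, 101, 103, 101, 102],
    "Amount": [5.0, 7.5, 3.2, 9.9, 1.0, 4.4],
})

# Option 1: groupby + transform broadcasts each group's row count back onto its rows.
df["tottrans"] = df.groupby("CustomerCard_Num")["CustomerCard_Num"].transform("size")

# Option 2: count each card number once, then look up every row's count with map.
counts = df["CustomerCard_Num"].value_counts()
df["tottrans_vc"] = df["CustomerCard_Num"].map(counts)

print(df)
```

Both columns should hold the same per-customer row counts; which one is faster on a given dataset is something to measure rather than assume.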