0

I have a column of years in a pandas DataFrame. I want to filter them with a lambda function, which I am trying to pass to the count method. Using a lambda is the most convenient approach for me, so I would prefer a solution that involves one.

print df['year_built'][:5]
print df['year_built'].count(lambda x: len(x) == 4)
0    1981
1    1980
2    1935
3    2007
4    1994
Name: year_built, dtype: object

AttributeError: 'RangeIndex' object has no attribute 'levels'

What is the best way to do this, both with a lambda and without one?

2
  • What exactly are you trying to do? count doesn't even take a callable object. Are you trying to count the number of elements of length 4? Commented Apr 13, 2019 at 9:24
  • @gmds yes, I am Commented Apr 13, 2019 at 9:25

3 Answers

1

Why not just use a list comprehension?

[x for x in df['year_built'] if len(x) == 4]
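Since the question asks for a lambda, here is a minimal sketch of one way to do it, assuming the column holds strings. Series.count only counts non-missing values and does not take a predicate, so the lambda goes to Series.apply instead. The sample data, including the short values "67" and "89", is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: the short values "67" and "89" are added
# so that the filter has something to exclude.
df = pd.DataFrame(
    {"year_built": ["1981", "1980", "1935", "2007", "1994", "67", "89"]}
)

# apply() calls the lambda once per element and returns a boolean Series.
is_four_chars = df["year_built"].apply(lambda x: len(x) == 4)

# Boolean indexing keeps only the rows where the mask is True.
four_digit_years = df["year_built"][is_four_chars]

# True counts as 1, so summing the mask gives the number of matches.
n_four_digit = int(is_four_chars.sum())

print(four_digit_years.tolist())
print(n_four_digit)
```

Note that the lambda calls len on every element, so it would raise a TypeError if the column contained integers or missing values.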

2 Comments

This looks like the most elegant way! Could you let me know why the lambda wasn't working?
Iterating over large datasets by hand is forbidden in data science.
0

The right way to do this, I think, is the following (assuming year_built is a column of object dtype that contains strings):

df.loc[df['year_built'].str.len() == 4, 'year_built']

If it contains numbers instead:

df.loc[(1000 <= df['year_built']) & (df['year_built'] <= 9999), 'year_built']
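For a numeric column, the range check can also be written with Series.between, which includes both endpoints by default, so every four-digit value from 1000 through 9999 is kept. This is only a sketch, and the integer sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data with integer years; 67 and 89 should be dropped.
df = pd.DataFrame({"year_built": [1981, 1980, 1935, 2007, 1994, 67, 89]})

# between() is inclusive on both ends by default: 1000 <= value <= 9999.
mask = df["year_built"].between(1000, 9999)

# Select the matching rows of the year_built column.
four_digit_years = df.loc[mask, "year_built"]

print(four_digit_years.tolist())
```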

0

Assuming year_build is a column of strings, the following gives the count of rows where the length is 4:

  year_build
0       1981
1       1980
2       1935
3       2007
4       1994
5         67
6         89

In [149]: (df['year_build'].str.len() == 4).sum()
Out[149]: 5
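To see how many rows there are of each length, rather than only those of length 4, one option is to call value_counts on the lengths. A minimal sketch, rebuilding the sample frame shown above as strings:

```python
import pandas as pd

# The sample data from this answer, stored as strings.
df = pd.DataFrame(
    {"year_build": ["1981", "1980", "1935", "2007", "1994", "67", "89"]}
)

# str.len() gives the length of each string; value_counts() tallies them.
length_counts = df["year_build"].str.len().value_counts()

# Comparing to 4 and summing reproduces the count from the answer.
n_four = int((df["year_build"].str.len() == 4).sum())

print(length_counts.to_dict())
print(n_four)
```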
