I'm sure there is a simple solution to this problem but I cannot seem to find it.

I am trying to check if an age from a list is in the age column of my dataframe. However, it is only comparing to the index and not the column.

Here is a simplified piece of code from my program:

def findages(data,ages):
    for age in ages:
        if age in data['age']:
            print('yes')
        else:
            print('no')

I have also tried this:

def findages(data,ages):
    for age in ages:
        if age in data.loc[data['age']]:
            print('yes')
        else:
            print('no')

The dataframe looks like this:

                 age     x     Lambda             L
0       1.258930e+05  0.01       91.0  5.349000e+25
1       1.258930e+05  0.01       94.0  1.188800e+26
2       1.258930e+05  0.01       96.0  1.962700e+26
3       1.258930e+05  0.01       98.0  3.169400e+26   
4       1.258930e+05  0.01      100.0  5.010800e+26

and the list like this:

ages = ([125893.0, 4e7,5e9])

What am I doing wrong?

  • Can you provide an example input? Commented Jul 25, 2016 at 11:21
  • just updated the question Commented Jul 25, 2016 at 11:27

2 Answers

DataFrame column access returns a Series

In your code, data['age'] returns a Series holding the age column. When the in operator is applied to a Series, it checks the index labels, not the values. To test against the values instead, use the .values attribute, which gives an array of the Series values.

For example:

import pandas as pd

df = pd.DataFrame({'age':[33, 34], 'pet':['Dog', 'Cat']}, index=['Bob', 'Mary'])

ages = [5, 33, 67]

def findages(data, ages):
    for age in ages:
        if age in data['age'].values:
            print('yes')
        else:
            print('no')

findages(df, ages)

no
yes
no
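
To make the index-versus-values behaviour explicit, here is a minimal sketch that reuses the same two-row frame as a standalone Series. The variable names are only illustrative; isin(...).any() is shown as one possible alternative to .values, not as the only way to do it.

```python
import pandas as pd

# Same data as the example above, as a standalone Series
s = pd.Series([33, 34], index=['Bob', 'Mary'], name='age')

label_hit = 'Bob' in s            # True: 'Bob' is an index label
value_as_label = 33 in s          # False: 33 is a value, not an index label
value_hit = 33 in s.values        # True: membership test on the underlying array
value_hit_isin = bool(s.isin([33]).any())  # True: same check using pandas only

print(label_hit, value_as_label, value_hit, value_hit_isin)
```

With the default integer index (0, 1, 2, ...), as in the question, `age in data['age']` is therefore only true when the age happens to equal a row number.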
1 Comment

No worries, it takes some experience to get used to the interfaces and capabilities of the library. Keep at it, it is a very useful tool!

Use numpy.where with isin:

np.where(data['age'].isin(ages),'yes','no')

Sample:

import pandas as pd
import numpy as np

data = pd.DataFrame({'age':[10,20,30]})
ages = [10,30]
print (data)
   age
0   10
1   20
2   30

data['new'] = np.where(data['age'].isin(ages),'yes','no')
print (data)
   age  new
0   10  yes
1   20   no
2   30  yes

EDIT, using the sample data from the question:

print (data)
        age     x  Lambda             L
0  125893.0  0.01    91.0  5.349000e+25
1  125893.0  0.01    94.0  1.188800e+26
2  125893.0  0.01    96.0  1.962700e+26
3  125893.0  0.01    98.0  3.169400e+26
4  125893.0  0.01   100.0  5.010800e+26

ages = ([125893.0, 4e7,5e9])
print (np.where(data['age'].isin(ages),'yes','no'))
['yes' 'yes' 'yes' 'yes' 'yes']
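
One caveat worth checking: isin tests for exact equality, and the ages in the question are floats displayed in rounded scientific notation, so the stored values might not be exactly 125893.0. If that turns out to be the case, a tolerance-based comparison with numpy.isclose is one option. The sketch below uses made-up values and an assumed relative tolerance of 1e-5; both are illustrative and should be adapted to the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first age is slightly off from 125893.0
data = pd.DataFrame({'age': [125893.04, 4.0e7, 1.0e3]})
ages = [125893.0, 4e7, 5e9]

# Exact matching: the slightly-off first row is not matched
exact = data['age'].isin(ages)

# Tolerant matching: compare every column value with every requested age
col = data['age'].to_numpy()
close = np.isclose(col[:, None], np.asarray(ages)[None, :], rtol=1e-5)

# For each requested age: is there at least one close value in the column?
found = close.any(axis=0)

for age, hit in zip(ages, found):
    print(age, 'yes' if hit else 'no')
```

Here `found` is indexed by the entries of ages, while `exact` is indexed by the rows of the dataframe, so pick whichever direction the rest of the program needs.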
