I've tried to find the oldest female in a CSV dataset, but I don't know how. I'm pretty new to Python and pandas, and I clearly don't know how to use an if statement here.

import pandas as pd

df = pd.read_csv("people.csv", usecols=['gender', 'age'])

I tried something like this:

print(df[df["gender"].isin(["F"])].df.age.max())

or like this

if df[df["gender"].all(["F"])] :
print(df.age.max())

even tried this

print(df.loc[df['gender'] == 'F'].max())

but that was before I found out that the oldest 'M' is the same age as the oldest 'F'.

I still can't figure out how to find the oldest female.

EDIT: I have to find the oldest female in the imported dataset, not create one. Thank you.

EDIT 2: Sorry for the bother. I just found out that the oldest M in my CSV has the same age as the oldest F in my CSV. This is embarrassing.

1 Comment

Adding a few rows from your CSV would help us answer the question better. Commented Mar 9, 2022 at 18:26

4 Answers

2

You don't actually need an if statement in this context. See below:

import pandas as pd

df = pd.DataFrame({'gender': ['M', 'F', 'F','F','M'],
      'age': [99,12,45,98,23]})

# Result
print(df[df['gender'] == 'F']['age'].max())

This should give you what you are looking for. Also, don't forget to indent the next line after an if statement.

1 Comment

I recommend against chained indexing, which you can spot by the two characters '][' appearing back to back in a pandas statement. Instead, do this: df.loc[df['gender'] == 'F', 'age'].max()
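
As a minimal sketch of that suggestion, reusing the made-up sample frame from the answer above (not the asker's actual CSV), the row filter and the column selection can go into a single .loc call:

```python
import pandas as pd

# Sample data copied from the answer above; it stands in for people.csv.
df = pd.DataFrame({'gender': ['M', 'F', 'F', 'F', 'M'],
                   'age': [99, 12, 45, 98, 23]})

# Boolean mask that is True for the female rows only.
is_female = df['gender'] == 'F'

# One .loc call: rows where the mask is True, column 'age'.
oldest_female_age = df.loc[is_female, 'age'].max()
print(oldest_female_age)
```

The name is_female is only an illustrative variable; writing the comparison inline inside .loc works the same way.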
2

You can try this: first group by gender and take the maximum of each group, then read off the age for females.

import pandas as pd
df = pd.DataFrame([['F',20],['F',30], ['M',20]], columns=['gender', 'age'])

df = df.groupby('gender').max().reset_index()
print(df[df['gender'] == 'F'].iloc[0]['age'])

The output is 30 in this example.
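
A slightly shorter variant of the same idea, sketched on the same made-up sample data: selecting the 'age' column before calling max gives a Series indexed by gender, so the female maximum can be looked up by label without reset_index.

```python
import pandas as pd

# Same illustrative data as in the answer above.
df = pd.DataFrame([['F', 20], ['F', 30], ['M', 20]], columns=['gender', 'age'])

# Series of maximum ages, indexed by the gender labels.
max_age_by_gender = df.groupby('gender')['age'].max()

# Look up the maximum for females by its label.
print(max_age_by_gender['F'])
```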

1 Comment

@helperman200: Since the input data was not provided, I created sample data with the column names specified in the question, so ideally this should work. If you are having issues with it, please provide sample data from your CSV; that would help us answer the question better.
2
import pandas as pd

df = pd.DataFrame({'gender': ['F', 'M', 'F', 'M', 'F', 'M'], 'age': [12, 33, 43, 22, 18, 16]})

oldest_female = df.loc[df['gender'] == 'F'].max()

print(oldest_female['age'])

1 Comment

Not sure I'm following you. This part selects females only: df.loc[df['gender'] == 'F'].
2

To find the row of the oldest female in the data set, you can filter your dataframe to only females, then use idxmax to find the index:

df.loc[df.query('gender == "F"')['age'].idxmax()]

This returns the first row in your dataset where gender is 'F' and the age equals the maximum female age.
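
For illustration, here is a small self-contained sketch of that approach. The sample data is hypothetical, including the 'name' column, which is only there to make the returned row easy to recognize. It deliberately has the oldest male and the oldest female share the same age, as in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for people.csv.
df = pd.DataFrame({
    'name': ['Ann', 'Bob', 'Cara', 'Dan', 'Eve'],
    'gender': ['F', 'M', 'F', 'M', 'F'],
    'age': [45, 98, 98, 23, 12],
})

# Keep only the female rows; the original index labels are preserved.
females = df.query('gender == "F"')

# idxmax gives the index label of the first maximum age among females,
# and df.loc uses that label to fetch the whole row.
oldest_female = df.loc[females['age'].idxmax()]
print(oldest_female)
```

If several females share the maximum age and you want all of them, comparing against the maximum keeps every tied row: females[females['age'] == females['age'].max()].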
