1

Could you please help me understand what's wrong with the script below and how to correct it? I am trying to add a column by iterating over the rows of the file. The new column should say 'F' if the percentage of females is higher than the percentage of males. Thank you very much!

babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';')

gender=[]
for idx in range(len(babies_df)):
    if babies_df['perc_female'>'perc_male']:
        gender.append('F')
    else:
        gender.append('M')

babies_df['gender'] = gender
  • I think you should use babies_df['perc_female'] > babies_df['perc_male'] Commented May 26, 2020 at 11:19
  • thanks PSKP, tried that already but it does not work :( Commented May 26, 2020 at 11:23
  • In the if statement you are comparing inside the [ ], which is wrong. You should index each column completely first. If the error still persists, please add the error message as well. Commented May 26, 2020 at 11:25
  • Sure, thanks for helping, this is the code: babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';') gender=[] for idx in range(len(babies_df)): if babies_df['perc_female’]> babies_df[’perc_male']: gender.append('F') else: gender.append('M') babies_df['gender'] = gender Commented May 26, 2020 at 11:33
  • and this is the error: ValueError Traceback (most recent call last) <ipython-input-195-cbb09ab25ce5> in <module>() 4 gender=[] 5 for idx in range(len(babies_df)): ----> 6 if babies_df['perc_female']>babies_df['perc_male']: 7 idx.append('F') 8 else: Commented May 26, 2020 at 11:35

2 Answers

1

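The problem is that babies_df['perc_female'>'perc_male'] does not compare the two columns. It is valid syntax, but Python evaluates the string comparison 'perc_female' > 'perc_male' first, which gives False, and then pandas looks for a column labelled False.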
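To make that concrete, here is a minimal sketch of how Python evaluates the expression inside the brackets. The small DataFrame is made-up sample data standing in for the CSV, which isn't available here, and `key` and `error_name` are names introduced only for this illustration.

```python
import pandas as pd

# Made-up sample data (an assumption; the real CSV is not shown in the question).
babies_df = pd.DataFrame({
    "perc_female": [0.9, 0.2, 0.5],
    "perc_male": [0.1, 0.8, 0.5],
})

# The comparison is between two plain strings and runs before pandas is involved.
key = 'perc_female' > 'perc_male'
print(key)  # False, because 'f' sorts before 'm'

# So the original line amounts to babies_df[False]: a lookup of a column
# labelled False, which this frame does not have (normally a KeyError).
error_name = None
try:
    babies_df[key]
except Exception as exc:
    error_name = type(exc).__name__
print(error_name)
```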

You could try pandas apply for your solution.


import pandas as pd

babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';')

# axis=1 passes each row to the lambda, so x['perc_female'] is a single value
babies_df['gender'] = babies_df.apply(
    lambda x: 'F' if x['perc_female'] > x['perc_male'] else 'M',
    axis=1
)
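If the file is large, a vectorized version may be worth trying instead of apply. The following is only a sketch using numpy.where, which was not part of the original answer; the DataFrame is made-up sample data in place of the CSV.

```python
import numpy as np
import pandas as pd

# Made-up sample data (an assumption; replace with the real read_csv call).
babies_df = pd.DataFrame({
    "perc_female": [0.9, 0.2, 0.5],
    "perc_male": [0.1, 0.8, 0.5],
})

# Compare the two columns element-wise, then pick 'F' or 'M' for each row.
# Ties fall through to 'M', matching the else branch in the question.
babies_df["gender"] = np.where(
    babies_df["perc_female"] > babies_df["perc_male"], "F", "M"
)
print(babies_df)
```

This avoids a Python-level call per row, since the comparison and the selection both run over whole columns at once.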

1

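The problem with your code is that you are not iterating row by row: the loop never uses idx, so every pass compares the whole columns. Comparing two columns directly gives a boolean Series, and an if statement cannot reduce that to a single True or False, which is what raises the ValueError.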

import pandas as pd

babies_df = pd.read_csv('datasets/babynames_nysiis.csv', delimiter=';')

# Collect one label per row
gender = []
for index, row in babies_df.iterrows():
    if row["perc_female"] > row["perc_male"]:
        gender.append("F")
    else:
        gender.append("M")

babies_df["gender"] = gender
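If you would rather keep the original range(len(...)) loop, one possible variant indexes each column by position with .iloc. The sketch below first reproduces the ValueError on made-up sample data (the real CSV isn't available here), then shows the positional version; `mask` and `raised` are names introduced only for this illustration.

```python
import pandas as pd

# Made-up sample data (an assumption; replace with the real read_csv call).
babies_df = pd.DataFrame({
    "perc_female": [0.9, 0.2, 0.5],
    "perc_male": [0.1, 0.8, 0.5],
})

# Comparing whole columns yields a boolean Series, not one True/False,
# so using it directly as an if condition raises ValueError.
mask = babies_df["perc_female"] > babies_df["perc_male"]
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True
print(raised)

# Positional indexing with .iloc compares one row at a time instead.
gender = []
for idx in range(len(babies_df)):
    if babies_df["perc_female"].iloc[idx] > babies_df["perc_male"].iloc[idx]:
        gender.append("F")
    else:
        gender.append("M")

babies_df["gender"] = gender
print(babies_df)
```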
