3

https://i.sstatic.net/Le696.png

I want to count the email accounts of males and females separately. The code I wrote is not working properly. Can anyone help me with this, please? Here is my code. Thank you in advance.

import csv

mailAcc = {}
femailAcc = {}

with open('1000 Records.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    for i in csv_reader:
        email = i[6]
        gender = i[5]
        doman = email.split('@')[-1]
        if doman in mailAcc:
            if gender == 'm':
                 mailAcc[doman] = mailAcc[doman] + 1
        else:
            mailAcc[doman] = 1

        if doman in femailAcc:
            if gender == 'F':
                femailAcc[doman] = femailAcc[doman] + 1
        else:
            femailAcc[doman] = 1
            
    print('Mail Email accounts: ', mailAcc)
    print('Femail Email Accounts: ', femailAcc)
6
  • Welcome to SO. Please avoid using screenshots; copy and paste some dummy data as a minimal reproducible example. On another note: I really hope that those email addresses are NOT real... Commented Apr 15, 2021 at 10:54
  • 2
    They are fake emails. Commented Apr 15, 2021 at 10:55
  • As pointed out by JoSste, please remove the screenshot and paste a sample of the input CSV file as plain text. Commented Apr 15, 2021 at 11:01
  • Do you want just to count the total of male and female accounts or do you want to count them by domain? If you just want to count males and females, there is no need to check the domain. Commented Apr 15, 2021 at 11:14
  • I want to do this by domain, e.g. how many male accounts use Gmail and how many female accounts use Gmail. Here there are 9 male accounts, but my code gives me more than 9; the same goes for the female accounts. Commented Apr 16, 2021 at 6:19

3 Answers

1

Use pandas:

import pandas as pd

df = pd.read_csv('your_csv_file.csv') # read in csv
df['domain'] = df['email'].apply(lambda x: x[x.index('@')+1:]) # column with just domain

male = {} # setup male dictionary
female = {} # setup female dictionary

# iterate on unique domains to get a count of male/female and populate in dictionaries
for domain in df['domain'].unique():   
    male[domain] = df[(df['gender']=='M') & (df['domain']==domain)].shape[0]
    female[domain] = df[(df['gender']=='F') & (df['domain']==domain)].shape[0]

1 Comment

I can't. It's not allowed.
1

This can be done in pandas. As your columns are unnamed, use header=None when reading your csv and access the columns by number:

import pandas as pd

df = pd.read_csv('1000 Records.csv', header=None)
df['mailhosts'] = df[6].str.split('@').str[-1]

gp = df.groupby(5)

#count e-mail accounts per gender:
print('Female Email Accounts:', gp.get_group('F')['mailhosts'].value_counts())
print('Male Email Accounts:', gp.get_group('M')['mailhosts'].value_counts())

2 Comments

Thank you for your help, but I can't use pandas because it's not allowed.
@imhamza3333, if you can't use pandas, why did you mark this answer as accepted?
0

Here is a solution that counts male and female accounts by domain using just standard Python modules:

import csv
from collections import Counter

males = Counter()
females = Counter()

with open('1000 Records.csv') as f:
    records = csv.reader(f)
    for record in records:
        _, domain = record[6].split('@')
        gender = record[5]
        if gender.lower() == 'm':
            males.update((domain.lower(),))
        elif gender.lower() == 'f':  # count only rows explicitly marked female
            females.update((domain.lower(),))

    print('Total male accounts:', sum(males.values()))
    print('Total male accounts by domain')
    for k, v in males.items():
        print(k, v)

    print('Total female accounts:', sum(females.values()))
    print('Total female accounts by domain')
    for k, v in females.items():
        print(k, v)
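
For comparison with the code in the question, here is a minimal sketch that keeps plain dictionaries instead of Counter. In the question's loop, gender is only checked when a domain is incremented, so the first time any domain appears it is set to 1 in both dictionaries regardless of gender; in addition, the male check compares against lowercase 'm' while the female check uses uppercase 'F', which may not match the casing in the file. The function name count_domains_by_gender, its parameters, and the sample rows are made up for illustration; the column positions 5 and 6 are taken from the question.

```python
import csv
import io


def count_domains_by_gender(rows, gender_col=5, email_col=6):
    """Return (male_counts, female_counts), each mapping domain -> count.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    male_acc = {}
    female_acc = {}
    for row in rows:
        # Skip rows that are too short or have no '@' (e.g. a header row).
        if len(row) <= max(gender_col, email_col) or '@' not in row[email_col]:
            continue
        gender = row[gender_col].strip().upper()
        domain = row[email_col].rsplit('@', 1)[-1].strip().lower()
        # Decide on the gender first, then count, so a domain is only
        # created or incremented in the dictionary for that gender.
        if gender == 'M':
            male_acc[domain] = male_acc.get(domain, 0) + 1
        elif gender == 'F':
            female_acc[domain] = female_acc.get(domain, 0) + 1
    return male_acc, female_acc


# Made-up sample rows standing in for '1000 Records.csv'.
sample = io.StringIO(
    "1,Mr.,Al,B,Cole,M,al@gmail.com\n"
    "2,Ms.,Bea,C,Dunn,F,bea@gmail.com\n"
    "3,Mr.,Cy,D,East,M,cy@yahoo.com\n"
    "4,Mr.,Di,E,Ford,M,di@GMAIL.com\n"
)
male_acc, female_acc = count_domains_by_gender(csv.reader(sample))
print('Male email accounts:', male_acc)
print('Female email accounts:', female_acc)
```

To run it against the real file, pass csv.reader(f) from an open('1000 Records.csv', newline='') block instead of the in-memory sample.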
                                              
