I have a csv file that contains large amount of data,but the data that contained in csv file is not cleaned.The example of csv data is as follows
country branch no_of_employee total_salary count_DOB count_email
x a 30 2500000 20 25
x b 20 350000 15 20
y c 30 4500000 30 30
z d 40 5500000 40 40
z e 10 1000000 10 10
z f 15 1500000 15 15
after applying the group by i am not getting the proper result .
df = data_df.groupby(['country', 'customer_branch']).count()
the result is in the form of
country branch no of employees
x 1 30
x 1 20
y 1 30
z 3 65
country x is repeating twise.This is because of source file data,in source file the country field contains "X" and "X ". That is why it display X twise .How can i ignore this problem using pandas
df['country'] = df['country'].str.strip(' ')should workgroupby