I want to check that the categories in one dataframe column match the categories in another, i.e. that there are no mismatches in spelling etc.
I now have two arrays holding the unique values of the columns of interest, and I would like to return any values that are in the first, smaller array but not in the second, larger array, so that I can narrow down the categories I may need to adjust or re-spell. I believe I should use a for loop to compare the arrays, but I am struggling with the implementation. Example code below, thanks:
borough_pm25 = pm25['Borough_x'].unique()
borough_pm25
array(['Barnet', 'Camden', 'Wandsworth', 'Hounslow', 'Southwark',
'Westminster', 'Kensington & Chelsea', 'Tower Hamlets',
'Islington', 'Kingston', 'Barking & Dagenham', 'Waltham Forest',
'Haringey', 'Lambeth', 'Enfield', 'Greenwich', 'Redbridge',
'Newham', 'City of London', 'Hackney', 'Richmond', 'Ealing',
'Hammersmith & Fulham', 'Lewisham', 'Sutton', 'Havering', 'Bexley',
'Bromley'], dtype=object)
borough_map = map_df['NAME'].unique()
borough_map
array(['Kingston upon Thames', 'Croydon', 'Bromley', 'Hounslow', 'Ealing',
'Havering', 'Hillingdon', 'Harrow', 'Brent', 'Barnet', 'Lambeth',
'Southwark', 'Lewisham', 'Greenwich', 'Bexley', 'Enfield',
'Waltham Forest', 'Redbridge', 'Sutton', 'Richmond upon Thames',
'Merton', 'Wandsworth', 'Hammersmith and Fulham',
'Kensington and Chelsea', 'Westminster', 'Camden', 'Tower Hamlets',
'Islington', 'Hackney', 'Haringey', 'Newham',
'Barking and Dagenham', 'City of London'], dtype=object)
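For reference, a for loop may not be strictly needed here. One possible approach, sketched below, is NumPy's np.setdiff1d, which returns the sorted unique values of its first argument that do not appear in its second. The arrays are retyped by hand from the output above so the snippet runs on its own; the name missing_from_map is just an illustrative choice, not something from my real code.

```python
import numpy as np

# Unique borough names from pm25['Borough_x'], copied from the output above.
borough_pm25 = np.array([
    'Barnet', 'Camden', 'Wandsworth', 'Hounslow', 'Southwark',
    'Westminster', 'Kensington & Chelsea', 'Tower Hamlets',
    'Islington', 'Kingston', 'Barking & Dagenham', 'Waltham Forest',
    'Haringey', 'Lambeth', 'Enfield', 'Greenwich', 'Redbridge',
    'Newham', 'City of London', 'Hackney', 'Richmond', 'Ealing',
    'Hammersmith & Fulham', 'Lewisham', 'Sutton', 'Havering', 'Bexley',
    'Bromley'], dtype=object)

# Unique borough names from map_df['NAME'], copied from the output above.
borough_map = np.array([
    'Kingston upon Thames', 'Croydon', 'Bromley', 'Hounslow', 'Ealing',
    'Havering', 'Hillingdon', 'Harrow', 'Brent', 'Barnet', 'Lambeth',
    'Southwark', 'Lewisham', 'Greenwich', 'Bexley', 'Enfield',
    'Waltham Forest', 'Redbridge', 'Sutton', 'Richmond upon Thames',
    'Merton', 'Wandsworth', 'Hammersmith and Fulham',
    'Kensington and Chelsea', 'Westminster', 'Camden', 'Tower Hamlets',
    'Islington', 'Hackney', 'Haringey', 'Newham',
    'Barking and Dagenham', 'City of London'], dtype=object)

# Values present in the first (smaller) array but absent from the second.
# np.setdiff1d returns them sorted and de-duplicated.
missing_from_map = np.setdiff1d(borough_pm25, borough_map)
print(missing_from_map)
# ['Barking & Dagenham' 'Hammersmith & Fulham' 'Kensington & Chelsea'
#  'Kingston' 'Richmond']
```

The comparison is an exact string match, so names that differ only in '&' versus 'and', or in a shortened form such as 'Kingston', show up as missing. Swapping the two arguments would list the map names with no counterpart in the pm25 data instead.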