Check for equality between two numpy arrays

Question

I wish to check that the categories in one dataframe column match the categories in another, i.e. that there are no mismatches in spelling etc.

I now have two arrays holding the unique values of the columns of interest. I would like to return any values that are in the first, smaller array but not in the second, larger array, so that I can narrow down the categories I may need to adjust or re-spell. I believe I should use a for loop to evaluate each array, but I am struggling with the implementation. Example code below, thanks:

borough_pm25 = pm25['Borough_x'].unique()
borough_pm25
array(['Barnet', 'Camden', 'Wandsworth', 'Hounslow', 'Southwark',
       'Westminster', 'Kensington & Chelsea', 'Tower Hamlets',
       'Islington', 'Kingston', 'Barking & Dagenham', 'Waltham Forest',
       'Haringey', 'Lambeth', 'Enfield', 'Greenwich', 'Redbridge',
       'Newham', 'City of London', 'Hackney', 'Richmond', 'Ealing',
       'Hammersmith & Fulham', 'Lewisham', 'Sutton', 'Havering', 'Bexley',
       'Bromley'], dtype=object)

borough_map = map_df['NAME'].unique()
borough_map
array(['Kingston upon Thames', 'Croydon', 'Bromley', 'Hounslow', 'Ealing',
       'Havering', 'Hillingdon', 'Harrow', 'Brent', 'Barnet', 'Lambeth',
       'Southwark', 'Lewisham', 'Greenwich', 'Bexley', 'Enfield',
       'Waltham Forest', 'Redbridge', 'Sutton', 'Richmond upon Thames',
       'Merton', 'Wandsworth', 'Hammersmith and Fulham',
       'Kensington and Chelsea', 'Westminster', 'Camden', 'Tower Hamlets',
       'Islington', 'Hackney', 'Haringey', 'Newham',
       'Barking and Dagenham', 'City of London'], dtype=object)

Thanks Mihai, yes this works in the sense that it returns False, i.e. there is a mismatch, but I need to return the actual values that do not match.
– ojp, Commented Feb 8, 2020 at 18:00

Ch3steR · Accepted Answer · 2020-02-08 18:02:36Z

You can use set operations: convert each array to a set and take the difference in either direction.

import numpy as np
a=np.array(['Barnet', 'Camden', 'Wandsworth', 'Hounslow', 'Southwark',
       'Westminster', 'Kensington & Chelsea', 'Tower Hamlets',
       'Islington', 'Kingston', 'Barking & Dagenham', 'Waltham Forest',
       'Haringey', 'Lambeth', 'Enfield', 'Greenwich', 'Redbridge',
       'Newham', 'City of London', 'Hackney', 'Richmond', 'Ealing',
       'Hammersmith & Fulham', 'Lewisham', 'Sutton', 'Havering', 'Bexley',
       'Bromley'])
b=np.array(['Kingston upon Thames', 'Croydon', 'Bromley', 'Hounslow', 'Ealing',
       'Havering', 'Hillingdon', 'Harrow', 'Brent', 'Barnet', 'Lambeth',
       'Southwark', 'Lewisham', 'Greenwich', 'Bexley', 'Enfield',
       'Waltham Forest', 'Redbridge', 'Sutton', 'Richmond upon Thames',
       'Merton', 'Wandsworth', 'Hammersmith and Fulham',
       'Kensington and Chelsea', 'Westminster', 'Camden', 'Tower Hamlets',
       'Islington', 'Hackney', 'Haringey', 'Newham',
       'Barking and Dagenham', 'City of London'])

print(set(a) - set(b))  # elements present in a but not in b
print(set(b) - set(a))  # elements present in b but not in a
print(set(a) - set(b) | set(b) - set(a))  # elements in exactly one of the two, same as set(a) ^ set(b)

{'Barking & Dagenham',
 'Hammersmith & Fulham',
 'Kensington & Chelsea',
 'Kingston',
 'Richmond'}  #set(a)-set(b)

{'Barking and Dagenham',
 'Brent',
 'Croydon',
 'Hammersmith and Fulham',
 'Harrow',
 'Hillingdon',
 'Kensington and Chelsea',
 'Kingston upon Thames',
 'Merton',
 'Richmond upon Thames'}  #set(b)-set(a)

{'Barking & Dagenham',
 'Barking and Dagenham',
 'Brent',
 'Croydon',
 'Hammersmith & Fulham',
 'Hammersmith and Fulham',
 'Harrow',
 'Hillingdon',
 'Kensington & Chelsea',
 'Kensington and Chelsea',
 'Kingston',
 'Kingston upon Thames',
 'Merton',
 'Richmond',
 'Richmond upon Thames'}
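
If you would rather stay in NumPy than convert to Python sets, the same comparison can be sketched with `np.setdiff1d` and `np.setxor1d`, which return sorted arrays of unique values. The two short arrays below are hypothetical stand-ins for `borough_pm25` and `borough_map`, trimmed for readability, and this is only one possible way to do it:

```python
import numpy as np

# Shortened, hypothetical stand-ins for the two arrays of unique borough names.
borough_pm25 = np.array(['Barnet', 'Kingston', 'Richmond', 'Camden'])
borough_map = np.array(['Barnet', 'Camden', 'Kingston upon Thames',
                        'Richmond upon Thames', 'Croydon'])

# Values in the first array that are missing from the second (sorted, unique).
only_in_pm25 = np.setdiff1d(borough_pm25, borough_map)

# Values in the second array that are missing from the first.
only_in_map = np.setdiff1d(borough_map, borough_pm25)

# Values that appear in exactly one of the two arrays (symmetric difference).
in_one_only = np.setxor1d(borough_pm25, borough_map)

print(only_in_pm25)
print(only_in_map)
print(in_one_only)
```

Here `only_in_pm25` plays the role of `set(a)-set(b)` above: it lists the spellings in the smaller array that have no exact match in the larger one.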
