I have some NumPy arrays (or, equivalently, pandas DataFrames, since the arrays are easily converted) that I wish to compare. These arrays/DataFrames contain both numbers and strings.
For purely numeric arrays, I can do the following.
import numpy as np
a = np.array([[1.0, 2.0], [1.00001, 2.00001]])
b = np.array([[1.000001, 2.00001], [1.00001, 2.00001]])
print(np.allclose(a, b, 1e-9))
# output: False
print(np.allclose(a, b, 1e-4))
# output: True
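For reference, the third positional argument of np.allclose is rtol, and the documented element-wise test is abs(a - b) <= atol + rtol * abs(b), with atol defaulting to 1e-8. A minimal sketch that reproduces the two results above by hand (manual_allclose is a hypothetical helper name, not part of NumPy):

```python
import numpy as np

a = np.array([[1.0, 2.0], [1.00001, 2.00001]])
b = np.array([[1.000001, 2.00001], [1.00001, 2.00001]])

def manual_allclose(x, y, rtol, atol=1e-8):
    # Hypothetical helper: the element-wise test documented for np.isclose,
    # |x - y| <= atol + rtol * |y|, reduced over all elements.
    return bool(np.all(np.abs(x - y) <= atol + rtol * np.abs(y)))

# Each line prints the hand-written result next to NumPy's own result.
print(manual_allclose(a, b, 1e-9), np.allclose(a, b, rtol=1e-9))
print(manual_allclose(a, b, 1e-4), np.allclose(a, b, rtol=1e-4))
```

The test relies on subtraction, which is why it only makes sense for numeric dtypes.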
With mixed arrays such as the following, I get errors.
c = np.array([[1.0, "Cat"], [1.00001, 2.00001]])
d = np.array([[1.000001, "Dog"], [1.00001, 2.00001]])
e = np.array([[1.000001, "Cat"], [1.00001, 2.00001]])
print(np.allclose(c, d, 1e-4))
# expected output: False on account of the string difference
print(np.allclose(c, e, 1e-4))
# expected output: True
I tried converting them into pandas DataFrames, hoping that the built-in testing module might do the trick.
import pandas as pd
from pandas.util import testing as pdtest
df_c = pd.DataFrame(c)
df_d = pd.DataFrame(d)
df_e = pd.DataFrame(e)
print(pdtest.assert_almost_equal(df_c, df_e, check_exact=False, check_less_precise=4))
# expected output: True as the strings match and numbers agree within tolerance.
But this doesn't work. Is there a way to compare arrays where numerical elements are compared with a specified tolerance while string elements are compared exactly?
EDIT: The tolerance is purely for float elements. For strings, exact match is required.
Look at c or d (i.e. print). Note the dtype. We should probably close this because you failed to describe the allclose errors, and/or attempt any follow-up. But the basic issue is that allclose, which uses isclose, is designed for use with numeric arrays, not string arrays. It tests whether abs(x-y) is small enough. That doesn't apply to arrays that have a string dtype.
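To make the dtype point concrete, here is a minimal sketch of one possible workaround, not a built-in NumPy or pandas feature. The helpers _as_float and mixed_allclose are hypothetical names introduced for illustration. The sketch assumes that any element that parses with float() should be treated as a number (so strings such as "nan" or "1e3" would count as numeric) and that everything else must match exactly.

```python
import numpy as np

c = np.array([[1.0, "Cat"], [1.00001, 2.00001]])
d = np.array([[1.000001, "Dog"], [1.00001, 2.00001]])
e = np.array([[1.000001, "Cat"], [1.00001, 2.00001]])

# Mixing floats and strings yields a Unicode string dtype:
# every element, including the numbers, is stored as text.
print(c.dtype.kind)  # 'U'

def _as_float(value):
    # Hypothetical helper: return a float if the value parses, else None.
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

def mixed_allclose(x, y, rtol=1e-5, atol=1e-8):
    # Hypothetical helper: tolerance for numeric elements, exact match otherwise.
    x = np.asarray(x, dtype=object)
    y = np.asarray(y, dtype=object)
    if x.shape != y.shape:
        return False
    for xv, yv in zip(x.ravel(), y.ravel()):
        xf, yf = _as_float(xv), _as_float(yv)
        if xf is not None and yf is not None:
            # Both elements are numeric: compare with tolerance.
            if not np.isclose(xf, yf, rtol=rtol, atol=atol):
                return False
        elif xv != yv:
            # At least one element is non-numeric: require exact equality.
            return False
    return True

print(mixed_allclose(c, d, rtol=1e-4))  # False: "Cat" != "Dog"
print(mixed_allclose(c, e, rtol=1e-4))  # True
```

The loop is element by element, so it will be slow on large arrays. If whole columns are known to be numeric or string, comparing column by column with np.allclose and == would likely be faster.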