I may be giving this more attention than it's worth, but I am curious about how the assert works.
(From development work on another standard-library package, argparse, I know that documentation is never exact. There are nuances to the code that can't be captured in the documentation. Documentation has to balance usefulness for most users against the expectations of pickier users.)
Here's the full code for this method:
Signature: M.assertSequenceEqual(seq1, seq2, msg=None, seq_type=None)
Source:
    def assertSequenceEqual(self, seq1, seq2, msg=None, seq_type=None):
        """An equality assertion for ordered sequences (like lists and tuples).

        For the purposes of this function, a valid ordered sequence type is one
        which can be indexed, has a length, and has an equality operator.

        Args:
            seq1: The first sequence to compare.
            seq2: The second sequence to compare.
            seq_type: The expected datatype of the sequences, or None if no
                    datatype should be enforced.
            msg: Optional message to use on failure instead of a list of
                    differences.
        """
Note the docstring's definition: a valid ordered sequence type is one which can be indexed, has a length, and has an equality operator. ndarray has an equality operator, but numpy doesn't allow the multi-element boolean array it returns to be used in an if statement.
        if seq_type is not None:
            seq_type_name = seq_type.__name__
            if not isinstance(seq1, seq_type):
                raise self.failureException('First sequence is not a %s: %s'
                                        % (seq_type_name, safe_repr(seq1)))
            if not isinstance(seq2, seq_type):
                raise self.failureException('Second sequence is not a %s: %s'
                                        % (seq_type_name, safe_repr(seq2)))
        else:
            seq_type_name = "sequence"

        differing = None
        try:
            len1 = len(seq1)
        except (TypeError, NotImplementedError):
            differing = 'First %s has no length. Non-sequence?' % (
                    seq_type_name)

        if differing is None:
            try:
                len2 = len(seq2)
            except (TypeError, NotImplementedError):
                differing = 'Second %s has no length. Non-sequence?' % (
                        seq_type_name)

        if differing is None:
            if seq1 == seq2:
                return
This is the step where numpy raises its error. For built-in sequence types like list and str this is a perfectly good expression. I suppose they could have wrapped it in a try/except ValueError. But is it up to core Python to anticipate all the ways an external library might fail, or should the library itself take care to provide its own useful tests?
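To see that failure without numpy in the picture, here is a minimal sketch. ElementwiseSeq and ElementwiseResult are made-up classes for illustration (they are not part of numpy or unittest); they only mimic the relevant ndarray behavior: indexable, has a length, but == compares element by element and the result refuses to collapse to a single bool.

```python
import unittest


class ElementwiseResult(list):
    """Result of an element-wise comparison (hypothetical helper)."""

    def __bool__(self):
        # Mimics the ambiguity error raised for multi-element boolean arrays.
        raise ValueError("truth value of an element-wise comparison is ambiguous")


class ElementwiseSeq:
    """Hypothetical ndarray-like sequence: indexable, sized, element-wise ==."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __eq__(self, other):
        # Return one comparison result per element instead of a single bool.
        return ElementwiseResult(a == b for a, b in zip(self, other))


case = unittest.TestCase()
try:
    case.assertSequenceEqual(ElementwiseSeq([1, 2, 3]), ElementwiseSeq([1, 2, 3]))
except ValueError as exc:
    # The error escapes from `if seq1 == seq2:`; it is not an AssertionError.
    outcome = "ValueError: %s" % exc
else:
    outcome = "passed"
print(outcome)
```

The two objects hold identical items, yet the assertion never gets as far as comparing them element by element; the exception is raised while evaluating the `if`.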
The rest of the code tries to identify how the sequences differ, whether in length, or element values.
            differing = '%ss differ: %s != %s\n' % (
                    (seq_type_name.capitalize(),) +
                    _common_shorten_repr(seq1, seq2))

            for i in range(min(len1, len2)):
                try:
                    item1 = seq1[i]
                except (TypeError, IndexError, NotImplementedError):
                    differing += ('\nUnable to index element %d of first %s\n' %
                                 (i, seq_type_name))
                    break

                try:
                    item2 = seq2[i]
                except (TypeError, IndexError, NotImplementedError):
                    differing += ('\nUnable to index element %d of second %s\n' %
                                 (i, seq_type_name))
                    break

                if item1 != item2:
                    differing += ('\nFirst differing element %d:\n%s\n%s\n' %
                                 ((i,) + _common_shorten_repr(item1, item2)))
                    break
            else:
                if (len1 == len2 and seq_type is None and
                    type(seq1) != type(seq2)):
                    # The sequences are the same, but have differing types.
                    return

            if len1 > len2:
                differing += ('\nFirst %s contains %d additional '
                             'elements.\n' % (seq_type_name, len1 - len2))
                try:
                    differing += ('First extra element %d:\n%s\n' %
                                  (len2, safe_repr(seq1[len2])))
                except (TypeError, IndexError, NotImplementedError):
                    differing += ('Unable to index element %d '
                                  'of first %s\n' % (len2, seq_type_name))
            elif len1 < len2:
                differing += ('\nSecond %s contains %d additional '
                             'elements.\n' % (seq_type_name, len2 - len1))
                try:
                    differing += ('First extra element %d:\n%s\n' %
                                  (len1, safe_repr(seq2[len1])))
                except (TypeError, IndexError, NotImplementedError):
                    differing += ('Unable to index element %d '
                                  'of second %s\n' % (len1, seq_type_name))
        standardMsg = differing
        diffMsg = '\n' + '\n'.join(
            difflib.ndiff(pprint.pformat(seq1).splitlines(),
                          pprint.pformat(seq2).splitlines()))

        standardMsg = self._truncateMessage(standardMsg, diffMsg)
        msg = self._formatMessage(msg, standardMsg)
        self.fail(msg)
File: /usr/lib/python3.8/unittest/case.py
That difference diagnosis is designed for sequences like lists: objects that have a len and for which element-wise iteration makes sense.
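One consequence of the for/else branch in the source is worth a quick check. The sketch below uses only the standard library and shows that a list and a tuple with the same items pass the assertion even though == between them is False, unless seq_type is given.

```python
import unittest

case = unittest.TestCase()

# list == tuple is False, so the fast path `if seq1 == seq2: return` is skipped.
fast_path_equal = [1, 2, 3] == (1, 2, 3)

# The element loop finds no difference and the lengths match, so the
# for/else branch returns quietly: no failure is raised here.
case.assertSequenceEqual([1, 2, 3], (1, 2, 3))

# Passing seq_type turns the type mismatch into a failure instead.
try:
    case.assertSequenceEqual([1, 2, 3], (1, 2, 3), seq_type=list)
except AssertionError as exc:
    type_error_message = str(exc)
else:
    type_error_message = None

print(fast_path_equal)
print(type_error_message)
```

So the method is more forgiving than a plain == for the built-in sequence types, which is the behavior the element-wise diagnosis was written around.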
Samples with lists:
    In [309]: M.assertSequenceEqual([1,2,3],[1., 2., 3.])

    In [310]: M.assertSequenceEqual([1,2,3],[1., 2., 4])
    Traceback (most recent call last):
      File "<ipython-input-310-3cc2193995b9>", line 1, in <module>
        M.assertSequenceEqual([1,2,3],[1., 2., 4])
      File "/usr/lib/python3.8/unittest/case.py", line 1100, in assertSequenceEqual
        self.fail(msg)
      File "/usr/lib/python3.8/unittest/case.py", line 753, in fail
        raise self.failureException(msg)
    AssertionError: Sequences differ: [1, 2, 3] != [1.0, 2.0, 4]
    First differing element 2:
    3
    4
    - [1, 2, 3]
    + [1.0, 2.0, 4]

    In [311]: M.assertSequenceEqual([1,2,3,0],[1., 2., 4])
    Traceback (most recent call last):
      File "<ipython-input-311-1a7c548c65ee>", line 1, in <module>
        M.assertSequenceEqual([1,2,3,0],[1., 2., 4])
      File "/usr/lib/python3.8/unittest/case.py", line 1100, in assertSequenceEqual
        self.fail(msg)
      File "/usr/lib/python3.8/unittest/case.py", line 753, in fail
        raise self.failureException(msg)
    AssertionError: Sequences differ: [1, 2, 3, 0] != [1.0, 2.0, 4]
    First differing element 2:
    3
    4
    First sequence contains 1 additional elements.
    First extra element 3:
    0
    - [1, 2, 3, 0]
    + [1.0, 2.0, 4]
For lists within lists:
    In [314]: M.assertSequenceEqual([[1,2,3]],[[1,2,4]])
    Traceback (most recent call last):
      File "<ipython-input-314-e71a55865cae>", line 1, in <module>
        M.assertSequenceEqual([[1,2,3]],[[1,2,4]])
      File "/usr/lib/python3.8/unittest/case.py", line 1100, in assertSequenceEqual
        self.fail(msg)
      File "/usr/lib/python3.8/unittest/case.py", line 753, in fail
        raise self.failureException(msg)
    AssertionError: Sequences differ: [[1, 2, 3]] != [[1, 2, 4]]
    First differing element 0:
    [1, 2, 3]
    [1, 2, 4]
    - [[1, 2, 3]]
    ?         ^
    + [[1, 2, 4]]
    ?         ^
Without getting into the details, it's not clear that the diagnostic steps would handle numpy arrays any better. len for an array is just the size of the first dimension.
Contrast that with the error reported by numpy's own tester. Why force unittest to work with arrays, when numpy provides a tester that's better suited to multidimensional arrays?
    In [319]: numpy.testing.assert_array_almost_equal([[1,2,3]],[[1,2,4]])
    Traceback (most recent call last):
      File "<ipython-input-319-9392a2140ddd>", line 1, in <module>
        numpy.testing.assert_array_almost_equal([[1,2,3]],[[1,2,4]])
      File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 1042, in assert_array_almost_equal
        assert_array_compare(compare, x, y, err_msg=err_msg, verbose=verbose,
      File "/usr/local/lib/python3.8/dist-packages/numpy/testing/_private/utils.py", line 840, in assert_array_compare
        raise AssertionError(msg)
    AssertionError:
    Arrays are not almost equal to 6 decimals
    Mismatched elements: 1 / 3 (33.3%)
    Max absolute difference: 1
    Max relative difference: 0.25
     x: array([[1, 2, 3]])
     y: array([[1, 2, 4]])
While test arrays may be small, production arrays are often quite large. This statistical style of reporting is more useful than the more detailed differences of the unittest method.
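To make that "statistical style" concrete, here is a rough pure-Python sketch of the same kind of summary. flatten and mismatch_summary are hypothetical helpers written for this illustration, not numpy's implementation. The sketch assumes both inputs have the same nested shape, uses exact comparison rather than a decimal tolerance, and takes the relative difference with respect to the second argument.

```python
def flatten(nested):
    """Yield the scalar leaves of arbitrarily nested lists or tuples."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item


def mismatch_summary(actual, desired):
    """Summarize how two equally shaped nested lists differ (illustrative only)."""
    pairs = list(zip(flatten(actual), flatten(desired)))
    bad = [(a, d) for a, d in pairs if a != d]
    return {
        "mismatched": len(bad),
        "total": len(pairs),
        "percent": 100.0 * len(bad) / len(pairs) if pairs else 0.0,
        # Largest absolute gap among the mismatched pairs.
        "max_abs": max((abs(a - d) for a, d in bad), default=0),
        # Largest gap relative to the desired value, skipping zero denominators.
        "max_rel": max((abs(a - d) / abs(d) for a, d in bad if d != 0), default=0),
    }


summary = mismatch_summary([[1, 2, 3]], [[1, 2, 4]])
print(summary)
```

For the same inputs as the numpy call, this reports 1 mismatch out of 3 elements, a maximum absolute difference of 1 and a maximum relative difference of 0.25. A handful of numbers like these stays readable no matter how large the inputs are.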
Note what the numpy test has done. I gave it lists, which it converted to numpy arrays. That's common practice in numpy. It's much simpler to write code that converts the inputs into a 'standard' type at the start, rather than trying to account for different behaviors throughout the code. For what it's worth, numpy has a perfectly good tolist method:
    In [322]: M.assertSequenceEqual([1,2,3],np.array([1.,2.,3.]).tolist())
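The same "convert at the start" idea can be applied on the unittest side. The following is a sketch under stated assumptions, not an established API: as_plain_list and assertListLikeEqual are hypothetical names, and the demo uses the standard library's array.array (which also has a tolist method) as a stand-in so that it runs without numpy installed.

```python
import unittest
from array import array


def as_plain_list(obj):
    """Return obj as a plain list (hypothetical helper).

    Objects exposing a tolist() method are converted through it;
    anything else goes through list().
    """
    tolist = getattr(obj, "tolist", None)
    return tolist() if callable(tolist) else list(obj)


class NormalizingTestCase(unittest.TestCase):
    def assertListLikeEqual(self, first, second, msg=None):
        # Convert once, up front, then reuse the stock assertion.
        self.assertSequenceEqual(as_plain_list(first), as_plain_list(second), msg)


class Demo(NormalizingTestCase):
    def test_array_vs_list(self):
        self.assertListLikeEqual(array("d", [1.0, 2.0, 3.0]), [1, 2, 3])

    def test_mismatch_is_reported(self):
        with self.assertRaises(AssertionError):
            self.assertListLikeEqual(array("d", [1.0, 2.0, 4.0]), [1, 2, 3])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(Demo)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the conversion happens before assertSequenceEqual is called, the comparison inside it only ever sees built-in lists, and the usual element-wise diagnosis applies.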
[...] unittest (which is a first-party library) should account for the behavior of third-party libraries like numpy.
[...] __eq__ should be required to return a single Boolean value, rather than an element-wise comparison. This would not be the only part of the language definition designed to accommodate Numpy.
[...] why question. We can attempt to explain what is happening based on the code, but we can't explain why developers (Python or numpy) didn't go the extra mile and make the behavior conform to your expectations.
assertSequenceEqual should not rely on == since == is not a part of a sequence's interface. Even if ndarray is not a sequence, it wouldn't be hard to define an object that adheres to being a sequence in every way but assertSequenceEqual still fails for.