Python type checking of numpy arrays, including their dtype

Question

I can verify my function receives inputs in the correct type using:

import numpy as np

def foo(x: np.ndarray, y: float):
    return x * y

This ensures that if I call the function with an x that is not a np.ndarray, I get an error even before running the code.

What I don't know is how to verify the array's dtype. For example:

def return_valid_points_only(points: np.ndarray, valid: np.ndarray):
    assert points.shape == valid.shape
    return points[valid]

I wish to check that valid is not only a np.ndarray but also valid.dtype == bool.

In this example, if valid is supplied as 0s and 1s to indicate validity, the program won't fail, but I will get terrible results.

Thanks

These type checks are there so that other programmers can easily understand the function, and so that PyCharm flags it on the spot if they call it wrongly (passing arguments of the wrong type).
– Shaq, Commented Jan 28, 2021 at 17:10
I guess we're dependent on PyCharm's features to meet your requirement, not on Python.
– fountainhead, Commented Jan 28, 2021 at 17:13

Mad Physicist · Accepted Answer · 2021-01-28 20:08:45Z

Python is all about asking for forgiveness, not permission. That means that even in your first definition, def foo(x: np.ndarray, y: float): is really relying on the user to honor the hint, unless you are using something like mypy.
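
As a sketch of what a static hint can express (this goes beyond the answer as originally written): NumPy 1.21 and later ship numpy.typing.NDArray, which lets the annotation name the dtype. The hint is still not enforced at runtime, and whether a checker such as mypy actually flags a wrong dtype depends on the checker and on how precisely it inferred the dtype of the array being passed. The float64 dtype for points is an assumption made for illustration.

```python
import numpy as np
import numpy.typing as npt


def return_valid_points_only(
    points: npt.NDArray[np.float64],  # float64 is an assumed dtype for the example
    valid: npt.NDArray[np.bool_],
) -> npt.NDArray[np.float64]:
    # The annotations document the expected dtypes for readers and static
    # checkers; Python itself does not enforce them when the function runs.
    return points[valid]


points = np.array([1.0, 2.0, 3.0])
mask = np.array([True, False, True])
selected = return_valid_points_only(points, mask)
```

Calling the same function with an integer array for valid would still run, which is why the runtime approaches below remain useful.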

There are a couple of approaches you can take here, usually in tandem. One is to write the function in a way that works with the inputs that are passed in, which can mean failing on or coercing invalid inputs. The other is to document your code carefully, so users can make intelligent decisions. The second method is especially important, but I will focus on the first.

Numpy does most of the checking for you. For example, rather than expecting an array, it is idiomatic to coerce one:

x = np.asanyarray(x)

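np.asanyarray is essentially shorthand for np.array(a, dtype, copy=False, order=order, subok=True) in NumPy 1.x: it copies only when it has to. (NumPy 2 spells that copy-only-if-needed behavior copy=None.) You can do something similar for y: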

y = np.asanyarray(y).item()

This will allow any array-like as long as it has exactly one element, whether scalar or not. Another way is to respect numpy's ability to broadcast arrays together, so that the function still works if the user passes in y as, say, a list of x.shape[-1] elements.
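
A minimal sketch of the broadcasting variant of foo, assuming you want to accept any array-like for both arguments rather than insisting on a scalar y:

```python
import numpy as np


def foo(x, y):
    # Coerce instead of checking: lists, tuples and scalars all become arrays,
    # and existing arrays (including subclasses) pass through without a copy.
    x = np.asanyarray(x)
    y = np.asanyarray(y)
    # Broadcasting handles a scalar y as well as a y with x.shape[-1] elements;
    # incompatible shapes raise a ValueError from NumPy itself.
    return x * y


scaled = foo([[1, 2], [3, 4]], 2.0)             # scalar multiplier
per_column = foo([[1, 2], [3, 4]], [10, 100])   # one multiplier per column
```

Here NumPy does the validation: a y whose shape cannot be broadcast against x fails with an error at the multiplication, with no explicit check in the function.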

For your second function, you have a couple of options. One option is to allow fancy indexing in general: whether the user passes in a list of integer indices or a boolean mask, points[valid] accepts both. If, on the other hand, you insist on a boolean mask, you can either check or coerce the dtype.
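
To make the difference concrete, here is a small illustration (the values are made up) of how the same indexing expression treats a boolean mask versus 0/1 integers:

```python
import numpy as np

points = np.array([10.0, 20.0, 30.0])

# A boolean mask keeps the elements where the mask is True.
by_mask = points[np.array([True, False, True])]

# Integer indices select positions, so 0/1 flags are read as "element 0" and
# "element 1" rather than as False/True.
by_index = points[np.array([1, 0, 1])]
```

Both lines run without error, but by_mask holds the first and last points while by_index holds points[1], points[0], points[1]. That silent difference is the failure mode described in the question.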

If you check, keep in mind that the numpy indexing operation will raise an error for you if the array sizes don't match. You only need to check the type itself:

points = np.asanyarray(points)
valid = np.asanyarray(valid)
if valid.dtype != bool:
    raise ValueError('valid argument must be a boolean mask')
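
Put together as a complete function, the checking variant might look like the sketch below. The demo inputs are illustrative only.

```python
import numpy as np


def return_valid_points_only(points, valid):
    points = np.asanyarray(points)
    valid = np.asanyarray(valid)
    # Reject anything that is not already boolean, including 0/1 integers.
    if valid.dtype != bool:
        raise ValueError('valid argument must be a boolean mask')
    # Boolean indexing raises IndexError itself when the shapes do not match.
    return points[valid]


kept = return_valid_points_only([1.0, 2.0, 3.0], [True, False, True])

try:
    return_valid_points_only([1.0, 2.0, 3.0], [1, 0, 1])
    rejected = False
except ValueError:
    rejected = True
```

With this version the 0/1 input from the question fails loudly instead of being interpreted as indices.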

If you choose to coerce instead, the user will be allowed to use zeros and ones, but inputs that are already boolean arrays will not be copied unnecessarily:

valid = np.asanyarray(valid, bool)
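
A sketch of the coercing variant as a full function, with a quick check that an existing boolean array is passed through rather than copied:

```python
import numpy as np


def return_valid_points_only(points, valid):
    points = np.asanyarray(points)
    # Coerce to bool: nonzero entries become True, zeros become False.
    # An array that is already boolean is passed through without a copy.
    valid = np.asanyarray(valid, bool)
    return points[valid]


kept = return_valid_points_only([1.0, 2.0, 3.0], [1, 0, 1])

mask = np.array([True, False, True])
same_object = np.asanyarray(mask, bool) is mask
```

The trade-off is that coercion accepts any nonzero value as True, so an accidental list of indices such as [2, 0, 1] would be treated as a mask of [True, False, True] rather than rejected.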
