I have two arrays of points with xy coordinates:

import numpy as np

basic_pts = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [0, 2]])
new_pts = np.array([[2, 2], [2, 1], [0.5, 0.5], [1.5, 0.5]])

As a result, I want from the array new_pts only those points that fulfil the condition that there is no point in basic_pts with a bigger x AND y value. So the result would be

res_pts = np.array([[2, 2], [2, 1], [1.5, 0.5]])

I have a solution that works, but because it uses list comprehensions it is not suitable for larger amounts of data.

x_cond = [basic_pts[:, 0] > x for x in new_pts[:, 0]]
y_cond = [basic_pts[:, 1] > y for y in new_pts[:, 1]]
xy_cond_ = np.logical_and(x_cond, y_cond)
xy_cond = np.swapaxes(xy_cond_, 0, 1)
mask = np.invert(np.logical_or.reduce(xy_cond))
res_pts = new_pts[mask]

Is there a better way to solve this only with numpy and without list comprehension?

2 Answers

You could use NumPy broadcasting -

# Get the xy_cond equivalent by extending basic_pts to 2D comparisons:
# add a singleton dimension at axis=1 to col-0 and col-1 of basic_pts
# so they broadcast against col-0 and col-1 of new_pts.
# xyc[i, j] is True when basic_pts[i] is strictly bigger than new_pts[j]
# in both coordinates.
xyc = (basic_pts[:,0,None] > new_pts[:,0]) & (basic_pts[:,1,None] > new_pts[:,1])

# Create mask equivalent and index into new_pts to get selective rows from it
mask = ~(xyc).any(0)
res_pts_out = new_pts[mask]
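For reference, here is a self-contained, runnable version of this approach on the question's sample data (note that the x comparison must use column 0 of new_pts and the y comparison column 1):

```python
import numpy as np

basic_pts = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [0, 2]])
new_pts = np.array([[2, 2], [2, 1], [0.5, 0.5], [1.5, 0.5]])

# xyc[i, j] is True when basic_pts[i] is strictly bigger than new_pts[j]
# in both the x and the y coordinate.
xyc = (basic_pts[:, 0, None] > new_pts[:, 0]) & (basic_pts[:, 1, None] > new_pts[:, 1])

# Keep only the new points that no basic point dominates.
res_pts_out = new_pts[~xyc.any(0)]
```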

2 Comments

This was my thought as well; note however that it ends up creating an intermediate (len(basic_pts), len(new_pts)) array, which can be pretty memory intensive (OP mentioned 'big amounts of data')
@val Yeah that could be an issue with really huge datasizes. Thanks for pointing that out!
As val points out, a solution that creates an intermediate len(basic_pts) × len(new_pts) array can be too memory intensive. On the other hand, a solution that tests each point in new_pts in a loop can be too time-consuming. We can bridge the gap by picking a batch size k and testing new_pts in batches of size k using Divakar's solution:

import numpy as np

basic_pts = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [0, 2]])
new_pts = np.array([[2, 2], [2, 1], [0.5, 0.5], [1.5, 0.5]])
k = 2
subresults = []
for i in range(0, len(new_pts), k):
    j = min(i + k, len(new_pts))
    # Process new_pts[i:j] using Divakar's solution
    xyc = np.logical_and(
        basic_pts[:, np.newaxis, 0] > new_pts[np.newaxis, i:j, 0],
        basic_pts[:, np.newaxis, 1] > new_pts[np.newaxis, i:j, 1])
    mask = ~(xyc).any(axis=0)
    # mask indicates which points among new_pts[i:j] to use
    subresults.append(new_pts[i:j][mask])
# Concatenate subresult lists
res = np.concatenate(subresults)
print(res)
# Prints:
# [[2.  2. ]
#  [2.  1. ]
#  [1.5 0.5]]
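If this pattern comes up more than once, the batching loop can be wrapped in a small helper; the sketch below uses np.array_split for the chunking (the function name `filter_dominated` and the default batch size are just illustrative choices, not from the answers above):

```python
import numpy as np

basic_pts = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [0, 2]])
new_pts = np.array([[2, 2], [2, 1], [0.5, 0.5], [1.5, 0.5]])

def filter_dominated(basic_pts, new_pts, k=1024):
    """Keep the rows of new_pts for which no row of basic_pts is strictly
    bigger in both coordinates, processing new_pts in batches of size k so
    the intermediate mask never exceeds len(basic_pts) * k booleans."""
    n_batches = max(1, -(-len(new_pts) // k))  # ceiling division
    subresults = []
    for chunk in np.array_split(new_pts, n_batches):
        # Same broadcasting comparison as above, restricted to one batch.
        xyc = (basic_pts[:, 0, None] > chunk[:, 0]) & \
              (basic_pts[:, 1, None] > chunk[:, 1])
        subresults.append(chunk[~xyc.any(axis=0)])
    return np.concatenate(subresults)

res = filter_dominated(basic_pts, new_pts, k=2)  # same result as the loop above
```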
