1

I have the following code which calculates the number of pairs 1 <= i < j <= n such that xs[i] == ys[j]:

def f(xs, ys):
    s = 0
    for j in range(xs.size):
        s += np.sum(xs[:j] == ys[j])
    return s

This is called a couple of times from some other procedure, so I need it to be as fast as possible (at some memory cost).

The sizes of the arrays are > 1e6.

Is there a faster equivalent that makes use of some numpy magic which gets rid of the for loop?

3
  • 1
    Can you use Numba? It’s built for this kind of thing. Commented Aug 22, 2020 at 5:47
  • 1
    Try using np.broadcast_to(xs, (xs.shape[0], xs.shape[0])) to get a square matrix of repeated xs, then np.tril to zero-out elements above the diagonal, then np.sum(zeroed_matrix == ys)... not sure about the last part, but play around with it Commented Aug 22, 2020 at 6:31
  • 1
    np.sum(zeroed_matrix==ys[:, np.newaxis]) as in this answer Commented Aug 22, 2020 at 6:35

1 Answer 1

1

One way of doing it if xs and ys are of the same size:

s = np.triu(xs[:,None]==ys,1).sum()

And if xs and ys are different sizes (according to your code, you only compare same length of ys with xs. In case you want to compare xs with ALL ys, use line above):

s = np.triu((xs[:,None]==ys[:xs.size]),1).sum()

or equivalently:

s = (xs[:,None]==ys)[np.triu_indices(xs.size,1)].sum()

Which compares all elements of a 2-D xs with ys and sum the equals on upper triangle (same as loop's inner line)

If your arrays are too large and you run into memory issue, simply chunk your arrays and use lines above to compare blocks on diagonal plus all of the off-diagonal blocks on upper triangle and add them up.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.