numpy indexing with an array of index (to be broadcasted)

Question

Let A be an array of shape (n_0, n_1, …, n_p-2, n_p-1) with arbitrary p.

Let ind be and array of shape (m_0, m_1, …, m_p-2, m_p-1), with, for all k:

either m_k == n_k
or m_k == 1.

The values of ind are indices corresponding to the p-th axis of A, namely integers in {0, 1, …, p-1}.

I want to get an array B of shape (n_0, n_1, …, n_p-1) such that B[i_0, i_1, …, i_p-2, i_p-1] equals A[i_0, i_1, …, i_p-2, ind[j_0, j_1, …, j_p-2, j_p-1] with, for all k:

j_k == i_k if m_k == n_k,
or j_k == 0 if m_k == 1.

Could you please provide a minimal reproducible example

Raghul Raj
– Raghul Raj

2021-01-28 15:02:14 +00:00
Commented Jan 28, 2021 at 15:02 — Raghul Raj
– Raghul Raj, Commented Jan 28, 2021 at 15:02

M. Toya · Accepted Answer · 2021-01-22 14:14:03Z

I ended writing the following (not thoroughly tested) function. I suppose (hope) better performances are achievable

def time_shift(signal, index_shifted):
    """Time shift an array of signal with shifted (time) indices.

    Time is indexed by the last dimension.

    Args:
       signal (array): original signals.
       index_shifted (array): shifted indices, with same shape as signal, or
           possibly 1 along dimension to be broadcasted.

    Returns:
        Array with shifted values.
    """

    out = np.empty(np.shape(signal))
    it = np.nditer([index_shifted, out], flags=['multi_index'],
                   op_flags = [['readonly'],
                               ['writeonly', 'allocate', 'no_broadcast']])
    with it:
        for ind, zz in it:
            selection = (*it.multi_index[:-1], ind)
            zz[...] = signal[selection]
    return out

druskacik · Accepted Answer · 2021-01-29 11:54:01Z

This is the fastest solution I could come up with after some testing. The idea is: broadcast ind matrix so that it has the same shape as A, then use the fact that the ind matrix has the same values accros axes which were expanded by broadcasting (i. e. there's no need to iterate over these axes).

If the shape of ind matrix is the same as the shape of A my function works actually slower, but with one or more dimensions of ind set to 1, this function works substantially faster.

Note that this is "logical" optimalization. I believe it's possible to make it faster by maybe using some other numpy functions, different handling of list and tuples etc.

The function:

import numpy as np
import time
from itertools import product

def time_shift_opt(signal, index_shifted):

  index_broadcasted = np.broadcast_to(index_shifted, signal.shape)
  out = np.empty(np.shape(signal))

  indices = product(*[range(d) for d in index_shifted.shape])

  dims_to_expand = [i for i, val in enumerate(index_shifted.shape) if val == 1]

  for index in indices:
    index_to_change = list(index)
    for d in dims_to_expand:
      index_to_change[d] = slice(None)

    index_to_change = tuple(index_to_change)

    val_index = list(index_to_change)
    val_index[-1] = index_broadcasted[index]
    value = signal[tuple(val_index)]

    if index_shifted.shape[-1] == 1:
      value = value[..., np.newaxis]

    out[index_to_change] = value

  return out

Comparing times: (See that with the growing number of ind dimensions set to 1 the function I proposed runs much faster)

def test_time_optimality(signal_shape, ind_shape, n_iterations = 10):
  p = signal_shape[-1]
  A = np.random.randint(1000, size = size)
  ind = np.random.randint(p, size = ind_size)

  averages = []

  temp_average = 0
  for i in range(n_iterations):
    start = time.time()
    res = time_shift(A, ind)
    temp_average += time.time() - start

  averages.append(temp_average/n_iterations)

  temp_average = 0
  for i in range(n_iterations):
    start = time.time()
    res = time_shift_opt(A, ind)
    temp_average += time.time() - start

  averages.append(temp_average/n_iterations)

  print('time_shift average duration: {:.4f}'.format(averages[0]))
  print('time_shift_opt average duration: {:.4f}'.format(averages[1]))

p = 3
size = (10, 20, 12, 50, p)
ind_size = (10, 20, 12, 50, p)
test_time_optimality(size, ind_size)
# time_shift average duration: 0.3734
# time_shift_opt average duration: 0.5365

size = (10, 20, 12, 50, p)
ind_size = (10, 1, 12, 50, p)
test_time_optimality(size, ind_size)
# time_shift average duration: 0.3697
# time_shift_opt average duration: 0.0414

size = (10, 20, 12, 50, p)
ind_size = (1, 1, 12, 50, p)
test_time_optimality(size, ind_size)
# time_shift average duration: 0.3840
# time_shift_opt average duration: 0.0080

size = (10, 20, 12, 50, p)
ind_size = (1, 1, 1, 1, 1)
test_time_optimality(size, ind_size)

# time_shift average duration: 0.3775
# time_shift_opt average duration: 0.0018

Collectives™ on Stack Overflow

numpy indexing with an array of index (to be broadcasted)

2 Answers 2

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related