5

The following code models a system that can occupy one of 3 states at any time; the constant transition probabilities between those states are given by the matrix prob_nor. Therefore, each point in trace depends on the previous state.

n_states, n_frames = 3, 1000
state_val = np.linspace(0, 1, n_states)

prob = np.random.randint(1, 10, size=(n_states,)*2)
prob[np.diag_indices(n_states)] += 50

prob_nor = prob/prob.sum(1)[:,None] # transition probability matrix, 
                                    # row sum normalized to 1.0

state_idx = range(n_states) # state indices 0, 1, ..., n_states-1
current_state = np.random.choice(state_idx)

trace = []      
sigma = 0.1     
for _ in range(n_frames):
    trace.append(np.random.normal(loc=state_val[current_state], scale=sigma))
    current_state = np.random.choice(state_idx, p=prob_nor[current_state, :])

The loop in the above code makes it run pretty slowly, especially when I have to model millions of data points. Is there any way to vectorize/accelerate it?

6
  • 'vectorize' in the strictest numpy sense means operating on whole arrays in compiled code. It moves the iterations to the compiled level, outside of the control of your Python code. So an inherently sequential, iterative problem can't be 'vectorized'. Calling those np.random functions repeatedly for one value at a time is much slower than calling them once for many values. Commented Oct 8, 2018 at 16:36
  • Someone just recently asked why the Python random.random functions were faster than the np.random ones. They are faster when used for one value at a time. Commented Oct 8, 2018 at 16:41
  • @hpaulj I think you are referring to stackoverflow.com/a/50790263/8033585 Commented Oct 8, 2018 at 17:22
  • "... I have to model millions of data points" What are typical values of n_states and n_frames for the problems that you are interested in? Commented Oct 8, 2018 at 17:32
  • @WarrenWeckesser n_states is roughly 2-10, but occasionally the transition probability matrix (prob_nor) is sparse and in that case n_states is 10-100. n_frames 1e3-1e6. trace has to be generated 1000s times Commented Oct 8, 2018 at 17:45
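The batching point made in the comments above is easy to demonstrate: drawing all samples in one call gives the same distribution as a Python loop of single draws, but the iteration happens in compiled code. A minimal sketch (the rng name and sizes here are illustrative, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# One sample per call, as in the question's loop (slow: Python-level iteration)
per_call = np.array([rng.normal(loc=0.5, scale=0.1) for _ in range(n)])

# All samples in a single call (fast: iteration happens in compiled code)
batched = rng.normal(loc=0.5, scale=0.1, size=n)
```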

2 Answers

3

Offload the computation of probabilities as soon as possible:

possible_paths = np.vstack([
    np.random.choice(state_idx, p=prob_nor[curr_state, :], size=n_frames)
    for curr_state in range(n_states)
])  # pass a list: np.vstack on a bare generator is deprecated in newer numpy

Then you can simply do a lookup to follow your path:

path_trace = [None]*n_frames
for step in range(n_frames):
    path_trace[step] = possible_paths[current_state, step]
    current_state = possible_paths[current_state, step]

Once you have your path, you can compute your trace:

sigma = 0.1
trace = np.random.normal(loc=state_val[path_trace], scale=sigma, size=n_frames)

Comparing timings:

Pure python for loop

%%timeit
trace_list = []
current_state = np.random.choice(state_idx)
for _ in range(n_frames):
    trace_list.append(np.random.normal(loc=state_val[current_state], scale=sigma))
    current_state = np.random.choice(state_idx, p=prob_nor[current_state, :])

Results:

30.1 ms ± 436 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Vectorized lookup:

%%timeit
current_state = np.random.choice(state_idx)
path_trace = [None]*n_frames
possible_paths = np.vstack([
    np.random.choice(state_idx, p=prob_nor[curr_state, :], size=n_frames)
    for curr_state in range(n_states)
])
for step in range(n_frames):
    path_trace[step] = possible_paths[current_state, step]
    current_state = possible_paths[current_state, step]
trace = np.random.normal(loc=state_val[path_trace], scale=sigma, size=n_frames)

Results:

641 µs ± 6.03 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

A speedup of approximately 50x.
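If precomputing an n_states × n_frames table is too memory-hungry (the ranges given in the comments go up to n_frames = 1e6), a common alternative is to precompute the cumulative rows of the transition matrix and draw all uniforms in one call, keeping only the unavoidable sequential lookup in Python. This is a sketch, not the answer's method; the function name and rng argument are my own:

```python
import numpy as np

def markov_path(prob_nor, n_frames, start, rng):
    """Walk a Markov chain using precomputed cumulative transition rows."""
    cum = np.cumsum(prob_nor, axis=1)   # each row ends at 1.0
    u = rng.random(n_frames)            # all random draws in one call
    path = np.empty(n_frames, dtype=np.intp)
    state = start
    for k in range(n_frames):
        path[k] = state
        # inverse-CDF sampling: find the first cumulative bin >= u[k]
        state = int(np.searchsorted(cum[state], u[k]))
    return path
```

Memory is O(n_frames) regardless of n_states, and the per-step cost is a binary search instead of a full np.random.choice call.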


13 Comments

This doesn't work. current_state determines which row of probabilities is used for the choice, and it changes at each step. I suspect this is a Markov chain, and this solution won't give the proper transition probabilities.
The transition matrix here is random, but I suspect it's not the case in the actual code.
@Brenlla I've updated my answer to vectorize as much as possible while still doing what you need it to. This should be significantly faster than your previous approach, but still requires using a for loop. Though be aware this will use memory on the order of n_frames*n_states.
You have to create possible_paths each time a new result is computed, so the time to generate possible_paths should be included in the total time for your method. (It might still be significantly faster than the original--I haven't tried it.)
This method works. For each state j and each "time" k, possible_paths[j, k] holds a randomly generated next state. This value was precomputed, using the appropriate row from prob_nor. It has precomputed more than is strictly necessary to generate a path, but it does it using numpy's vectorized code, so it is much faster than the repeated calls of the original code. In a comment, Brenlla gave the expected ranges of problem parameters, which should be enough information to decide if this answer is a viable solution.
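To make the comment above concrete, here is a minimal check (with a toy uniform transition matrix, not the question's random one) that each row j of possible_paths holds precomputed next states drawn from row j of prob_nor:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_frames = 3, 1000
prob_nor = np.full((n_states, n_states), 1 / n_states)  # toy uniform transitions

# possible_paths[j, k]: a precomputed candidate next state, drawn from row j
possible_paths = np.vstack([
    rng.choice(n_states, p=prob_nor[j], size=n_frames)
    for j in range(n_states)
])
```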
2

Maybe I'm missing something, but I think you can create the current_states as a list and then vectorise the remaining steps:

# Make list of states (slow part)
states = []
current_state = np.random.choice(state_idx)
for _ in range(n_frames):
    states.append(current_state)
    current_state = np.random.choice(state_idx, p=prob_nor[current_state, :])

# Vectorised part
state_vals = state_val[states]   # alternatively np.array(states) / (n_states - 1)
trace = np.random.normal(loc=state_vals, scale=sigma)

I believe this method works and yields a modest speed improvement while using some extra memory (three lists/arrays are created instead of one). @PMende's solution gives a much larger speedup.
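The vectorised step relies on NumPy fancy indexing: indexing state_val with a list of integer states maps every index to its value in a single operation. A small illustration with made-up values:

```python
import numpy as np

state_val = np.linspace(0, 1, 3)   # array([0. , 0.5, 1. ])
states = [0, 2, 1, 1, 0]           # an example path of state indices

# One vectorized lookup replaces a per-element Python loop
vals = state_val[states]           # -> array([0. , 1. , 0.5, 0.5, 0. ])
```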

