
I need to create a matrix starting from the values of a weight matrix. Which structure is best for holding the matrix in terms of speed, both when creating it and when iterating over it? I was thinking of a list of lists or a numpy 2D array, but both seem slow to me. What I need:

numpy array
import numpy as np

A = np.zeros((dim, dim))
for r in range(A.shape[0]):
    for c in range(A.shape[1]):
        if r == c:
            A.itemset((r, c), node_degree[r])    # diagonal: node degree
        else:
            A.itemset((r, c), arc_weight[r, c])  # off-diagonal: arc weight

or

list of lists
l = []
for r in range(dim):
    l.append([])
    for c in range(dim):
        if r == c:
            l[r].append(node_degree[r])
        else:
            l[r].append(arc_weight[r, c])

where dim can be as large as 20000, node_degree is a vector, and arc_weight is another matrix. I wrote the same thing in C++ and it takes less than 0.5 seconds, while the two Python versions above take more than 20 seconds. I know Python is not C++, but I need to be as fast as possible. Thank you all.

4 Comments
  • Would a sparse matrix work in your case (i.e. mostly zeros)? (See the sketch after these comments.) Commented May 29, 2014 at 17:20
    Try Cython. It should be able to reach the C speed, while still having the Python interface. Commented May 29, 2014 at 17:23
  • You could also consider using a dict, where the key is the row/col. That would give you fast lookups, although I don't think it would save you anything on creation. Commented May 29, 2014 at 17:24
  • You shouldn't be appending to the list if you already know its size. Preallocate the memory first using l = [[0 for x in xrange(m)] for x in xrange(n)]. See stackoverflow.com/a/6667288/1290264. Commented May 29, 2014 at 17:37
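
Picking up the sparse-matrix suggestion from the first comment: if most arc weights are zero, a scipy.sparse structure avoids materializing all dim**2 entries. A minimal sketch; the tiny dim and the all-zero arc_weight are made-up stand-ins for the question's data:

import numpy as np
from scipy import sparse

# hypothetical stand-ins; in the sparse scenario most arc weights are zero
dim = 5
node_degree = np.arange(dim)
arc_weight = np.zeros((dim, dim))

A = sparse.lil_matrix(arc_weight)  # lil_matrix is cheap to modify structurally
A.setdiag(node_degree)             # put the node degrees on the diagonal
A = A.tocsr()                      # CSR is faster for arithmetic and row iteration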

3 Answers


One thing: you shouldn't be appending to the list if you already know its size.

Preallocate the memory first using a list comprehension, and generate the r, c values with xrange() instead of range() since you are using Python < 3.x (see here):

l = [[0 for c in xrange(dim)] for r in xrange(dim)]

Better yet, you can build what you need in one shot using:

l = [[node_degree[r] if r == c else arc_weight[r, c]
      for c in xrange(dim)] for r in xrange(dim)]

Compared to your original implementation, this should use less memory (because of the xrange() generators) and less time, because specifying the dimensions up front removes the need to repeatedly reallocate memory.
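
To check that claim, one could time both variants; a rough sketch (the dimension of 2000 and the repetition count are arbitrary choices of mine, not from the question):

import timeit

setup = 'dim = 2000'

append_version = '''
l = []
for r in xrange(dim):
    l.append([])
    for c in xrange(dim):
        l[r].append(0)
'''

comprehension_version = 'l = [[0 for c in xrange(dim)] for r in xrange(dim)]'

# lower is better; number=3 keeps the run short
print(timeit.timeit(append_version, setup, number=3))
print(timeit.timeit(comprehension_version, setup, number=3))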


4 Comments

Using range or xrange is a minor issue here. Time spent creating lists of n elements is negligible compared to pushing the data around for the n^2 matrix elements. With large sizes, any solution that involves dynamic lists or loops (whether explicit or in comprehensions) is a bad idea. This is precisely the problem that numpy was created to solve.
@Javier, thanks for the comment! My answer addresses time/memory improvements using native Python. I do not think xrange will solve the OP's problem -- I'm simply offering the most efficient Pythonic way of implementing the OP's code in native Python. Clearly, if the OP still has issues, a non-native solution such as numpy is necessary.
I understand your point of view, and certainly your solution is faster than the double loop, but since the question specifically mentions the dimensions of the problem (20000x20000), I think a good answer must address the specific issues that arise with large data structures (excessive type checking, the need for contiguous memory blocks for fast iteration). If for some reason one wanted to use only the standard library, the right data structure would be an array, never a list, but since the OP already used numpy in the question I see no reason not to use it.
Your solution did a great job covering those topics, so why would I reproduce them in mine? I posted this answer to address a topic that was not addressed in yours, i.e., the OP's attempt at using a list-of-lists structure. If the OP isn't using it in the most efficient way, it's hard to say whether numpy is needed or whether it would just be premature optimization. Anyway, I think your answer is good, so I will not duplicate it in my answer.

Numpy matrices are generally faster as they know their dimensions and entry type.

In your particular situation, since you already have arc_weight and node_degree created, you can build your matrix directly from arc_weight and then replace the diagonal:

A = np.matrix(arc_weight)         # copies arc_weight into a new matrix
np.fill_diagonal(A, node_degree)  # overwrite the diagonal in place

Another option is to replace the double loop with a function that puts the right element in each position, and to create the matrix from that function:

# caveat: np.fromfunction passes whole index arrays to fill_matrix, so the
# scalar conditional needs care; see the two-step version in the Edit below
def fill_matrix(r, c):
    return arc_weight[r, c] if r != c else node_degree[r]

A = np.fromfunction(fill_matrix, (dim, dim))

As a rule of thumb, with numpy you must avoid loops at all costs. The first method should be faster, but you should profile both to see which works for you. You should also take into account that you seem to be duplicating your data set in memory, so if it is really huge you might get into trouble. The best idea would be to create your matrix directly, avoiding arc_weight and node_degree altogether.

Edit: Some simple time comparisons between list-comprehension and numpy matrix creation. Since I don't know how your arc_weight and node_degree are defined, I just made up two random functions. It seems that numpy.fromfunction complains a bit if the function has a conditional in it, so I construct the matrix in two steps.

import numpy as np

def arc_weight(a, b):
    return a + b

def node_degree(a):
    return a * a

def create_as_list(N):
    # pure-Python nested comprehension
    return [[arc_weight(c, r) if c != r else node_degree(c) for c in xrange(N)] for r in xrange(N)]

def create_as_numpy(N):
    A = np.fromfunction(arc_weight, (N, N))         # vectorized fill of the whole matrix
    np.fill_diagonal(A, node_degree(np.arange(N)))  # overwrite the diagonal in one shot
    return A

And here are the timings for N=2000:

time A = create_as_list(2000)
CPU times: user 839 ms, sys: 16.5 ms, total: 856 ms
Wall time: 845 ms

time A = create_as_numpy(2000)
CPU times: user 83.1 ms, sys: 12.9 ms, total: 96 ms
Wall time: 95.3 ms

2 Comments

The second one gives an error: fill_matrix() takes exactly 2 arguments (1 given). Could you please correct it?
Sorry, vectorize is only for one-parameter functions. I have replaced the second version with a working one using np.fromfunction. Still, if your matrix is large, your biggest issue is that you seem to be duplicating all your data in memory; avoiding that would be a better improvement than any workaround we post here.

Make a copy of arc_weight and fill the diagonal with values from node_degree. For a 20000-by-20000 output, it takes about 1.6 seconds on my machine:

>>> import numpy
>>> dim = 20000
>>> arc_weight = numpy.arange(dim**2).reshape([dim, dim])
>>> node_degree = numpy.arange(dim)
>>> import timeit
>>> timeit.timeit('''
... A = arc_weight.copy()
... A.flat[::dim+1] = node_degree
... ''', '''
... from __main__ import dim, arc_weight, node_degree''',
... number=1)
1.6081738501125764

Once you have your array, try not to iterate over it. Compared to broadcasted operators and NumPy built-in functions, Python-level loops are a performance disaster.
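
For example, doubling every entry (an operation made up purely for illustration) with a Python loop versus a single broadcasted statement:

import numpy as np

A = np.arange(9.0).reshape(3, 3)

# Python-level loop: every element access goes through the interpreter
for r in xrange(A.shape[0]):
    for c in xrange(A.shape[1]):
        A[r, c] *= 2

# broadcasted equivalent: one statement, the loop runs in compiled C
A *= 2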

