I have the adjacency matrix of a graph stored as a sparse SciPy CSR matrix. When I call scipy.sparse.csgraph.minimum_spanning_tree, the returned sparse matrix has too few non-zero values (fewer than V-1). Doing some tests with smaller datasets (just for testing), I noticed that when I convert my sparse input matrix to a dense one, the function computes a proper tree.
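For example, with a small toy graph (made-up weights, just to illustrate what I expect to happen) the sparse path does return V-1 edges:

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

# Toy symmetric adjacency matrix for a 4-vertex graph
toy = csr_matrix(np.array([[0, 2, 0, 6],
                           [2, 0, 3, 8],
                           [0, 3, 0, 5],
                           [6, 8, 5, 0]], dtype=float))
toy_mst = minimum_spanning_tree(toy)
print(toy_mst.nnz)  # prints 3, i.e. V - 1 for V = 4 vertices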
Here is my code to reproduce the behavior, together with my input sparse matrix (sparse_matrix.npz):
from scipy.sparse import csr_matrix, load_npz
from scipy.sparse.csgraph import minimum_spanning_tree

# MST computed directly from the loaded sparse (CSR) matrix
sparse_matrix = load_npz('sparse_matrix.npz')
MST = minimum_spanning_tree(sparse_matrix)
print(f"weight of MST: {MST.data.sum()}")
print(f"non-zero edges {MST.size}")

# MST computed from the dense version of the same matrix
dense_matrix = sparse_matrix.todense()
MST2 = minimum_spanning_tree(dense_matrix)
print(f"weight of MST2: {MST2.data.sum()}")
print(f"non-zero edges {MST2.size}")

# MST computed from a CSR matrix rebuilt from the dense array
sparse_from_dense = csr_matrix(dense_matrix)
MST3 = minimum_spanning_tree(sparse_from_dense)
print(f"weight of MST3: {MST3.data.sum()}")
print(f"non-zero edges {MST3.size}")
The outputs are:
weight of MST: 1399.3271604776382
non-zero edges 459
weight of MST2: 1653.5667477846146
non-zero edges 698
weight of MST3: 1653.5667477846146
non-zero edges 698
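In case it helps, this is the kind of check I can run (right after the script above) to compare the loaded matrix with the round-tripped one; it is just a diagnostic sketch, and I don't yet know which of these properties, if any, actually differs:

# Compare the loaded CSR matrix with the one rebuilt from the dense array.
# nnz counts stored entries (including explicit zeros), count_nonzero()
# counts true non-zeros, and the symmetry check matters because the
# graph is undirected.
print(sparse_matrix.shape, sparse_from_dense.shape)
print(sparse_matrix.nnz, sparse_matrix.count_nonzero())
print(sparse_from_dense.nnz, sparse_from_dense.count_nonzero())
print(abs(sparse_matrix - sparse_from_dense).max())  # largest difference in stored values
print(abs(sparse_matrix - sparse_matrix.T).max())    # how asymmetric the input is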
Does anyone know what is happening and how to work around it? Since 698 matches the expected number of edges in the tree (V-1), I believe the dense outputs are correct.
I looked for other ways to represent the matrix sparsely and compute the MST from a sparse input, but I didn't find any alternatives.
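The only other thing I can think of is to put the CSR matrix into canonical form before the call, though I have no idea whether that is related to the real cause:

# Workaround sketch (not verified to fix anything): merge duplicate
# entries and drop explicitly stored zeros before computing the MST.
cleaned = sparse_matrix.copy()
cleaned.sum_duplicates()
cleaned.eliminate_zeros()
MST4 = minimum_spanning_tree(cleaned)
print(f"weight of MST4: {MST4.data.sum()}")
print(f"non-zero edges {MST4.size}")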