Algorithm to find MST in a huge complete graph

Question

Assume a complete graph with more than 25,000 nodes. Each node is essentially a point on a plane. The graph has 625M edges, and each edge has a length that should be stored as a floating-point number.

I need an algorithm to find its MST (on a usual PC).

If I use Kruskal's algorithm, it needs to sort all the edges first, but I cannot even afford to store all the edges in memory at the same time.

If I choose Prim's algorithm, it is quite difficult to estimate how many edges will be in the heap at the same time, but most of them will probably be there soon after the algorithm starts.

Is there a more memory-efficient algorithm that lets me avoid sorting edges stored in a file?

Also, are there any known MST algorithms that exploit the fact that the edge lengths satisfy the triangle inequality?

Is the graph complete? 625M is just 25000**2. Also, see en.wikipedia.org/wiki/Euclidean_minimum_spanning_tree
– zw324, Commented Jul 3, 2013 at 14:24
There is also a library for this: mlpack.org/doxygen.php?doc=emst_tutorial.html
– zw324, Commented Jul 3, 2013 at 14:28

Nuclearman · Accepted Answer · 2013-07-03 17:29:48Z

6

You can still use Kruskal's algorithm.

You don't actually need to sort the edges. What the algorithm requires is simply a method for repeatedly finding the smallest-weight edge that hasn't already been used. Presorting the edges and iterating through that list is just a very efficient way of doing so.

You can do the same thing by repeatedly finding the k smallest unused edges (where k is a manageable number, probably at least |V|), then sorting and iterating through them as needed. This breaks the sorting process down into more manageable segments, although there is a time-space tradeoff: depending on how large k is, the time complexity of this process can be anywhere from O(E log E) (k = E) to about O(E^2) (k = 1).
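The batching idea above can be sketched roughly as follows. This is only an illustration, assuming the points fit in memory and edge lengths are recomputed on every pass rather than read from a file; the names `edge_stream`, `find`, and `chunked_kruskal` are hypothetical, not taken from any library. Edges are compared as `(weight, i, j)` tuples so that "unused" simply means "strictly greater than the last edge processed", even when weights tie.

```python
import heapq
import math
from itertools import combinations


def edge_stream(points):
    """Yield (weight, i, j) for every pair of points; nothing is stored."""
    for i, j in combinations(range(len(points)), 2):
        yield (math.dist(points[i], points[j]), i, j)


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def chunked_kruskal(points, k):
    """Kruskal's algorithm that holds at most k edges in memory at a time."""
    n = len(points)
    parent = list(range(n))
    mst = []
    last = (-1.0, -1, -1)  # key of the last edge already processed
    while len(mst) < n - 1:
        # One full pass over the edges: the k smallest ones after `last`,
        # returned in ascending order.
        batch = heapq.nsmallest(k, (e for e in edge_stream(points) if e > last))
        if not batch:
            break  # no edges left
        for w, i, j in batch:
            ri, rj = find(parent, i), find(parent, j)
            if ri != rj:  # edge joins two components, so keep it
                parent[ri] = rj
                mst.append((i, j, w))
        last = batch[-1]
    return mst
```

Under these assumptions the memory use is O(k + |V|), and each pass costs roughly O(E log k), so a larger k means fewer passes over the edges.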


David Eisenstat · Accepted Answer · 2013-07-03 18:25:57Z

3

Boruvka's algorithm makes a logarithmic number of passes on the unsorted edge list. The memory required is proportional to the number of nodes.
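A minimal sketch of that approach, assuming the edges can be regenerated from the point coordinates on each pass (reading them sequentially from a file would work the same way). The function names are hypothetical. Only the union-find array and one "cheapest outgoing edge" entry per component are kept in memory, and ties are broken by comparing whole `(weight, i, j)` tuples so that the selected edges cannot form a cycle.

```python
import math
from itertools import combinations


def edge_stream(points):
    """Yield (weight, i, j) for every pair of points; nothing is stored."""
    for i, j in combinations(range(len(points)), 2):
        yield (math.dist(points[i], points[j]), i, j)


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def boruvka_streaming(points):
    """Boruvka's algorithm: one pass over the unsorted edges per round."""
    n = len(points)
    parent = list(range(n))
    mst = []
    components = n
    while components > 1:
        cheapest = {}  # component root -> cheapest edge leaving that component
        for e in edge_stream(points):
            _, i, j = e
            ri, rj = find(parent, i), find(parent, j)
            if ri == rj:
                continue  # both endpoints already in the same component
            for r in (ri, rj):
                if r not in cheapest or e < cheapest[r]:
                    cheapest[r] = e
        if not cheapest:
            break  # graph is disconnected; cannot happen for a complete graph
        for w, i, j in cheapest.values():
            ri, rj = find(parent, i), find(parent, j)
            if ri != rj:  # skip an edge chosen by both of its components
                parent[ri] = rj
                mst.append((i, j, w))
                components -= 1
    return mst
```

Each round at least halves the number of components, which is where the logarithmic number of passes comes from.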


Ante · Accepted Answer · 2013-07-03 16:47:13Z

0

Try this algorithm:

1: Append weight w and outgoing vertex v per edge into a list, X.
2: Divide the edge-list, E, into segments with 1 indicating the start of each segment, and 0 otherwise, store this in flag array F.
3: Perform segmented min scan on X with F indicating segments to find minimum outgoing edge-index per vertex, store in NWE.
4: Find the successor of each vertex and add to successor array, S.
5: Remove cycle-making edges from NWE using S, and identify representative vertices.
6: Mark remaining edges from NWE as part of output in MST.
7: Propagate representative vertex ids using pointer doubling.
8: Append successor array's entries with its index to form a list, L.
9: Split L, create flag over split output and scan the flag to find new ids per vertex, store new ids in C.
10: Find supervertex ids of u and v for each edge using C.
11: Remove edge from edge-list if u, v have same supervertex id.
12: Remove duplicate edges using split over new u, v and w.
13: Compact and create the new edge-list and weight list.
14: Build the vertex list from the newly formed edge-list.
15: Call the MST Algorithm on

Authors: Vibhav Vineet, Pawan Harish, Suryakant Patidar, P. J. Narayanan

Source

edited Jul 3, 2013 at 16:47

Ante

answered Jul 3, 2013 at 14:33

Jayram

3 Comments

Jayram Over a year ago

Question too, isn't it? @Fabinout

Fabinout Over a year ago

Plus, MST stands for STD in French, so... Even harder than one can think. @Jayram

Roman Over a year ago

Doesn't it have memory complexity O(E)?
