Minimum Spanning Tree (MST) algorithm variation

Question

I was asked the following question in an interview and I am unable to find an efficient solution.

Here is the problem:

We want to build a network and we are given c nodes/cities and D possible edges/connections made by roads. Edges are bidirectional and we know the cost of the edge. The costs of the edges can be represented as d[i,j] which denotes the cost of the edge i-j. Note not all c nodes can be directly connected to each other (D is the set of possible edges).
Now we are given a list of k potential edges/connections that have no cost. However, you can only choose one edge in the list of k edges to use (like getting free funding to build an airport between two cities).

So the question is... find the set of roads (and the one free airport) that minimizes total cost required to build the network connecting all cities in an efficient runtime.

So in short, solve a minimum spanning tree problem but where you can choose 1 edge in a list of k potential edges to be free of cost. I'm unsure how to solve... I've tried finding all the spanning trees in order of increasing cost and choosing the lowest cost, but I'm still challenged on how to consider the one free edge from the list of k potential free edges. I've also tried finding the MST of the D potential connections and then adjusting it according the the options in k to get a result.

Thank you for any help!

Not sure about this, but instead of solving one big problem maybe solve c regular problems. Assume a node is an airport, and just calculate the main network. Repeat c times. — Dialecticus
– Dialecticus, Commented Apr 21, 2020 at 17:28

templatetypedef · Accepted Answer · 2020-04-21 18:42:33Z

One idea would be to treat your favorite MST algorithm as a black box and to think about changing the edges in the graph before asking for the MST. For example, you could try something like this:

for each edge in the list of possible free edges:
    make the graph G' formed by setting that edge cost to 0.
    compute the MST of G'

return the cheapest MST out of all the ones generated this way

The runtime of this approach is O(kT(m, n)), where k is the number of edges to test and T(m, n) is the cost of computing an MST using your favorite black-box algorithm.

We can do better than this. There's a well-known problem of the following form:

Suppose you have an MST T for a graph G. You then reduce the cost of some edge {u, v}. Find an MST T' in the new graph G'.

There are many algorithms for solving this problem efficiently. Here's one:

Run a DFS in T starting at u until you find v.
If the heaviest edge on the path found this way costs more than {u, v}:
   Delete that edge.
   Add {u, v} to the spanning tree.
Return the resulting tree T'.

(Proving that this works is tedious but doable.) This would give an algorithm of cost O(T(m, n) + kn), since you would be building an initial MST (time T(m, n)), then doing k runs of DFS in a tree with n nodes.

However, this can potentially be improved even further if you're okay using some more advanced algorithms. The paper "On Cartesian Trees and Range Minimum Queries" by Demaine et al shows that in O(n) time, it is possible to preprocess a minimum spanning tree so that, in time O(1), queries of the form "what is the lowest-cost edge on the path in this tree between nodes u and v?" in time O(1). You could therefore build this structure instead of doing a DFS to find the bottleneck edge between u and v, reducing the overall runtime to O(T(m, n) + n + k). Given that T(m, n) is very low (the best known bound is O(m α(m)), where α(m) is the Ackermann inverse function and is less than five for all inputs in the feasible univers), this is asymptotically a very quick algorithm!

Matt Timmermans · Accepted Answer · 2020-04-22 17:41:23Z

2

First generate a MST. Now, if you add a free edge, you will create exactly one cycle. You could then remove the heaviest edge in the cycle to get a cheaper tree.

To find the best tree you can make by adding one free edge, you need to find the heaviest edge in the MST that you could replace with a free one.

You can do that by testing one free edge at a time:

Pick a free edge
Find the lowest common ancestor in the tree (from an arbitrary root) of its adjacent vertices
Remember the heaviest edge on the path between the free edge vertices

When you're done, you know which free edge to use -- it's the one associated with the heaviest tree edge, and you know which edge it replaces.

In order to make steps (2) and (3) faster, you can remember the depth of each node and connect it to multiple ancestors like a skip list. You can then do those steps in O(log |V|) time, leading to a total complexity of O( (|E|+k) log |V| ), which is pretty good.

EDIT: Even Easier Way

After thinking about this a bit, it seems there's a super easy way to figure out which free edge to use and which MST edge to replace.

Disregarding the k possible free edges, you build the MST from the other edges using Kruskal's algorithm, but you modify the usual disjoint set data structure as follows:

Use union by size or rank, but not path compression. Every union operation will then establish exactly one link, and take O(log N) time, and all path lengths will be at most O(log N) long.
For each link, remember the index of the edge that caused it to be created.

For each possible free edge, then, you can walk up the links in the disjoint set structure to find out exactly at which point its endpoints were connected into the same connected component. You get the index of the last required edge, i.e., the one it would replace, and the free edge with the greatest replacement target index is the one you should use.

edited Apr 22, 2020 at 17:41

answered Apr 21, 2020 at 18:13

Matt Timmermans

60.9k3 gold badges57 silver badges106 bronze badges

7 Comments

templatetypedef Over a year ago

Silly question, but do you know of a short proof that the tree given by cutting the heaviest edge in that cycle is indeed an MST? I've been trying to find a short proof of this for a while and everything I'm coming up with is running at multiple paragraphs.

Matt Timmermans Over a year ago

@templatetypedef consider restarting Kruskal's with the edges in the same order, except with the new edge first. The edges after the removed edge will still be required, since they see the seem set of disjoint sets. The edges before the removed edge will still be required, because the preceding edges do not connect their endpoints.

templatetypedef Over a year ago

The edges before the lowered edge I agree with, but I’m having a heck of a time trying to formalize the bit about the edges afterward because the CCs of the new graph will be different. Is there some nice property the new CCs have relative to the old ones?

Matt Timmermans Over a year ago

To be clear, it's not strictly a lowered edge, because the k free edges needn't be in the original graph. You could consider them to be there and expensive, but then maybe they're not in the original MST, so no difference. Anyway...

Matt Timmermans Over a year ago

Let's consider the CCs formed by only the edges <= the replacement target. They are exactly the same no matter what order those edges are processed in. They remain exactly the same when we add the free edge, because it creates a cycle with the heaviest edge and edges of lesser cost. They remain exactly the same when any edge in that cycle is removed.

|

Collectives™ on Stack Overflow

Minimum Spanning Tree (MST) algorithm variation

So the question is... find the set of roads (and the one free airport) that minimizes total cost required to build the network connecting all cities in an efficient runtime.

2 Answers 2

Comments

7 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

So the question is... find the set of roads (and the one free airport) that minimizes total cost required to build the network connecting all cities in an efficient runtime.

2 Answers 2

Comments

7 Comments

Your Answer

Sign up or log in

Post as a guest

Related