ADT Lecture 15 - 1 December 2014

This is the final lecture.

Minimum spanning tree

Suppose we have an undirected connected graph with edge values (each edge or connection between nodes has a value associated to it) that is not a tree (there could be more than one path between a pair of nodes)

The minimum spanning tree problem is to construct a tree from this graph where the sum of the edge values is the minimum.

In this graph we can see that 15 and 9 are edges that cannot be omitted as it would disconnected the other nodes from the rest of the graph (and a tree must be connected)

Note: when doing this try to write a pencil mark seperating the connected nodes from the rest of the tree

Now we have to connect them. From the node on the left of 15 we can see that there are two ways to connect them by going to 10 or 7. As we want the tree to be minimum we chose 7 to minimize the cost.

At this moment the edges 7, 9 and 15 are connected. From these connected nodes we want to connect to other nodes in the graph and 5 is the path with least value so we add 5 to our connection.

After we picked 5, we can see that the topmost node needs to be connected so we have to choose either 6 or 12 and 6 is chosen for its cost.

Then from the connected set of nodes we then have to connect the rest of the graph. We can do this by connecting edge 8 and 3

The resulting tree:

This algorithm is a greedy algorithm which use decision based on the guideline that applies to the local situation. In this case the guideline is that we chose the edge with minimum value. Specifically, this algorithm is called Prim's algorithm. The proof that this algorithm generates a correct minimum spanning tree is by contradiction.

The origin of Prim's algorithm is from finding a way to interconnect telephone networks.

Disjoint set

A disjoint set is a set of N objects where we want to perform the following operations:

Union: union(A, B) means that we're including set A into set B. (Thus B becomes the identity of both, and the set connected to A if any)
Find (connected): connected(A, B) checks whether A and B are in the same set

Pseudocode:

DisjointSet a = new DisjointSet(new int[]{0,1,2,3,4,5});
a.union(4, 3);
a.union(6, 5);
a.union(3, 5);
assert !a.connected(0, 3); // false
assert a.connected(4, 6); // true

API:

public class UF{ // union-find
    private int id[];
    public UF(int N){
        /**
         * Each elements in id have unique value but will change to the same
         * if they're conected
         * eg. [0, 1, 1, 8, 8, 0, 0, 1, 8, 8]
         * means that index 0, 5, 6 are connected to 0
         * index 1, 2, 7 are connected to 1
         * index 3, 4, 8, 9 are connected to 8
         */
        id = new int[N];
        for(int i = 0; i < N; i++){
            id[i] = i;
        }
    }
    void union(int p, int q){
        /**
         * Change all element with value as as id[p] to be id[q]
         */
        int pid = id[p];
        int qid = id[q];
        for(int i = 0; i < id.length; i++){
            if(id[i] == pid){
                id[i] = qid;
            }
        }
    }
    boolean connected(int p, int q){
        return id[p] == id[q];
    }
}

The problem here is union is O(n) so the quick union algorithm is created where only id[p] get changed to be id[q] then to find the identity of a node we have to walk through the parents of it until we found a node that its index is equals to its value.

private int root(int i){
    while(i != id[i]){
        i = id[i];
    }
    return i;
}
public boolean connected(int p, int q){
    return root(p) == root(q);
}
public void union(int p, int q){
    id[root(p)] = root(q);
}

The new problem here is finding the root of a node can take several walk. To remedy this we try to limit the height of the tree to be 1-2 levels. We can do this by doing path compressing.

Path compression is to change the parent of an element to be their root recursively so that finally all nodes in a tree are pointed to the root. This is called two-pass. There is also one-pass where parent of each elements are set to their grandfather.

One pass:

private int root(int i){
    while(i != id[i]){
        id[i] = id[id[i]];
        i = id[i];
    }
    return i;
}

The two pass variant is used in the maze generation lab.

The worst case running time of M union-find operations on a set of N objects is O(N+Mlog(N))