---------- OUTLINE ----------
Office Hours
Introduction
Test Coverage And Tips
Traversals
* Bellman-Ford
* Dijkstra's
* Topological
* Breadth First Search
* Depth First Search
MST's / Kruskal's
Design Problem (Shortest Path, Least Edges)
-- FIRST HOUR --
Break
Tree Invariants
* Search
* AVL
* Heap
General Questions
Bragging / Complaining
Closing
Office Hours

---------- INTRODUCTION ----------
I won't be explicitly covering the material from before the second midterm. You are certainly welcome to ask questions about it during the general question section at the end, or during my office hours, which I plan to hold after discussion. I'm not going to spend much time on applying the algorithms to real world situations. Although they are very interesting, you have already seen a number of them, and many of them (especially those whose difficulty is within the scope of this class) are fairly straightforward. For example, let's look at networking. We can represent each router or cell (in a cell phone network) as a node in the graph and give each edge a weight representing the transmission cost, time, voltage required, or current traffic/congestion. From here, figuring out the cheapest/fastest way to route is just a straightforward application of one of the graph traversal algorithms (likely Dijkstra's). Instead, in the following notes I plan to present a synthesized view of the topics from the last two weeks, organized using an alternative taxonomy. Specifically, the review of this material will be motivated by a discussion of techniques for traversing trees and graphs. This, I believe, is the overarching theme from the last portion of the course and should serve to motivate all of the material on trees, graphs, and project 3. We will be doing specific, interactive examples along the way (especially with the graph traversals). Please ask questions as we go if something is not clear.
I will begin with some advice on what may be on the test and how to study for it. Then we will cover the material roughly from the most general / conceptual to the most specific.

---------- TEST COVERAGE AND TIPS ----------
I haven't seen the test, so this is mostly speculation. It would be best to be familiar with everything covered in the course. Not only may you be tested directly on the earlier material, but it also serves as the basis for understanding the more recent material. If you have limited time, then it may be wise to focus your effort on the topics covered after the second midterm; at least you will have reviewed the old material once (for the midterms). The questions on the final will likely be similar to those on the second midterm. That is, there will be some execution problems (where you are asked to carry out one of the algorithms or use one of the data structures on some given input), application problems (where you are given some problem and asked to determine which algorithm and/or data structure to use to solve it), and design and analysis problems (where you are given some problem and asked to design an algorithm and/or data structure to solve it). I would suggest that you first get familiar and comfortable with the structures and algorithms. Just understanding the basic ideas or writing down the pseudo-code will not be sufficient; you should become familiar with the intricacies before the test. You can do this by running through example executions of the algorithms and data structures, or by implementing them. Beyond that, I would suggest challenging yourself with difficult design problems; we will look at some today.

---------- RUN TIMES ----------
--- TREES
Pre, in, and post order traversal - O(n). Can also be expressed as O(|V|). Notice that in a tree, |E| = |V| - 1, so O(|V|) = O(|E|).
Insert, search, or remove in/of/from a binary search tree - O(log(n)) on average; O(n) worst case if the tree is unbalanced.
Insert, search in/of an AVL tree - O(log(n)).
findMin on a heap - O(1).
Insert in a heap - O(log(n)) worst case, O(1) on average.
deleteMin on a heap - O(log(n)).
--- GRAPHS
Adjacency list initialization - O(|V| + |E|).
Adjacency list space requirement - O(|V| + |E|).
Adjacency matrix initialization - O(|V|^2).
Adjacency matrix space requirement - O(|V|^2).
Bellman-Ford - O(|E|*|V|).
Dijkstra's - O(|E|*log(|V|)) (with a heap) or O(|V|^2) in a dense graph.
Topological - O(|V| + |E|).
BFS - O(|V| + |E|).
MST - O(|E|log(|E|)).

---------- TERMINOLOGY ----------
Be careful to distinguish between the shortest path in terms of weight or distance and the shortest path in terms of edges or depth. In some cases (unweighted shortest path / breadth first search, for example), there isn't a difference. In the other algorithms, however, the difference is important.

---------- TRAVERSALS ----------
1. Initialize all nodes: weight = Infinity, prev = self
2. Initialize the start node(s) (weight = 0, prev = self); put on queue
3. While there are more nodes
4.     Pull off the first one
5.     Should we look at this node?
6.     Take "note" of visit
7.     If yes, for each neighbor
8.         If shorter path through current, update weight, prev of neighbor
9.         If condition, enqueue neighbor

--- BELLMAN-FORD
Bellman-Ford is the most general of the graph traversal / shortest path algorithms; it can handle graphs with or without cycles and unweighted, weighted, and negative weight edges. It can also detect the presence of negative weight cycles. In short, it can handle any graph. This generality comes at the cost of performance. Let's begin with the modifications to the generalized form needed to describe the algorithm:
2. Start node weight = 0, prev = self
5. If node visited more than |V| times, we have a neg cycle
6. Increment visit count of current node
9. Condition - If shorter path through current and neighbor not already in queue
Now let's do an example. [Write up the start graph, have them work on it individually for 5 minutes, and then review the solution.
Draw up the final shortest path tree in the shape of a graph and a tree.] [ASCIIgraphical solution and shortest path tree/graph.] The basic idea behind the algorithm is fairly straightforward: We just keep looking at vertices until all of the shortest paths have been determined (this might be seen as the entire system "stabilizing"). Each time the shortest path for a node changes, we reconsider it and notify its neighbors of the change by checking their path lengths. One key feature of this algorithm is that its fundamental operation, the checking and updating of neighbor nodes, is "safe". That is, although it may be more expensive to do so, we can execute it as many times as we like without endangering the correctness of the algorithm. Notice the efficiency improvement mentioned in the pseudo-code. We know, from what we just said, that it is safe to leave it out. How do we know it is safe to include it? All we need to ensure, each time we update the shortest path to a node, is that it is mentioned somewhere in the queue so that the node will be reconsidered at some point in the future. When the node is being reconsidered, only the latest / best path to the node will be available, so it doesn't matter how many other paths were considered. Suppose node A has its path length updated and gets put in the queue. Now suppose it is updated again (a shorter path was found to A). Whether or not we put A in the queue again, the first revisit to A will consider the path created by the second update to A. If we do put it in the queue again, then the second revisit to A will either be a repeat of the first, or will update based on an even shorter path, the discovery of which will insert another copy of A into the queue. Bellman-Ford is fundamentally a breadth first search. During (conceptual) iteration i, we are looking at shortest paths with depth (in terms of the number of edges) i. 
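Here is a rough Python sketch of the queue-based scheme described above, with the step numbers from the generalized pseudo-code as comments. The representation (a dict mapping each node to a list of (neighbor, weight) pairs) and the names are my own illustrative choices, not something from the course:

```python
from collections import deque

def bellman_ford(graph, start):
    """Queue-based Bellman-Ford. graph: dict mapping every node to a list
    of (neighbor, weight) pairs. Returns (dist, prev); raises ValueError
    when a negative-weight cycle is detected."""
    dist = {v: float('inf') for v in graph}   # 1. weight = Infinity
    prev = {v: v for v in graph}              #    prev = self
    visits = {v: 0 for v in graph}
    dist[start] = 0                           # 2. start node weight = 0
    queue = deque([start])
    in_queue = {v: False for v in graph}
    in_queue[start] = True
    n = len(graph)
    while queue:                              # 3. while there are more nodes
        u = queue.popleft()                   # 4. pull off the first one
        in_queue[u] = False
        visits[u] += 1                        # 6. take note of the visit
        if visits[u] > n:                     # 5. visited more than |V| times
            raise ValueError("negative-weight cycle detected")
        for v, w in graph[u]:                 # 7. for each neighbor
            if dist[u] + w < dist[v]:         # 8. shorter path through current
                dist[v] = dist[u] + w
                prev[v] = u
                if not in_queue[v]:           # 9. only enqueue if not already queued
                    queue.append(v)
                    in_queue[v] = True
    return dist, prev
```

Note that the line-9 check is exactly the "safe" efficiency improvement discussed above: leaving it out only adds redundant work, it never changes the answer.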
Unlike the more straightforward unweighted shortest path algorithm (which is also called graph breadth first search), we can visit a node more than once, in order to satisfy the orthogonal concern of finding the shortest (least weight) path. [Example - Just a linked list of nodes.] How do we know that we have detected a negative cycle when we visit any node more than |V| times? Cycle detection is actually fairly simple. We can set the limit on the number of visits to any node as high as we like and, if a negative cycle is present, that limit will eventually be reached (the algorithm will just keep cycling through and incrementing the visit counts for the involved nodes). We might, for example, set the max visit count to (|E| * |V|) and be confident that it is at least correct. Of course, the efficiency is horrible. So how do we lower the visit count required to detect a negative cycle while ensuring correctness? Our goal is to go as low as we can, without conflicting with any legitimate (non-cycle) case. So what is the largest legitimate visit count we can achieve? Begin by choosing an arbitrary destination node. Suppose that, in the worst case, each time we increase the depth (the number of edges) of the paths we are looking at, we find a shorter (lower weight) path to this destination. In this case, the visit count for the destination node will equal the maximum legal path depth for that graph. The assertion we want to prove is that this maximum legal path depth is (|V| - 1). Suppose we have a shortest path that touches every node in the graph, with one edge going in and one edge coming out of every node, except the start and destination nodes (which only have an outgoing and incoming edge, respectively). The total depth here will be (|V| - 1). Now suppose we choose two arbitrary nodes (they must be in the path because every node is in the path). 
What if we were to take the path from node 2 through node 1 and then to the destination node, instead of taking the path directly from node 2 to the destination? Wouldn't this increase the path depth to |V|? The only reason we would make such a change is if

    a + b + e > a + b + c + d   (otherwise we would just choose the path a, b, e)

and

    a + d > a + b + e   (otherwise we would just choose the path a, d).

Together these lead us to conclude that

    a + d > a + b + c + d
    0 > b + c

Therefore, the sum of b and c must be less than 0; we have a negative weight cycle. [Diagram]

I'm not going to provide a proof of the run time for Bellman-Ford, but, intuitively, it seems that we can accomplish quite a bit by looking at each edge in the graph for each vertex in the graph. The run time for Bellman-Ford is O(|E||V|).

--- DIJKSTRA'S
Dijkstra's algorithm is the first special case / optimization on the Bellman-Ford algorithm that we will look at. It is able to achieve a better run time by limiting the set of graphs it will work on. In other words, it uses special information about its input (specifically, that none of the edge weights will be negative) to improve its performance. Here are the modifications to the generalized pseudo-code:
1. Also set visited = false
2. Also set visited = true
3. and totalVisitCount < |V|
4. (will be node with shortest path)
5. If node hasn't been visited
6. Mark node visited and totalVisitCount++
9. Condition - Shorter path to neighbor through current
Let's begin with an example: [Example - add elements to the set.]
Run time - look at the pseudo-code - we will only visit each node once. Conceptually, Dijkstra's discovers all shortest paths by building the set of nodes with known shortest paths from one node (the start node) to all nodes, using the set as an aid in discovering the next shortest path. In the worst case, we will need to look at each edge in the graph once. The optimization in line 3 of the pseudo-code may reduce this on average, but the worst case still stands.
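Here is a rough Python sketch of this scheme using a binary heap as the priority queue. Rather than updating entries in place, it leaves stale entries in the heap and skips them when popped; the graph representation (a dict mapping each node to a list of (neighbor, weight) pairs) is an illustrative choice:

```python
import heapq

def dijkstra(graph, start):
    """Heap-based Dijkstra's. graph: dict mapping every node to a list of
    (neighbor, weight) pairs; all weights must be non-negative.
    Returns a dict of shortest distances from start."""
    dist = {v: float('inf') for v in graph}   # 1. weight = Infinity
    dist[start] = 0                           # 2. start node weight = 0
    visited = set()
    heap = [(0, start)]                       # may hold stale duplicate entries
    while heap and len(visited) < len(graph): # 3. and totalVisitCount < |V|
        d, u = heapq.heappop(heap)            # 4. node with shortest known path
        if u in visited:                      # 5. skip already-visited (stale) entries
            continue
        visited.add(u)                        # 6. mark visited
        for v, w in graph[u]:
            if d + w < dist[v]:               # 9. shorter path to neighbor through u
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist
```

Leaving stale entries in the heap is the usual trade-off with Python's heapq, which has no decrease-key operation; the `u in visited` check discards them harmlessly.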
Because we only look at each node once, we can guarantee that we will only look at each edge once. At worst, there will be an entry in the priority queue for each of the |E| edges at one time. Processing all of these edges will take |E|log(|E|) time. We can "reduce" this to O(|E|log(|V|)) using a math trick based on the slow-growing nature of the log: we know that |E| <= |V|^2 (with equality in the case of a complete graph). Therefore, |E|log(|E|) is, in the worst case, at most |E| * log(|V|^2) = |E| * 2log(|V|) = O(|E|log(|V|)).

--- TOPOLOGICAL
The topological sort approach, like Dijkstra's, is another variation on Bellman-Ford that uses special information (the topological ordering of the nodes) to decrease the run time. Let's begin, as usual, with the pseudo-code:
1. Also compute indegrees, set visited = false
2. Put all 0-indegree nodes on queue
5. Only if we haven't visited it before
6. visited = true, remove all edges
9. Condition - If indegree is 0
Now let's do an example: [Include example and hierarchically drawn topo ordering.]
There are two ways to think about the topological sort shortest path search:
1. In the first version, we split the process into a topological sort phase and a shortest path search phase. We might visualize topologically sorting the nodes in a graph by grabbing the nodes which initially have indegree 0 and shaking the tree so that all the arrows on the edges are pointing downward. If this is not possible, then we have a cycle. Of course, there are algorithms for topological sorting. Once topologically sorted, we could traverse the graph from highest to lowest. The distance of each node would then be the minimum, over its parents, of the sum of the parent's distance and the weight of the edge connecting the two.
2. Alternatively, and in line with the approach we have taken with the other shortest path algorithms, we might view the topological sorting as simply a way of deciding what values go in the queue. Determining the shortest path would take the normal form.
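The second viewpoint can be sketched in Python as follows (the representation, a dict mapping each node to a list of (neighbor, weight) pairs, and the names are my own illustrative choices). "Removing" an edge becomes decrementing the target's indegree, and a node is enqueued exactly when its indegree hits 0:

```python
from collections import deque

def dag_shortest_paths(graph, start):
    """Shortest paths in a DAG via the indegree-queue (topological) scheme.
    graph: dict mapping every node to a list of (neighbor, weight) pairs.
    Raises ValueError if the graph contains a cycle."""
    indegree = {v: 0 for v in graph}          # 1. compute indegrees
    for u in graph:
        for v, _ in graph[u]:
            indegree[v] += 1
    dist = {v: float('inf') for v in graph}
    dist[start] = 0
    queue = deque(v for v in graph if indegree[v] == 0)  # 2. all 0-indegree nodes
    processed = 0
    while queue:
        u = queue.popleft()
        processed += 1                        # 6. visit u once
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:         # 8. relax the edge (inf + w stays inf)
                dist[v] = dist[u] + w
            indegree[v] -= 1                  # 6. "remove" the edge u -> v
            if indegree[v] == 0:              # 9. enqueue when indegree reaches 0
                queue.append(v)
    if processed < len(graph):                # some nodes never reached indegree 0
        raise ValueError("graph has a cycle")
    return dist
```

Note how the start node is not treated specially beyond getting distance 0; it simply falls into the topological order, as discussed below.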
From the pseudo-code provided above, it can be seen that we will visit each node at most once. As such, we can only possibly traverse each edge once. Therefore, we have a run time of O(|V| + |E|).
One possibly tricky point to take note of: unlike the other shortest path algorithms, we do not "begin" at the start node. Instead, we simply set the distance of the start node to 0 and let it fall naturally into the topological ordering. Once we have run the algorithm, nodes with infinite weight are unreachable from the start node, while the others have a shortest path equal to their weight.

--- BREADTH FIRST SEARCH
The unweighted shortest path algorithm, also known as breadth first search, is the simplest of the shortest path algorithms. It can be considered a special case of Dijkstra's, and therefore has increased performance relative to Dijkstra's.
5. Always visit a node
9. Condition - Weight was Infinity (hadn't been visited)
Because a node may only be enqueued once, we only visit each node once. As shown above, this leads to the conclusion that we only traverse each edge once and that the algorithm has a run time of O(|V| + |E|).
Breadth first search of a tree is simply considered a special case of the graph breadth first search algorithm.

--- DEPTH FIRST SEARCH
PreOrder, InOrder, and PostOrder tree traversals are all considered depth first search.

--- MAJOR THEMES
The way that you traverse a structure, such as a tree or graph, determines not only the output, but may also shape the structure. When you traverse a graph, for example, the traversal type determines the resulting tree. As we saw in project 3, your choice of traversal type (DFS, BFS) actually modifies the structure of the game tree. Every type of traversal requires some auxiliary structure (specifically, a stack or a queue). Even if we use recursion (as in a DFS), we are still implicitly using a stack.
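To make the implicit stack concrete, here is a small Python sketch comparing a recursive preorder traversal with an iterative one that manages the stack explicitly; both produce the same visit order. The tree representation (a dict mapping each node to a (left, right) pair, with None for a missing child) is just an illustrative choice:

```python
def preorder_recursive(tree, node, out):
    """Preorder DFS using the call stack implicitly.
    tree: dict mapping node -> (left_child, right_child)."""
    if node is None:
        return
    out.append(node)                 # visit, then recurse left, then right
    left, right = tree[node]
    preorder_recursive(tree, left, out)
    preorder_recursive(tree, right, out)

def preorder_iterative(tree, root):
    """The same traversal with an explicit stack. The right child is pushed
    first so that the left child is popped (visited) first."""
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        out.append(node)
        left, right = tree[node]
        stack.append(right)
        stack.append(left)
    return out
```

Swapping the explicit stack for a queue in the iterative version would turn this depth first traversal into a breadth first (level-order) one, which is exactly the "auxiliary structure" theme above.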
We can view all of the shortest path algorithms as having the same fundamental structure, with differences only in how values are put onto and taken from the queue.

---------- MINIMUM SPANNING TREES / KRUSKAL'S ----------
Finding a minimum spanning tree requires the construction of a tree from a graph, through the systematic selection of edges, but it is not similar enough to the shortest path algorithms to fit under the same structure. Also unlike the other algorithms, minimum spanning trees are applied to undirected graphs. Kruskal's algorithm is a straightforward technique for finding MSTs. Here is the pseudo-code for Kruskal's:
Sort all edges or put all edges in a priority queue / minHeap
edgeCount = 0
While edgeCount < (|V| - 1)
    Get the next least-weight edge
    If new edge won't create a cycle
        Add to tree
        edgeCount++
        Join vertex sets connected by edge
[Example]
When implementing this algorithm, we have a choice of how the edges should be sorted. We can sort them all at once, up front, using Quick Sort. Alternatively, we can sort them as we go, by initially inserting them into a heap and doing a removeMin when we need the next edge. Although Quick Sort will be faster than Heap Sort across all the edges, it is likely that the algorithm will terminate before looking at all the edges. I therefore recommend that the heap version be used. In either case, the running time is O(|E|log(|E|)).

---------- DESIGN PROBLEM 1 (SHORTEST PATH, LEAST EDGES) ----------
Question (from the book): Explain how to modify Dijkstra's algorithm so that, if there is more than one minimum path from the start node to any other node, a path with the fewest edges is chosen.
My answer: Add some small constant (such as 0.000001) to the weight of each edge. If two paths are equal in their major term (everything left of the decimal place), then the one with the smaller decimal part (proportional to the number of edges it contains) will be chosen. The constant must be chosen small enough that, even summed over the longest possible path, it cannot change the relative order of paths whose original weights differ.
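To check that the epsilon trick behaves as claimed, here is a small Python experiment; the graph, the constant, and all names are made up for illustration. The plain graph has two weight-2 paths from 'a' to 'd', arranged so that unmodified Dijkstra's happens to commit to the 3-edge path; with the padded weights it picks the 2-edge path instead:

```python
import heapq

def dijkstra_prev(graph, start):
    """Plain heap-based Dijkstra's returning (dist, prev).
    graph: dict mapping every node to a list of (neighbor, weight) pairs."""
    dist = {v: float('inf') for v in graph}
    prev = {v: None for v in graph}
    dist[start] = 0
    heap = [(0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev

# Two weight-2 paths from 'a' to 'd': a->x->d (2 edges) and a->p->q->d (3 edges).
# The zero-weight edges make the 3-edge path get settled first, so plain
# Dijkstra's never overwrites the tie.
graph = {'a': [('x', 1), ('p', 0)], 'x': [('d', 1)],
         'p': [('q', 0)], 'q': [('d', 2)], 'd': []}
_, prev_plain = dijkstra_prev(graph, 'a')    # prev_plain['d'] == 'q' (3-edge path)

# Pad every edge with a tiny epsilon. It must be small enough that |V| * EPS
# can never bridge a real difference in path weights.
EPS = 1e-6
padded = {u: [(v, w + EPS) for v, w in edges] for u, edges in graph.items()}
_, prev_padded = dijkstra_prev(padded, 'a')  # prev_padded['d'] == 'x' (2-edge path)
```

An alternative to the padding, for what it's worth, would be to track an edge count per node and break weight ties on it explicitly, which avoids the floating point caveat; the padding is the book-style answer sketched above.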
---------- TREE INVARIANTS ----------
I will motivate the remainder of the discussion, which is about trees, with the various invariants that define each type of tree. That is, given the tree invariants, the operations and algorithms needed to enforce those invariants should come (somewhat) naturally.

--- SEARCH TREES
Invariant - For each node, all values in the left subtree must be less than the node's value and all values in the right subtree must be greater.
You should be familiar with Binary Search Trees and their operations by now. Insertion is incredibly straightforward. Removal is a bit more complicated. In particular, when removing a node with two children, replace that node with the smallest element in the right subtree (which will be the leftmost element in the right subtree). We know that this element is less than all of the other elements in the right subtree, thereby ensuring the search tree invariant for the right subtree. We also know that every element in the right subtree is greater than every element in the left subtree, thereby ensuring the search tree invariant for the left subtree. The run time for insertions and removals on a binary search tree is O(log(n)) on average; an unbalanced tree can degrade to O(n) per operation. Experiments show that n random insertions to an unbalanced binary search tree will cost about 1.38nlog(n); this is still O(log(n)) per insertion, but the degenerate cases (such as inserting elements in sorted order) motivate the use of balanced search trees.

--- AVL TREES
Invariant - For each node, the heights of the left and right subtrees may differ by at most one.
Tips for hand executing insertions to an AVL tree:
* After insertion, travel the path from the insertion node to the root, checking the AVL property as you go.
* When you find a problem, rebalance. This should be it; no more rebalances should be necessary to guarantee the AVL property for the whole tree.
* If an insertion to the outside causes a problem, do a single rotation.
* If an insertion to the inside causes a problem, do a double rotation.
* Be pragmatic!
* You should visually inspect that the search and AVL properties are maintained after each operation.
* Write down the tree forms that require the different rotations. If you practice enough, you will memorize these.
[Interactive example. Tell them they can use their book or cheat sheet for the general form of the rotations.]
The run time for AVL tree insert and search is O(log(n)). We didn't cover removals.

--- HEAPS
Invariant - For each node, its value must be less than or equal to all values in its left and right subtrees. Also, the heap must be a complete binary tree.
Binary heaps are our priority queue implementation.
Operations:
Insert - Insert the new value into the next open spot at the base of the tree (while preserving the complete tree property). percolateUp by swapping with the value's parent until the heap order property is satisfied.
deleteMin - Put the last value from the base of the tree (while preserving the complete tree property) in the open spot at the root of the tree. percolateDown by swapping with the lesser of the two children until the heap order property is satisfied.
[Interactive example.]
In the worst case, and under simple analysis, insertions will take O(log(n)). Experimental data shows that, on average, 1.6 swaps are required per insertion. Therefore, we might consider insertion a constant time operation on average. deleteMin generally requires that the value taken from the base of the tree and placed at the root percolate back down toward the base. The heap is guaranteed to have a height of log(n), so the run time of deleteMin is O(log(n)), and it generally requires the full time to complete.
The idea behind Heap Sort is pretty straightforward: insert the elements into a heap, and pull them out one by one, in order, using deleteMin. The only really tricky part is efficiently getting the elements into a heap. We could insert them one at a time, which would cost O(nlog(n)) for all elements.
This would be acceptable, as it is on par with the deleteMin time of O(nlog(n)) across all elements. However, we can do better. Begin by inserting all of the elements into a complete binary tree (without enforcing any invariants on the values). Then, beginning at the second-to-lowest level in the tree and working level by level up toward the root, call percolateDown on every node. This runs in time linear in the number of edges, which is a constant multiple of the number of nodes; the algorithm runs in linear time. An analysis of the run time and a correctness proof are provided in the book.
[Example]

---------- BRAGGING / COMPLAINING ----------
Math 1A,B, 53-55
CS61A-C
Systems - 122 (Networking), 162 (OSs), 164 (Compilers and Languages), 186 (DBs), 194 (Dist Sys)
Theory - 170 (Intro to Theory), 172, 174
Hardware - EE* and CS150, 152 (Architecture)
Misc - 160 (UI), 184 (Graphics), 188 (AI)
Transcendent - Security (new class), 169 (Software Engineering)
As an alternative to the prerequisite chart distributed by the department, I have provided my own taxonomy of the CS classes offered. I'm an undergraduate in EECS here at Berkeley. I've taken 7 of the upper division classes offered here - all of what I've called the "systems" classes, 170 (Intro to Theory), and CS152 (Architecture). I'd like to simultaneously brag and complain about my experiences in those classes, in order to give an idea of what is to come for the CS majors here and what the non-CS majors have avoided or missed (depending on your perspective). In EECS122 (Networks), my solution to the first project was a 2K line C program (which implemented a centralized peer-to-peer network), the second project was a week-long black box analysis of TCP, and my solution to the third project was a 4K line C program (which implemented the Chord decentralized overlay network). Each of the four projects in CS162 (OSs) is about the size of the third project in this class.
Some of them, especially those that involve multithreading, are substantially more conceptually challenging. Further, the median score on 3 of the 4 projects was a perfect score. On one project, my partners and I (we worked in groups of 4) spent an entire day finding one bug. On another project, one of my project partners and I spent an entire weekend - three full 12-hour days (not to mention that they were very nice sunny days) - in the lab, fixing what turned out to be a bug in the code that had been given to us, which we were not allowed to modify directly. Not only did it turn out that this bug had been known about for a couple of years (we went back to old newsgroup posts), but our fix for the problem actually broke our submission on some of their test cases, when we had previously had a perfect submission. We were eventually given back all three slip days and all the points. I'm still waiting for my weekend back. In 152 (Architecture), where we built a processor, we were required to keep notebooks detailing when we were in the lab and what we did. One of my project partners and I logged over 400 hours each, while the other two partners logged more than 300 hours each. This was in addition to lectures and discussions, reading (I read the entire book), 4 large homeworks, two midterms, and the final. So what can we take away from all of this? First, CS is a lot of work. Of course, you can put in less, but you will probably leave with less (in terms of knowledge and in terms of grades). Even spending a lot of time doesn't guarantee good grades; I don't have a perfect record. Second, competition is fierce. This often detracts from the whole experience. I guess this is what we expect when we put so many hard-working people together. Finally, if it interests you, CS is a lot of fun and really interesting; I simply wouldn't have spent this much time if I hadn't been enjoying myself.

---------- CLOSING ----------
Thank you guys for coming today and for taking this class.
You've done a lot of work, and hopefully learned a lot along the way. I've enjoyed teaching you, and I hope you've come away knowing more both about how to write programs and about some of the broader themes of CS.