---------- OUTLINE ----------
Office Hours
Introduction
Test Coverage And Tips
Traversals
* Bellman-Ford
* Dijkstra's
* Topological
* Breadth First Search
* Depth First Search
MST's / Kruskal's
Design Problem (Shortest Path, Least Edges)
-- FIRST HOUR --
Break
Tree Invariants
* Search
* AVL
* Heap
General Questions
Bragging / Complaining
Closing
Office Hours

---------- INTRODUCTION ----------
I won't be explicitly covering the material from before the second midterm. You are certainly welcome to ask questions about it during the general question section at the end, or during my office hours, which I plan to hold after discussion. I'm not going to spend much time on applying the algorithms to real world situations. Although they are very interesting, you have already seen a number of them, and many of them (especially those whose difficulty is within the scope of this class) are fairly straightforward. For example, let's look at networking. We can represent each router or cell (in a cell phone network) as a node in the graph and give each edge a weight representing the transmission cost, time, voltage required, or current traffic/congestion. From here, figuring out the cheapest/fastest way to route is just a straightforward application of one of the graph traversal algorithms (likely Dijkstra's). Instead, in the following notes I plan to present a synthesized view of the topics from the last two weeks, organized using an alternative taxonomy. Specifically, the review of this material will be motivated by a discussion of techniques for traversing trees and graphs. This, I believe, is the overarching theme from the last portion of the course and should serve to motivate all of the material on trees, graphs, and project 3. We will be doing specific, interactive examples along the way (especially with the graph traversals). Please ask questions as we go if something is not clear.
I will begin with some advice on what may be on the test and how to study for it. Then we will cover the material roughly from the most general / conceptual to the most specific.

---------- TEST COVERAGE AND TIPS ----------
I haven't seen the test, so this is mostly speculation. It would be best to be familiar with everything covered in the course. Not only may you be tested directly on the earlier material, but it also serves as the basis for understanding the more recent material. If you have limited time, then it may be wise to focus your effort on the topics covered after the second midterm; at least you will have reviewed the old material once (for the midterms). The questions on the final will likely be similar to those on the second midterm. That is, there will be some execution problems (where you are asked to carry out one of the algorithms or use one of the data structures on some given input), application problems (where you are given some problem and asked to determine which algorithm and/or data structure to use to solve it), and design and analysis problems (where you are given some problem and asked to design an algorithm and/or data structure to solve it). I would suggest that you first get familiar and comfortable with the structures and algorithms. Just understanding the basic ideas or writing down the pseudo-code will not be sufficient; you should become familiar with the intricacies before the test. You can do this by running through example executions of the algorithms and data structures, or by implementing them. Beyond that, I would suggest challenging yourself with difficult design problems; we will look at some today.

---------- RUN TIMES ----------
--- TREES
Pre, in, and post order traversal - O(n). Can also be expressed as O(|V|). Notice that in a tree, |E| = |V| - 1, so O(|V|) = O(|E|).
Insert, search, or remove in/of/from a binary search tree - O(log(n)) on average; O(n) worst case if the tree is unbalanced.
Insert, search in/of an AVL tree - O(log(n)).
findMin on a heap - O(1).
Insert in a heap - O(log(n)) worst case, O(1) on average.
deleteMin on a heap - O(log(n)).
--- GRAPHS
Adjacency list initialization - O(|V| + |E|).
Adjacency list space requirement - O(|V| + |E|).
Adjacency matrix initialization - O(|V|^2).
Adjacency matrix space requirement - O(|V|^2).
Bellman-Ford - O(|E|*|V|).
Dijkstra's - O(|E|*log(|V|)) (with a heap) or O(|V|^2) in a dense graph.
Topological - O(|V| + |E|).
BFS - O(|V| + |E|).
MST - O(|E|log(|E|)).

---------- TERMINOLOGY ----------
Be careful to distinguish between the shortest path in terms of weight or distance and the shortest path in terms of edges or depth. In some cases (unweighted shortest path / breadth first search, for example), there isn't a difference. In the other algorithms, however, the difference is important.

---------- TRAVERSALS ----------
1. Initialize all nodes: weight = Infinity, prev = self
2. Initialize the start node(s) (weight = 0, prev = self); put on queue
3. While there are more nodes
4.     Pull off the first one
5.     Should we look at this node?
6.     Take "note" of visit
7.     If yes, for each neighbor
8.         If shorter path through current, update weight, prev of neighbor
9.         If condition, enqueue neighbor

--- BELLMAN-FORD
Bellman-Ford is the most general of the graph traversal / shortest path algorithms; it can handle graphs with or without cycles and unweighted, weighted, and negative weight edges. It can also detect the presence of negative weight cycles. In short, it can handle any graph. This generality comes at the cost of performance. Let's begin with the modifications to the generalized form needed to describe the algorithm:
2. Start node weight = 0, prev = self
5. If node visited more than |V| times, we have a neg cycle
6. Increment visit count of current node
9. Condition - If shorter path through current and neighbor not already in queue
Now let's do an example. [Write up the start graph, have them work on it individually for 5 minutes, and then review the solution.
Draw up the final shortest path tree in the shape of a graph and a tree.] [ASCIIgraphical solution and shortest path tree/graph.] The basic idea behind the algorithm is fairly straightforward: We just keep looking at vertices until all of the shortest paths have been determined (this might be seen as the entire system "stabilizing"). Each time the shortest path for a node changes, we reconsider it and notify its neighbors of the change by checking their path lengths. One key feature of this algorithm is that its fundamental operation, the checking and updating of neighbor nodes, is "safe". That is, although it may be more expensive to do so, we can execute it as many times as we like without endangering the correctness of the algorithm. Notice the efficiency improvement mentioned in the pseudo-code. We know, from what we just said, that it is safe to leave it out. How do we know it is safe to include it? All we need to ensure, each time we update the shortest path to a node, is that it is mentioned somewhere in the queue so that the node will be reconsidered at some point in the future. When the node is being reconsidered, only the latest / best path to the node will be available, so it doesn't matter how many other paths were considered. Suppose node A has its path length updated and gets put in the queue. Now suppose it is updated again (a shorter path was found to A). Whether or not we put A in the queue again, the first revisit to A will consider the path created by the second update to A. If we do put it in the queue again, then the second revisit to A will either be a repeat of the first, or will update based on an even shorter path, the discovery of which will insert another copy of A into the queue. Bellman-Ford is fundamentally a breadth first search. During (conceptual) iteration i, we are looking at shortest paths with depth (in terms of the number of edges) i. 
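Here is a rough Python sketch of the queue-based scheme described above, with the step numbers from the generalized pseudo-code as comments. The representation (a dict mapping each node to a list of (neighbor, weight) pairs) and the names are my own illustrative choices, not something from the course:

```python
from collections import deque

def bellman_ford(graph, start):
    """Queue-based Bellman-Ford. graph: dict mapping every node to a list
    of (neighbor, weight) pairs. Returns (dist, prev); raises ValueError
    when a negative-weight cycle is detected."""
    dist = {v: float('inf') for v in graph}   # 1. weight = Infinity
    prev = {v: v for v in graph}              #    prev = self
    visits = {v: 0 for v in graph}
    dist[start] = 0                           # 2. start node weight = 0
    queue = deque([start])
    in_queue = {v: False for v in graph}
    in_queue[start] = True
    n = len(graph)
    while queue:                              # 3. while there are more nodes
        u = queue.popleft()                   # 4. pull off the first one
        in_queue[u] = False
        visits[u] += 1                        # 6. take note of the visit
        if visits[u] > n:                     # 5. visited more than |V| times
            raise ValueError("negative-weight cycle detected")
        for v, w in graph[u]:                 # 7. for each neighbor
            if dist[u] + w < dist[v]:         # 8. shorter path through current
                dist[v] = dist[u] + w
                prev[v] = u
                if not in_queue[v]:           # 9. only enqueue if not already queued
                    queue.append(v)
                    in_queue[v] = True
    return dist, prev
```

Note that the line-9 check is exactly the "safe" efficiency improvement discussed above: leaving it out only adds redundant work, it never changes the answer.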
Unlike the more straightforward unweighted shortest path algorithm (which is also called graph breadth first search), we can visit a node more than once, in order to satisfy the orthogonal concern of finding the shortest (least weight) path. [Example - Just a linked list of nodes.] How do we know that we have detected a negative cycle when we visit any node more than |V| times? Cycle detection is actually fairly simple. We can set the limit on the number of visits to any node as high as we like and, if a negative cycle is present, that limit will eventually be reached (the algorithm will just keep cycling through and incrementing the visit counts for the involved nodes). We might, for example, set the max visit count to (|E| * |V|) and be confident that it is at least correct. Of course, the efficiency is horrible. So how do we lower the visit count required to detect a negative cycle while ensuring correctness? Our goal is to go as low as we can, without conflicting with any legitimate (non-cycle) case. So what is the largest legitimate visit count we can achieve? Begin by choosing an arbitrary destination node. Suppose that, in the worst case, each time we increase the depth (the number of edges) of the paths we are looking at, we find a shorter (lower weight) path to this destination. In this case, the visit count for the destination node will equal the maximum legal path depth for that graph. The assertion we want to prove is that this maximum legal path depth is (|V| - 1). Suppose we have a shortest path that touches every node in the graph, with one edge going in and one edge coming out of every node, except the start and destination nodes (which only have an outgoing and incoming edge, respectively). The total depth here will be (|V| - 1). Now suppose we choose two arbitrary nodes (they must be in the path because every node is in the path). 
What if we were to take the path from node 2 through node 1 and then to the destination node, instead of taking the path directly from node 2 to the destination? Wouldn't this increase the path depth to |V|? The only reason we would make such a change is if

    a + b + e > a + b + c + d   (otherwise we would just choose the path a, b, e)

and

    a + d > a + b + e   (otherwise we would just choose the path a, d).

Together these lead us to conclude that

    a + d > a + b + c + d
    0 > b + c

Therefore, the sum of b and c must be less than 0; we have a negative weight cycle. [Diagram]

I'm not going to provide a proof of the run time for Bellman-Ford, but, intuitively, it seems that we can accomplish quite a bit by looking at each edge in the graph for each vertex in the graph. The run time for Bellman-Ford is O(|E||V|).

--- DIJKSTRA'S
Dijkstra's algorithm is the first special case / optimization on the Bellman-Ford algorithm that we will look at. It is able to achieve a better run time by limiting the set of graphs it will work on. In other words, it uses special information about its input (specifically, that none of the edge weights will be negative) to improve its performance. Here are the modifications to the generalized pseudo-code:
1. Also set visited = false
2. Also set visited = true
3. and totalVisitCount < |V|
4. (will be node with shortest path)
5. If node hasn't been visited
6. Mark node visited and totalVisitCount++
9. Condition - Shorter path to neighbor through current
Let's begin with an example: [Example - add elements to the set.]
Run time - look at the pseudo-code - we will only visit each node once. Conceptually, Dijkstra's discovers all shortest paths by building the set of nodes with known shortest paths from one node (the start node) to all nodes, using the set as an aid in discovering the next shortest path. In the worst case, we will need to look at each edge in the graph once. The optimization in line 3 of the pseudo-code may reduce this on average, but the worst case still stands.
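Here is a rough Python sketch of this scheme using a binary heap as the priority queue. Rather than updating entries in place, it leaves stale entries in the heap and skips them when popped; the graph representation (a dict mapping each node to a list of (neighbor, weight) pairs) is an illustrative choice:

```python
import heapq

def dijkstra(graph, start):
    """Heap-based Dijkstra's. graph: dict mapping every node to a list of
    (neighbor, weight) pairs; all weights must be non-negative.
    Returns a dict of shortest distances from start."""
    dist = {v: float('inf') for v in graph}   # 1. weight = Infinity
    dist[start] = 0                           # 2. start node weight = 0
    visited = set()
    heap = [(0, start)]                       # may hold stale duplicate entries
    while heap and len(visited) < len(graph): # 3. and totalVisitCount < |V|
        d, u = heapq.heappop(heap)            # 4. node with shortest known path
        if u in visited:                      # 5. skip already-visited (stale) entries
            continue
        visited.add(u)                        # 6. mark visited
        for v, w in graph[u]:
            if d + w < dist[v]:               # 9. shorter path to neighbor through u
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist
```

Leaving stale entries in the heap is the usual trade-off with Python's heapq, which has no decrease-key operation; the `u in visited` check discards them harmlessly.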
Because we only look at each node once, we can guarantee that we will only look at each edge once. At worst, there will be an entry in the priority queue for each of the |E| edges at one time. Processing all of these edges will take |E|log(|E|) time. We can "reduce" this to O(|E|log(|V|)) using a math trick based on the slow-growing nature of the log: we know that |E| <= |V|^2 (with equality in the case of a complete graph). Therefore, |E|log(|E|) is, in the worst case, at most |E| * log(|V|^2) = |E| * 2log(|V|) = O(|E|log(|V|)).

--- TOPOLOGICAL
The topological sort approach, like Dijkstra's, is another variation on Bellman-Ford that uses special information (the topological ordering of the nodes) to decrease the run time. Let's begin, as usual, with the pseudo-code:
1. Also compute indegrees, set visited = false
2. Put all 0-indegree nodes on queue
5. Only if we haven't visited it before
6. visited = true, remove all edges
9. Condition - If indegree is 0
Now let's do an example: [Include example and hierarchically drawn topo ordering.]
There are two ways to think about the topological sort shortest path search:
1. In the first version, we split the process into a topological sort phase and a shortest path search phase. We might visualize topologically sorting the nodes in a graph by grabbing the nodes which initially have indegree 0 and shaking the tree so that all the arrows on the edges are pointing downward. If this is not possible, then we have a cycle. Of course, there are algorithms for topological sorting. Once topologically sorted, we could traverse the graph from highest to lowest. The distance of each node would then be the minimum, over its parents, of the sum of the parent's distance and the weight of the edge connecting the two.
2. Alternatively, and in line with the approach we have taken with the other shortest path algorithms, we might view the topological sorting as simply a way of deciding what values go in the queue. Determining the shortest path would take the normal form.
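The second viewpoint can be sketched in Python as follows (the representation, a dict mapping each node to a list of (neighbor, weight) pairs, and the names are my own illustrative choices). "Removing" an edge becomes decrementing the target's indegree, and a node is enqueued exactly when its indegree hits 0:

```python
from collections import deque

def dag_shortest_paths(graph, start):
    """Shortest paths in a DAG via the indegree-queue (topological) scheme.
    graph: dict mapping every node to a list of (neighbor, weight) pairs.
    Raises ValueError if the graph contains a cycle."""
    indegree = {v: 0 for v in graph}          # 1. compute indegrees
    for u in graph:
        for v, _ in graph[u]:
            indegree[v] += 1
    dist = {v: float('inf') for v in graph}
    dist[start] = 0
    queue = deque(v for v in graph if indegree[v] == 0)  # 2. all 0-indegree nodes
    processed = 0
    while queue:
        u = queue.popleft()
        processed += 1                        # 6. visit u once
        for v, w in graph[u]:
            if dist[u] + w < dist[v]:         # 8. relax the edge (inf + w stays inf)
                dist[v] = dist[u] + w
            indegree[v] -= 1                  # 6. "remove" the edge u -> v
            if indegree[v] == 0:              # 9. enqueue when indegree reaches 0
                queue.append(v)
    if processed < len(graph):                # some nodes never reached indegree 0
        raise ValueError("graph has a cycle")
    return dist
```

Note how the start node is not treated specially beyond getting distance 0; it simply falls into the topological order, as discussed below.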
From the pseudo-code provided above, it can be seen that we will visit each node at most once. As such, we can only possibly traverse each edge once. Therefore, we have a run time of O(|V| + |E|).
One possibly tricky point to take note of: unlike the other shortest path algorithms, we do not "begin" at the start node. Instead, we simply set the distance of the start node to 0 and let it fall naturally into the topological ordering. Once we have run the algorithm, nodes with infinite weight are unreachable from the start node, while the others have a shortest path equal to their weight.

--- BREADTH FIRST SEARCH
The unweighted shortest path algorithm, also known as breadth first search, is the simplest of the shortest path algorithms. It can be considered a special case of Dijkstra's, and therefore has increased performance relative to Dijkstra's.
5. Always visit a node
9. Condition - Weight was Infinity (hadn't been visited)
Because a node may only be enqueued once, we only visit each node once. As shown above, this leads to the conclusion that we only traverse each edge once and that the algorithm has a run time of O(|V| + |E|).
Breadth first search of a tree is simply considered a special case of the graph breadth first search algorithm.

--- DEPTH FIRST SEARCH
PreOrder, InOrder, and PostOrder tree traversals are all considered depth first search.

--- MAJOR THEMES
The way that you traverse a structure, such as a tree or graph, determines not only the output, but may also shape the structure. When you traverse a graph, for example, the traversal type determines the resulting tree. As we saw in project 3, your choice of traversal type (DFS, BFS) actually modifies the structure of the game tree. Every type of traversal requires some auxiliary structure (specifically, a stack or a queue). Even if we use recursion (as in a DFS), we are still implicitly using a stack.
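To make the implicit stack concrete, here is a small Python sketch comparing a recursive preorder traversal with an iterative one that manages the stack explicitly; both produce the same visit order. The tree representation (a dict mapping each node to a (left, right) pair, with None for a missing child) is just an illustrative choice:

```python
def preorder_recursive(tree, node, out):
    """Preorder DFS using the call stack implicitly.
    tree: dict mapping node -> (left_child, right_child)."""
    if node is None:
        return
    out.append(node)                 # visit, then recurse left, then right
    left, right = tree[node]
    preorder_recursive(tree, left, out)
    preorder_recursive(tree, right, out)

def preorder_iterative(tree, root):
    """The same traversal with an explicit stack. The right child is pushed
    first so that the left child is popped (visited) first."""
    out, stack = [], [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        out.append(node)
        left, right = tree[node]
        stack.append(right)
        stack.append(left)
    return out
```

Swapping the explicit stack for a queue in the iterative version would turn this depth first traversal into a breadth first (level-order) one, which is exactly the "auxiliary structure" theme above.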
We can view all of the shortest path algorithms as having the same fundamental structure, with differences only in how values are put onto and taken from the queue.

---------- MINIMUM SPANNING TREES / KRUSKAL'S ----------
Finding a minimum spanning tree requires the construction of a tree from a graph, through the systematic selection of edges, but it is not similar enough to the shortest path algorithms to fit under the same structure. Also unlike the other algorithms, minimum spanning trees are applied to undirected graphs. Kruskal's algorithm is a straightforward technique for finding MSTs. Here is the pseudo-code for Kruskal's:
Sort all edges or put all edges in a priority queue / minHeap
edgeCount = 0
While edgeCount < (|V| - 1)
    Get the next least-weight edge
    If new edge won't create a cycle
        Add to tree
        edgeCount++
        Join vertex sets connected by edge
[Example]
When implementing this algorithm, we have a choice of how the edges should be sorted. We can sort them all at once, up front, using Quick Sort. Alternatively, we can sort them as we go, by initially inserting them into a heap and doing a removeMin when we need the next edge. Although Quick Sort will be faster than Heap Sort across all the edges, it is likely that the algorithm will terminate before looking at all the edges. I therefore recommend that the heap version be used. In either case, the running time is O(|E|log(|E|)).

---------- DESIGN PROBLEM 1 (SHORTEST PATH, LEAST EDGES) ----------
Question (from the book): Explain how to modify Dijkstra's algorithm so that, if there is more than one minimum path from the start node to any other node, a path with the fewest edges is chosen.
My answer: Add some small constant (such as 0.000001) to the weight of each edge. If two paths are equal in their major term (everything left of the decimal place), then the one with the smaller decimal part (proportional to the number of edges it contains) will be chosen. The constant must be chosen small enough that, even summed over the longest possible path, it cannot change the relative order of paths whose original weights differ.
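To check that the epsilon trick behaves as claimed, here is a small Python experiment; the graph, the constant, and all names are made up for illustration. The plain graph has two weight-2 paths from 'a' to 'd', arranged so that unmodified Dijkstra's happens to commit to the 3-edge path; with the padded weights it picks the 2-edge path instead:

```python
import heapq

def dijkstra_prev(graph, start):
    """Plain heap-based Dijkstra's returning (dist, prev).
    graph: dict mapping every node to a list of (neighbor, weight) pairs."""
    dist = {v: float('inf') for v in graph}
    prev = {v: None for v in graph}
    dist[start] = 0
    heap = [(0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev

# Two weight-2 paths from 'a' to 'd': a->x->d (2 edges) and a->p->q->d (3 edges).
# The zero-weight edges make the 3-edge path get settled first, so plain
# Dijkstra's never overwrites the tie.
graph = {'a': [('x', 1), ('p', 0)], 'x': [('d', 1)],
         'p': [('q', 0)], 'q': [('d', 2)], 'd': []}
_, prev_plain = dijkstra_prev(graph, 'a')    # prev_plain['d'] == 'q' (3-edge path)

# Pad every edge with a tiny epsilon. It must be small enough that |V| * EPS
# can never bridge a real difference in path weights.
EPS = 1e-6
padded = {u: [(v, w + EPS) for v, w in edges] for u, edges in graph.items()}
_, prev_padded = dijkstra_prev(padded, 'a')  # prev_padded['d'] == 'x' (2-edge path)
```

An alternative to the padding, for what it's worth, would be to track an edge count per node and break weight ties on it explicitly, which avoids the floating point caveat; the padding is the book-style answer sketched above.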
---------- TREE INVARIANTS ----------
I will motivate the remainder of the discussion, which is about trees, with the various invariants that define each type of tree. That is, given the tree invariants, the operations and algorithms needed to enforce those invariants should come (somewhat) naturally.

--- SEARCH TREES
Invariant - For each node, all values in the left subtree must be less than the node's value and all values in the right subtree must be greater.
You should be familiar with Binary Search Trees and their operations by now. Insertion is incredibly straightforward. Removal is a bit more complicated. In particular, when removing a node with two children, replace that node with the smallest element in the right subtree (which will be the leftmost element in the right subtree). We know that this element is less than all of the other elements in the right subtree, thereby ensuring the search tree invariant for the right subtree. We also know that every element in the right subtree is greater than every element in the left subtree, thereby ensuring the search tree invariant for the left subtree. The run time for insertions and removals on a binary search tree is O(log(n)) on average; an unbalanced tree can degrade to O(n) per operation. Experiments show that n random insertions to an unbalanced binary search tree will cost about 1.38nlog(n); this is still O(log(n)) per insertion, but the degenerate cases (such as inserting elements in sorted order) motivate the use of balanced search trees.

--- AVL TREES
Invariant - For each node, the heights of the left and right subtrees may differ by at most one.
Tips for hand executing insertions to an AVL tree:
* After insertion, travel the path from the insertion node to the root, checking the AVL property as you go.
* When you find a problem, rebalance. This should be it; no more rebalances should be necessary to guarantee the AVL property for the whole tree.
* If an insertion to the outside causes a problem, do a single rotation.
* If an insertion to the inside causes a problem, do a double rotation.
* Be pragmatic!
* You should visually inspect that the search and AVL properties are maintained after each operation.
* Write down the tree forms that require the different rotations. If you practice enough, you will memorize these.
[Interactive example. Tell them they can use their book or cheat sheet for the general form of the rotations.]
The run time for AVL tree insert and search is O(log(n)). We didn't cover removals.

--- HEAPS
Invariant - For each node, its value must be less than or equal to all values in its left and right subtrees. Also, the heap must be a complete binary tree.
Binary heaps are our priority queue implementation.
Operations:
Insert - Insert the new value into the next open spot at the base of the tree (while preserving the complete tree property). percolateUp by swapping with the value's parent until the heap order property is satisfied.
deleteMin - Put the last value from the base of the tree (while preserving the complete tree property) in the open spot at the root of the tree. percolateDown by swapping with the lesser of the two children until the heap order property is satisfied.
[Interactive example.]
In the worst case, and under simple analysis, insertions will take O(log(n)). Experimental data shows that, on average, 1.6 swaps are required per insertion. Therefore, we might consider insertion a constant time operation on average. deleteMin generally requires that the value taken from the base of the tree and placed at the root percolate back down toward the base. The heap is guaranteed to have a height of log(n), so the run time of deleteMin is O(log(n)), and it generally requires the full time to complete.
The idea behind Heap Sort is pretty straightforward: insert the elements into a heap, and pull them out one by one, in order, using deleteMin. The only really tricky part is efficiently getting the elements into a heap. We could insert them one at a time, which would cost O(nlog(n)) for all elements.
This would be acceptable, as it is on par with the deleteMin time of O(nlog(n)) across all elements. However, we can do better. Begin by inserting all of the elements into a complete binary tree (without enforcing any invariants on the values). Then, beginning at the second-to-lowest level in the tree and working level by level up toward the root, call percolateDown on every node. This runs in time linear in the number of edges, which is a constant multiple of the number of nodes; the algorithm runs in linear time. An analysis of the run time and a correctness proof are provided in the book.
[Example]

---------- BRAGGING / COMPLAINING ----------
Math 1A,B, 53-55
CS61A-C
Systems - 122 (Networking), 162 (OSs), 164 (Compilers and Languages), 186 (DBs), 194 (Dist Sys)
Theory - 170 (Intro to Theory), 172, 174
Hardware - EE* and CS150, 152 (Architecture)
Misc - 160 (UI), 184 (Graphics), 188 (AI)
Transcendent - Security (new class), 169 (Software Engineering)
As an alternative to the prerequisite chart distributed by the department, I have provided my own taxonomy of the CS classes offered. I'm an undergraduate in EECS here at Berkeley. I've taken 7 of the upper division classes offered here - all of what I've called the "systems" classes, 170 (Intro to Theory), and CS152 (Architecture). I'd like to simultaneously brag and complain about my experiences in those classes, in order to give an idea of what is to come for the CS majors here and what the non-CS majors have avoided or missed (depending on your perspective). In EECS122 (Networks), my solution to the first project was a 2K line C program (which implemented a centralized peer-to-peer network), the second project was a week-long black box analysis of TCP, and my solution to the third project was a 4K line C program (which implemented the Chord decentralized overlay network). Each of the four projects in CS162 (OSs) is about the size of the third project in this class.
Some of them, especially those that involve multithreading, are substantially more conceptually challenging. Further, the median score on 3 of the 4 projects was a perfect score. On one project, my partners and I (we worked in groups of 4) spent an entire day finding one bug. On another project, one of my project partners and I spent an entire weekend - three full 12-hour days (not to mention that they were very nice sunny days) - in the lab, fixing what turned out to be a bug in the code that had been given to us, which we were not allowed to modify directly. Not only did it turn out that this bug had been known about for a couple of years (we went back to old newsgroup posts), but our fix for the problem actually broke our submission on some of their test cases, when we had previously had a perfect submission. We were eventually given back all three slip days and all the points. I'm still waiting for my weekend back. In 152 (Architecture), where we built a processor, we were required to keep notebooks detailing when we were in the lab and what we did. One of my project partners and I logged over 400 hours each, while the other two partners logged more than 300 hours each. This was in addition to lectures and discussions, reading (I read the entire book), 4 large homeworks, two midterms, and the final. So what can we take away from all of this? First, CS is a lot of work. Of course, you can put in less, but you will probably leave with less (in terms of knowledge and in terms of grades). Even spending a lot of time doesn't guarantee good grades; I don't have a perfect record. Second, competition is fierce. This often detracts from the whole experience. I guess this is what we expect when we put so many hard-working people together. Finally, if it interests you, CS is a lot of fun and really interesting; I simply wouldn't have spent this much time if I hadn't been enjoying myself.

---------- CLOSING ----------
Thank you guys for coming today and for taking this class.
You've done a lot of work, and hopefully learned a lot along the way. I've enjoyed teaching you, and I hope you've come away knowing more both about how to write programs and about some of the broader themes of CS.