CMSC-441 Algorithms: Brief Lecture Notes (Fall 1998-Sherman)

These notes are intended to serve as a reminder of the major topics and ideas discussed during class; they are not intended to be a self-contained, detailed transcription of the class meeting.

Lecture 1 (September 1): Introduction

Introduction to algorithms, course, instructor, students. Course mechanics, expectations, required work, and grading policies. Student introductions and information sheets.

Course goals: design algorithms, analyze algorithms, learn fundamental algorithms, and learn mathematical techniques in support of analysis.

Criteria for evaluating and comparing algorithms: Resource usage (time, space, communication, energy), quality of results (correctness, performance ratio, special properties), and implementation difficulty (man hours, programming cost in dollars, lines of code). Discussion motivated by asking students to agree or disagree with the statement: "It doesn't matter how a program is written as long as it works correctly."

Overview of analysis techniques: analyzing straight-line programs, looping programs, and recursive programs. Setting up and solving summations and recurrences.

Example: Discussion of a mesh computer, motivated by the process of handing in student information sheets. Diameter and I/O bottleneck.

Example: Multiplication of two hundred-digit integers using three algorithms--elementary school algorithm (repeated addition), high school algorithm, and college algorithm (Schonhage-Strassen).


Lecture 2 (September 3): Orders of Growth and Asymptotic Notations

A Tale of Two Programs: a Cray-1 running finely optimized Fortran vs. a Radio Shack TRS-80 running interpreted Basic. The maximum subvector sum problem. Challenge: design a faster solution to this problem.

Intro to algorithm design and analysis. Design techniques. Exhaustive search of finite problems. Scanning algorithms. "Algorithm's Policy" (read policy, including description, I/O specs, correctness, analysis). Solution value vs. solution description.
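The maximum subvector sum problem makes a concrete contrast between exhaustive search and a scanning algorithm. A Python sketch (my reconstruction, not code from the lecture; the linear scan is the one commonly attributed to Kadane):

```python
def max_subvector_exhaustive(a):
    """Exhaustive search over all (i, j) subvectors, reusing running
    sums -- Theta(n^2) time."""
    best = 0                        # empty subvector has sum 0
    for i in range(len(a)):
        total = 0
        for j in range(i, len(a)):
            total += a[j]           # sum of a[i..j]
            best = max(best, total)
    return best

def max_subvector_scan(a):
    """Single left-to-right scan -- Theta(n) time.  ending_here is the best
    sum of a subvector ending at the current position."""
    best = ending_here = 0
    for x in a:
        ending_here = max(0, ending_here + x)
        best = max(best, ending_here)
    return best
```

Both return the same value; the scan is the kind of asymptotic improvement that eventually beats any constant-factor hardware advantage.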

Proofs of correctness. Induction and loop invariants.

Information theoretic lower-bounds on problem complexity. Notion of optimality of a problem solution.

Common orders of growth. Constants, polylogarithms, polynomials, superpolynomial-subexponentials, exponentials.

Asymptotic notations: O, o, Omega, omega, Theta, ~. Lower bounds vs. upper bounds. Worst-case vs. average case. How to compare growth rates. Ratio test for comparing functions. Modern vs. old approach to big-Oh notation.

Bounding. How to bound summations, including initial sums, adding terms, splitting and bounding. Stirling's approximation. Floors and ceilings. How to bound floors and ceilings. Exercise: dimensions of a tennis can.

Elementary summations, including the harmonic series, arithmetic series, and geometric series. Jokes about various summations.


Lecture 3 (September 8): Analyzing Looping Algorithms

First we took class photos, and we reviewed the solution to the tennis can problem (which dimension of a tennis can is longer: height or circumference?) as an exercise in bounding. Demo: "stick model" of computation.

To illustrate the method of analyzing looping programs, we carefully analyzed the worst-case running time of insertion sort. For each statement we assigned a cost. We set up a summation. We solved the summation.
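As a concrete reference, here is insertion sort with the cost accounting sketched in comments (a Python rendering in the style of the analysis described above; not a transcript of the board work):

```python
def insertion_sort(a):
    """Sort a in place.  Worst case (reverse-sorted input): at index j the
    while loop runs j times, so the total cost is
    sum_{j=1}^{n-1} j = n(n-1)/2 = Theta(n^2)."""
    for j in range(1, len(a)):          # outer loop: executes n-1 times
        key = a[j]                      # constant cost per execution
        i = j - 1
        while i >= 0 and a[i] > key:    # up to j tests in the worst case
            a[i + 1] = a[i]             # shift larger element right
            i -= 1
        a[i + 1] = key                  # insert key into the sorted prefix
    return a
```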

Along the way, we discussed the following topics: models of computation (including uniform-cost vs. logarithmic-cost RAM, PRAM), granularity of resource measurement, and specific methods of assigning costs (operand count method, operation count method).

We reviewed three fundamental summations: constant, arithmetic, and geometric. We explicitly illustrated standard techniques for simplifying summations, including linearity of summation, bounding (see also "split and bound" from text), dealing with missing initial terms, change of variable, working from inside out.

I gave a lot of practical problem-solving advice, such as assigning names to important quantities, keeping calculations neatly arranged, explicitly performing changes of variable in writing (not in your head), expressing running times as summations in decreasing order of terms, using meaningful symbolic constants rather than numbers, performing various sanity checks (including bounding), writing algebraic expressions in an intuitive fashion (e.g., (x^(n+1) - 1)/(x - 1) vs. (1 - x^(n+1))/(1 - x) depending on the magnitude of x), and interpreting answers intuitively.

We discussed the advantages and disadvantages of insertion-sort (good for short or nearly-sorted data, bad worst-case). Challenge problem: how to "unsort" an array?


Lecture 4 (September 10): Analyzing Recursive Algorithms, and Solving Recurrences

Analysis of mergesort by setting up and solving a recurrence. Iteration. How to apply Master Theorem for divide-and-conquer recurrences.
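A minimal mergesort sketch in Python, annotated with the recurrence it induces (my reconstruction; the lecture's exact pseudocode may differ):

```python
def merge_sort(a):
    """Divide into two halves, recurse, then merge in Theta(n) time.
    Recurrence: T(n) = 2 T(n/2) + Theta(n).  Master Theorem Case 2
    (a = b = 2, so f(n) = Theta(n^{log_b a})) gives T(n) = Theta(n lg n)."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):   # Theta(n) merge at this level
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]         # append the leftover tail
```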


Lecture 5 (September 15): Solving Recurrences

Proof sketch for generalized Case 2 of Master Theorem. Questions and answers. Example of applying Master Theorem. Binary search: 20-questions to guess any word in dictionary.
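Binary search is the 20-questions idea made precise: each comparison halves the candidate range, and 2^20 > 1,000,000, enough for a large dictionary. A Python sketch (my reconstruction, not the lecture's pseudocode):

```python
def binary_search(words, target):
    """Find target in a sorted list with about lg n comparisons.
    Recurrence: T(n) = T(n/2) + Theta(1) = Theta(lg n)
    (Master Theorem Case 2)."""
    lo, hi = 0, len(words) - 1
    while lo <= hi:
        mid = (lo + hi) // 2       # one "question" halves the range
        if words[mid] == target:
            return mid
        elif words[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1                      # not present
```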


Lecture 6 (September 17): Sorting: Intro to quicksort, heapsort, and linear-time sorts

Intro to quicksort (intuition, partitioning, optimizations). Best and worst-case recurrences.
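A quicksort sketch with a Lomuto-style partition (one common textbook variant; the partition scheme shown in class may differ), with the best- and worst-case recurrences noted in comments:

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] around pivot a[hi]; return the pivot's final
    index.  Everything left of it is <= pivot, everything right is >."""
    pivot = a[hi]
    i = lo - 1
    for j in range(lo, hi):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[hi] = a[hi], a[i + 1]
    return i + 1

def quicksort(a, lo=0, hi=None):
    """Best case (balanced splits): T(n) = 2 T(n/2) + Theta(n) = Theta(n lg n).
    Worst case (e.g., already-sorted input with this pivot rule):
    T(n) = T(n-1) + Theta(n) = Theta(n^2)."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = partition(a, lo, hi)
        quicksort(a, lo, p - 1)
        quicksort(a, p + 1, hi)
    return a
```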

Intro to heapsort (heaps--shape and order properties). Representing binary heaps as a linear array. Difference between heap and binary search tree. Subtleties of sifting up vs. sifting down (sifting down must check all children).
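The sift-down subtlety can be made concrete with a short sketch for a max-heap stored in a 0-based array (my rendering; class may have used 1-based indexing):

```python
def sift_down(heap, i, n):
    """Restore max-heap order at index i, assuming both subtrees of i are
    already heaps.  Unlike sifting up (one parent comparison per level),
    sifting down must examine BOTH children to find the larger one."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < n and heap[left] > heap[largest]:
            largest = left
        if right < n and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            return                  # heap order restored
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest                 # continue down the violated subtree
```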

Intro to linear-time sorts (counting sort, bucket sort, radix sort). Demonstrations with cards. Issues of stability and ordering (in which order should digits be sorted in radix sort?).
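Stability is the crux: radix sort only works if each per-digit pass is stable, which is why digits are processed least-significant first. A Python sketch (a reconstruction, not the lecture's pseudocode):

```python
def counting_sort(a, key, k):
    """Stable sort of a by key(x) in range(k) -- Theta(n + k) time."""
    count = [0] * k
    for x in a:
        count[key(x)] += 1
    for d in range(1, k):               # prefix sums: position boundaries
        count[d] += count[d - 1]
    out = [None] * len(a)
    for x in reversed(a):               # right-to-left keeps equal keys stable
        count[key(x)] -= 1
        out[count[key(x)]] = x
    return out

def radix_sort(a, digits, base=10):
    """Sort nonnegative integers by stable per-digit sorts,
    least-significant digit first -- Theta(digits * (n + base))."""
    for d in range(digits):
        a = counting_sort(a, lambda x: (x // base ** d) % base, base)
    return a
```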


Lecture 7 (September 22): More on heaps, heapsort, and quicksort.

Building a heap in \Theta(n) time. Derivative trick for solving summations that are derivatives of known summations.
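The derivative trick, sketched (a standard derivation in the style of the text, not a transcript of the board work): differentiate the geometric series and multiply by x.

```latex
\sum_{k=0}^{\infty} x^k = \frac{1}{1-x}
\;\Longrightarrow\;
\sum_{k=1}^{\infty} k x^{k-1} = \frac{1}{(1-x)^2}
\;\Longrightarrow\;
\sum_{k=1}^{\infty} k x^{k} = \frac{x}{(1-x)^2}
\qquad (|x| < 1).
```

At x = 1/2 the last sum equals 2, so summing the build-heap cost over node heights h gives \sum_{h=0}^{\lfloor \lg n \rfloor} \lceil n/2^{h+1} \rceil O(h) = O(n \sum_h h/2^h) = O(n).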

How to analyze algorithms for space complexity (i.e., the amount of memory used). Explicit vs. implicit (stack) storage. Example: quicksort.

Worst-case running time of quicksort by iteration. The recurrence leads to the arithmetic series.

A method for inferring a closed-form formula for a numerical sequence: Numerically compute the sequence s_1, s_2, ... . Write down a table of j, s_j, and various guesses g_j for s_j (e.g., j, j^2). Also write down R_j = s_j / g_j. If R_j appears to be roughly constant, then you have probably guessed the high-order term correctly. Repeat for lower-order terms. Variation: guess the general form of the solution (e.g., s_j = a j^2 + b j + c) and solve for the constants a, b, c.
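The table-of-ratios method above can be sketched in a few lines of Python (an illustration with a made-up example sequence, not code from the lecture):

```python
def ratios(seq, guess):
    """Return R_j = s_j / g_j for j = 1, 2, ...; if the ratios level off
    to a constant c, then s_j is probably about c * g_j plus
    lower-order terms."""
    return [s / guess(j) for j, s in enumerate(seq, start=1)]

# Example: for s_j = 3 j^2 + 5 j, guessing g_j = j^2 gives ratios
# drifting down toward the leading coefficient 3.
```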

Abstract data types (ADTs) vs. data structures. An abstract data type is characterized by a set of values and operations on those values. The operations include constructors/destructors, probes, and mutators. A data structure is an abstract data type together with an implementation of it (including how the values are arranged in memory). Example: a priority queue as an ADT, implemented by a binary heap, in turn implemented as a linear array.


Lecture 8 (September 24): Average-case running time of quicksort, and lower bounds on sorting

A lower bound on sorting in the decision-tree model of computation. Complexity of problems vs. complexity of algorithms. Optimality of algorithms: an upper bound on an algorithm matches a lower bound on the problem. Lower bounds on problems: techniques, models, implications for technology (try to find new technologies that violate the models). Techniques include information-theoretic arguments, I/O-bounded arguments, diagonalization, and axioms from our physical reality (e.g., the speed of light is constant). Decision-tree model. \Omega( n \lg n ) lower bound based on information theory. Explaining the apparent paradox with "linear-time" sorts.

Average-case running time of quicksort. Average case running time as expected value of random variable of running time, whose distribution is induced from the distribution of the input. Setting up recurrence from definition of expectation. Simplifying the recurrence by exploiting symmetry. Solving the recurrence by the method of "guess and check." Within this solution, applying the methods of bounding by integration, and splitting and bounding a summation.
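The average-case setup can be sketched as follows (a reconstruction in the text's notation, not a transcript): with each of the n pivot ranks q equally likely,

```latex
T(n) = \frac{1}{n} \sum_{q=0}^{n-1} \bigl[ T(q) + T(n-1-q) \bigr] + \Theta(n)
     = \frac{2}{n} \sum_{q=1}^{n-1} T(q) + \Theta(n),
```

since T(q) and T(n-1-q) range over the same values (the symmetry mentioned above). Guessing T(q) <= c q \lg q and bounding \sum_{q=1}^{n-1} q \lg q <= (1/2) n^2 \lg n - (1/8) n^2 (by splitting the sum, or by comparison with the integral of x \lg x) verifies T(n) = O(n \lg n).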


Lecture 9 (September 29): Order statistics and linear-time median algorithm.

Practice recurrence for Exam I. Understand what's important in a recurrence and what's not. Understand the difference between asymptotic domination and asymptotic polynomial domination. Know the generalized Case 2 of the Master Theorem.

A divide-and-conquer approach to order statistics. Recursively computing a good pseudomedian.
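A sketch of the linear-time selection idea in Python (median-of-medians with groups of 5, following the usual textbook presentation; a reconstruction, not the lecture's exact pseudocode):

```python
def pseudomedian(a):
    """Split a into groups of 5, take each group's median, and recursively
    take the median of those medians.  The result is guaranteed to exceed
    roughly 3n/10 elements and be exceeded by roughly 3n/10, which keeps
    both recursive calls small."""
    if len(a) <= 5:
        return sorted(a)[len(a) // 2]
    medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
               for i in range(0, len(a), 5)]
    return pseudomedian(medians)

def select(a, k):
    """Return the k-th smallest element of a (k = 0 is the minimum).
    Recurrence: T(n) <= T(n/5) + T(7n/10) + Theta(n) = Theta(n)."""
    if len(a) <= 5:
        return sorted(a)[k]
    pivot = pseudomedian(a)
    lo = [x for x in a if x < pivot]
    eq = [x for x in a if x == pivot]
    if k < len(lo):
        return select(lo, k)
    if k < len(lo) + len(eq):
        return pivot
    return select([x for x in a if x > pivot], k - len(lo) - len(eq))
```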


Lecture 10 (October 1): Exam I

Lectures 1-8. Does not include order statistics. Be prepared to analyze looping and recursive algorithms; solve summations and recurrences; apply asymptotic notations; and design algorithms based on divide-and-conquer.


Lecture N (October DATE): TOPIC