The objective of this assignment is to practice writing recursive functions and to give you some introductory practice with programming in C++.
Consider any sequence S of characters; a subsequence of S is just S with some of its characters (possibly none) omitted. For example, s = aceg is a subsequence of S = abcdefg. Note that there is no requirement that the characters of S be alphabetical.
Given two sequences A and B, another sequence C is a common subsequence of A and B if it is a subsequence of both A and B. For example, if A = tree and B = relief, then C = re is a common subsequence of A and B. For this project, we will be solving the problem of finding not just a common subsequence of two sequences, but a common subsequence of maximum length, otherwise known as a longest common subsequence. For instance, if A = bookstore and B = bootstrap , then C = bot is a common subsequence, but D = boostr is the longest common subsequence. Note that longest common subsequences are not always unique; both pe and pl are longest common subsequences of pencil and please.
Let X = x1 x2 ... xm and Y = y1 y2
... yn be any two sequences of characters (note that they are not
necessarily the same length), and let Z = z1 z2 ... zk
be some longest common subsequence of X and Y. The algorithm to solve this
problem should make use of the fact that finding Z depends on finding longest
common subsequences of not just X and Y, but of prefixes of X and Y -- that is,
sequences of the form
x1 x2 ... xi,
which we would denote Xi, and y1 y2 ... yj
, which we
would denote Yj. Specifically, the following rules can be used
to determine the LCS of prefixes Xi and Yj (where 0 <= i <= m and 0
<= j <= n) :
1. If i = 0 or j = 0, then the LCS of the prefixes is the empty sequence, with length 0.
2. If xi = yj (and neither i nor j
is 0), then the LCS of the prefixes is the LCS of X i-1 and Yj-1,
concatenated with xi.
3. Otherwise, the LCS of the prefixes is the longer of: the LCS of Xi-1 and
Yj or the LCS of Xi and
Yj-1.
Your program must solve the LCS problem by implementing a recursive function that finds a (any) longest common subsequence of two sequences of characters. This function should solve for the longest common subsequence using the recursive definition above. This function may call other functions, as you see fit. The prototype of this function is up to you; make sure to include it in the homework file detailed below.
It turns out that a call to your recursive function may make many recursive calls to itself with the same parameters, depending on how you write the function. To speed up your program, you may wish to implement a memoization table for your function. The memoization table would record, for each value of i and j, whether the LCS of Xi and Yj has already been computed, and if so, what it was. You would write your function so that when it is called, it first checks the memoization table to see whether a function call with the current parameters has been called previously. If so, it should simply return the value stored in the table rather than continue with recursive calls. If no call with the current parameters has been made, then the function will proceed normally, except that the return value (LCS, and possibly length of the LCS, depending on your implementation) is stored in the memoization table before returning. Note that the size of the memoization table depends on the length of the sequences being compared; however, since we have not yet covered dynamic allocation, you may assume that the sequences being compared will not be more than 15 characters long each.
The homework portion of this project will be worth 10% of the project grade. For the homework, create a plain text file which includes the prototypes of the functions you intend to use, along with (approximately) a one- to two-paragraph description of how you intend to solve the problem and the general design of your program. You may use pseudocode if you wish. (This file is not intended to be compiled; it is only for you to express the design of your program.) Your homework will be graded both on the merits of your design and on the extent to which you follow that design in the implementation of your program. Remember to submit your homework file by the homework due date. Late homework submissions are not accepted.
You must include the following statement in a comment section of your program:
I have read and I understand the course policy on cheating. By submitting the following program, I am stating that the program was produced by my individual effort.
Turn in your program using the UNIX submit utility, by using the following command at the unix prompt:
submit cs202 proj1 filename
where filename is the name of the file you wish to submit. Remember to submit a plain text file for the homework assignment, and your C++ source file for the project. After entering this command, you should get a confirmation that submit was successful:
Submitting filename...OK
You can check your submission by entering the command:
submitls cs202 proj1
This will show the names of all files you have submitted for this project, along with the size and submission date and time of each.
This project is due by midnight (1 minute after 11:59pm) on Sunday, 16 September 2001. We will use the system clock as the final arbiter of time and date. Projects turned in up to 24 hours late will receive a 10% penalty; project submitted between 24 and 48 hours late will receive a 25% penalty. Projects will not be accepted past 48 hours after the due date.