CMSC--202 Computer Science II for Majors
Fall 1997 3 October 1997 Lists
A list is a sequence of homogeneous items on which only
sequential access can be done. It's homogeneous because every
item on a list must have the same type. It is sequential
because access to a list item can only be made by sequentially
traversing the list from the ``head.''
Compare and contrast these properties with those of an array in C.
Arrays and lists are both homogeneous. Unlike a list, an array allows
immediate (random) access to any item. An array is fixed in size, whereas a
list can grow and shrink as the program runs. Your choice of whether to use a list or an
array for a given task should be guided by these properties. If you
know how much data you will be accessing over the life of the program,
an array is likely to be the best choice. If you do not know, or if
the amount might change significantly during program execution, a list
may be best.
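The difference in access can be made concrete with a small sketch. The node layout and the names array_get and list_get are assumptions made for illustration only; they are not part of the course library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout: one item plus a link to the next node. */
struct node {
    int item;
    struct node *next;
};

/* Array access: a single indexing step, whatever the value of i. */
int array_get(const int a[], size_t i)
{
    return a[i];
}

/* List access: walk i links from the head before reading the item. */
int list_get(const struct node *head, size_t i)
{
    while (i > 0) {
        assert(head != NULL);   /* i is past the end of the list */
        head = head->next;
        i--;
    }
    assert(head != NULL);
    return head->item;
}
```

Reading item i of the list costs i link traversals, while the array lookup costs the same for every i.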
Since lists are homogeneous, there must be some way of declaring the
type of the element a list may contain. There must also be some way of
declaring a variable to be a list. We will be defining the types
list_type and list_item_type to meet these needs.
We adopt a notation for lists that uses <> as delimiters. For
example, the list containing the items a, b, and c would be denoted
<a b c>. It is permissible to have a list that contains no items. It
is called an empty list and is denoted <>.
There is a constant of type list_type that denotes the empty
list. Not surprisingly, this constant is named the_empty_list.
The first element on a list is called the first or head of
the list. The remaining elements of the list are members of a list
called the rest or tail of the list. Note the recursive
nature of this fact: a list of n objects is composed of the
first object ``tacked'' onto the head of a list of the remaining
n - 1 objects. It follows from this that a list that contains
just one object must be composed of the object ``tacked'' onto the
head of the_empty_list.
It turns out that all imaginable list operations can be performed as
combinations of four primitive operations along with the constant
the_empty_list. Here are the operations:
list_item_type first(list_type L) returns the first item on list L.
It is an error to invoke first on an empty list.
Example: given that list M is <3 5 7>, first(M) returns 3.
list_type rest(list_type L) returns the list composed of
the sequence of items in L after the first item. It is not
an error to invoke rest on an empty list. By definition,
rest(the_empty_list) is the_empty_list.
Example: given that list M is <3 5 7>, rest(M) returns the list <5 7>,
rest(rest(M)) returns the list <7>, and rest(rest(rest(M))) returns
the list <>.
list_type cons(list_item_type X, list_type L) returns the
list composed of item X followed by the items in list L.
Example: cons(7, the_empty_list) returns the list <7>, and
cons(3, cons(5, cons(7, the_empty_list))) returns the list <3 5 7>.
BOOL empty_list(list_type L) returns TRUE if list L is empty,
FALSE otherwise.
Example: given that list M is <3 5 7>, empty_list(M) returns FALSE,
and empty_list(the_empty_list) returns TRUE.
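To make the four primitives concrete, here is a minimal sketch of one way they could be implemented. The node representation, the int item type, and the BOOL/TRUE/FALSE definitions are assumptions made for this sketch; the notes do not show how lib202.a actually implements them.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed stand-ins for lists.h; the real definitions are not shown. */
typedef int list_item_type;
typedef struct node { list_item_type item; struct node *next; } *list_type;
typedef int BOOL;
#define TRUE 1
#define FALSE 0
#define the_empty_list ((list_type)NULL)

BOOL empty_list(list_type L)
{
    return (L == the_empty_list) ? TRUE : FALSE;
}

list_item_type first(list_type L)
{
    assert(L != the_empty_list);   /* first of an empty list is an error */
    return L->item;
}

list_type rest(list_type L)
{
    /* rest(the_empty_list) is the_empty_list by definition */
    return empty_list(L) ? the_empty_list : L->next;
}

list_type cons(list_item_type X, list_type L)
{
    list_type n = malloc(sizeof *n);   /* never freed in this sketch */
    if (n == NULL)
        abort();
    n->item = X;
    n->next = L;                       /* L itself is not modified */
    return n;
}
```

With these definitions, cons(3, cons(5, cons(7, the_empty_list))) builds the list <3 5 7>.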
The lists.h interface file for these primitive list operations is in
the directory anastasi/Public/202/Includes.
The implementation of the operations is in the lib202.a archive in
the directory anastasi/Public/202/Libraries.
Here's an important observation about the cons list operation.
It constructs a new list composed of a list item and the items
in another list. It does not modify the other list in any way. For
example, in the following code fragment,
    list_type L1, L2;
    /* construct L1 as <5 7> */
    L1 = cons(5, cons(7, the_empty_list));
    /* construct L2 as <3 5 7> on top of L1 */
    L2 = cons(3, L1);
L1 is not changed when L2 is constructed. The cons operation is
``non-destructive.''
In this section we present a variety of C functions that use the
primitive operations to do more advanced operations. Most (all?) of
these functions are recursive -- not a surprise given the recursive
nature of lists.
Remember that recursive functions have one or more base cases and one
or more recursive calls. The recursive calls must be ``smaller'' than
the original call. Notice that the base case(s) for lists frequently
test for the list being empty (that's as small as a list can get) and
that the recursive calls frequently are made on the rest of the list
(rest(L) is guaranteed to be smaller than L whenever L is not empty).
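The pattern of an empty-list base case plus a recursive call on the rest can be sketched with a small function that adds up the items on a list of integers. The function sum_list is hypothetical, and the primitive definitions at the top are assumed stand-ins for lists.h so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed stand-ins for lists.h; the real definitions are not shown. */
typedef int list_item_type;
typedef struct node { list_item_type item; struct node *next; } *list_type;
typedef int BOOL;
#define TRUE 1
#define FALSE 0
#define the_empty_list ((list_type)NULL)

BOOL empty_list(list_type L) { return (L == the_empty_list) ? TRUE : FALSE; }

list_item_type first(list_type L)
{
    assert(L != the_empty_list);   /* first of an empty list is an error */
    return L->item;
}

list_type rest(list_type L) { return empty_list(L) ? the_empty_list : L->next; }

list_type cons(list_item_type X, list_type L)
{
    list_type n = malloc(sizeof *n);   /* never freed in this sketch */
    if (n == NULL)
        abort();
    n->item = X;
    n->next = L;
    return n;
}

/* Hypothetical example: the sum of the items on an integer list. */
int sum_list(list_type L)
{
    if (empty_list(L) == TRUE)             /* base case: nothing to add */
        return 0;
    return first(L) + sum_list(rest(L));   /* recursive call on a smaller list */
}
```

For the list <3 5 7>, sum_list returns 15; for the_empty_list it returns 0.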
The length of a list is the number of items in the list. It can be defined as
    int length(list_type L)
    {
        if (empty_list(L) == TRUE)
            return 0;
        return 1 + length(rest(L));
    }
The copy_list function returns a copy of a list.
    list_type copy_list(list_type L)
    {
        if (empty_list(L) == TRUE)
            return the_empty_list;
        return cons(first(L), copy_list(rest(L)));
    }
Note that every item in L takes part in a cons operation. The list
returned by copy_list is a brand-new list.
Loosely speaking, the append operation constructs a list composed of
the items in one list followed by the items in a second list. As will
be seen shortly, the reality is slightly different. Here's a
definition of append:
    list_type append(list_type L1, list_type L2)
    {
        if (empty_list(L1) == TRUE)
            return L2;
        return cons(first(L1), append(rest(L1), L2));
    }
For example, if L1 is <1 2> and L2 is <3 4>, then append(L1, L2)
returns the list <1 2 3 4>.
Examine the definition of append carefully. Note that only the
items in L1 take part in the construction operation cons. L2 is
just along for the ride: it is the list onto which the various cons
operations ``tack'' their items, so the result ends with L2 itself
rather than with a copy of it. In the example above, there would be
only two cons operations performed.
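The claim about the number of cons operations can be checked with a small experiment. In the sketch below, the primitive definitions are assumed stand-ins for lists.h, and the counter cons_calls is instrumentation added purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed stand-ins for lists.h; the real definitions are not shown. */
typedef int list_item_type;
typedef struct node { list_item_type item; struct node *next; } *list_type;
typedef int BOOL;
#define TRUE 1
#define FALSE 0
#define the_empty_list ((list_type)NULL)

int cons_calls = 0;   /* hypothetical instrumentation: counts calls to cons */

BOOL empty_list(list_type L) { return (L == the_empty_list) ? TRUE : FALSE; }

list_item_type first(list_type L)
{
    assert(L != the_empty_list);   /* first of an empty list is an error */
    return L->item;
}

list_type rest(list_type L) { return empty_list(L) ? the_empty_list : L->next; }

list_type cons(list_item_type X, list_type L)
{
    list_type n = malloc(sizeof *n);   /* never freed in this sketch */
    if (n == NULL)
        abort();
    cons_calls++;
    n->item = X;
    n->next = L;
    return n;
}

/* append as defined in the text */
list_type append(list_type L1, list_type L2)
{
    if (empty_list(L1) == TRUE)
        return L2;
    return cons(first(L1), append(rest(L1), L2));
}
```

Appending <1 2> to <3 4> under these assumptions performs two cons calls, and the tail of the result after the first two items is the very same list as L2, not a copy.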
The last function returns a list composed of the
last item on a given list. For example, if M is <3 5 7>,
then last(M) returns <7>.
    list_type last(list_type L)
    {
        if (empty_list(L) == TRUE)
            return the_empty_list;
        if (empty_list(rest(L)) == TRUE)   /* look-ahead */
            return L;
        return last(rest(L));
    }
This function uses a trick called ``look-ahead'' to determine that the recursion has reached the last item on the list. You look ahead in the list before making the recursive leap. Once you've made the leap, it's too late to go back.
The member function returns the list beginning with the first
item in list L that is equal to X, or the_empty_list if no item
on L is equal to X.
    list_type member(list_item_type X, list_type L)
    {
        if (empty_list(L) == TRUE)
            return the_empty_list;
        if (X == first(L))
            return L;
        return member(X, rest(L));
    }
For example, if M is <3 5 7>, member(5, M) returns the list <5 7>.
The reverse function returns a list composed of the items in
L taken in reverse order. For example, if M is <3 5 7>, then
reverse(M) returns the list <7 5 3>.
``Wishful thinking'' helps a lot in coming up with this function.
You think ``if only I could reverse the rest of this list, I would
know the answer -- I would make a list of the first item and append
it onto the end of the reversed rest-of-list.'' You make a list of
the first item because append takes two lists. In the example above,
wishful thinking would say ``I know how to reverse the list
<5 7>, it's just <7 5>; therefore, to reverse the
whole list <3 5 7>, I would append(<7 5>, <3>).''
    list_type reverse(list_type L)
    {
        if (empty_list(L) == TRUE)
            return L;
        return append(reverse(rest(L)), cons(first(L), the_empty_list));
    }
As it turns out, this works, but is very inefficient. Look at all
those append operations being done. Each one must traverse (and
re-cons) every item of the reversed list built so far, so reversing
a list of n items takes on the order of n * n cons operations.
There's a way to do it without using append (see the Exercises).
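The cost of this reverse can be measured by counting cons calls. In the sketch below, the primitive definitions are assumed stand-ins for lists.h and the counter cons_calls is hypothetical instrumentation. Each level of the recursion performs one cons to build the one-item list, plus one cons per item of the reversed rest inside append, which adds up to n(n+1)/2 cons calls for a list of n items.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed stand-ins for lists.h; the real definitions are not shown. */
typedef int list_item_type;
typedef struct node { list_item_type item; struct node *next; } *list_type;
typedef int BOOL;
#define TRUE 1
#define FALSE 0
#define the_empty_list ((list_type)NULL)

int cons_calls = 0;   /* hypothetical instrumentation: counts calls to cons */

BOOL empty_list(list_type L) { return (L == the_empty_list) ? TRUE : FALSE; }

list_item_type first(list_type L)
{
    assert(L != the_empty_list);   /* first of an empty list is an error */
    return L->item;
}

list_type rest(list_type L) { return empty_list(L) ? the_empty_list : L->next; }

list_type cons(list_item_type X, list_type L)
{
    list_type n = malloc(sizeof *n);   /* never freed in this sketch */
    if (n == NULL)
        abort();
    cons_calls++;
    n->item = X;
    n->next = L;
    return n;
}

/* append and reverse as defined in the text */
list_type append(list_type L1, list_type L2)
{
    if (empty_list(L1) == TRUE)
        return L2;
    return cons(first(L1), append(rest(L1), L2));
}

list_type reverse(list_type L)
{
    if (empty_list(L) == TRUE)
        return L;
    return append(reverse(rest(L)), cons(first(L), the_empty_list));
}
```

Reversing the three-item list <3 5 7> this way performs six cons calls, and a four-item list needs ten, even though the results contain only three and four items.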
The function print_list prints L in the notation we have
been using (items between <> brackets). It also takes a format
string that dictates the format under which each item is to be
printed. We assume that printing is to be to stdout (but, see
the Exercises). Being very fussy, we want all but the last item in
the list to be followed by a space (we want to print
<3 5 7>, not <3 5 7 >). To print the integer list L, we would
call print_list(L, "%d").
The overall structure of the function is
    void print_list_aux(list_type L, char *format);   /* defined below */

    void print_list(list_type L, char *format)
    {
        printf("<");
        print_list_aux(L, format);   /* print the items */
        printf(">");
    }
and the auxiliary function is
    void print_list_aux(list_type L, char *format)
    {
        if (empty_list(L) == TRUE)
            return;
        printf(format, first(L));
        if (empty_list(rest(L)) == FALSE)   /* look ahead */
            printf(" ");                    /* space character */
        print_list_aux(rest(L), format);
    }
Functional parameters provide a powerful abstraction tool. In this section, we show how functional parameters can be used in functions that provide general list operations. First a few definitions:
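A predicate is a function that tests its argument and returns TRUE or
FALSE. For example, the C library function islower returns a true
(nonzero) value if its character argument is lower case. For list
items, we can define the predicate type as
    typedef BOOL (*predicate)(list_item_type);
A transformer is a function that takes a value and returns a new
value computed from it. For example, the C library function sqrt
takes a double and returns its square root as a double. For list
items, we can define the transformer type as
    typedef list_item_type (*transformer)(list_item_type);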
Some list operations occur so often that they are called cliches. Here are some of them:
    /* counter: the number of items on L that satisfy predicate P */
    int counter(predicate P, list_type L)
    {
        if (empty_list(L) == TRUE)
            return 0;
        if (P(first(L)) == TRUE)
            return 1 + counter(P, rest(L));
        return counter(P, rest(L));
    }
For example, suppose we have some predicates:
BOOL AlwaysTrue(int x) that always returns TRUE, regardless of the
value of x.
BOOL IsOdd(int x) that returns TRUE if x is an odd number, FALSE
otherwise.
Then, we can define the length(L) operation as counter(AlwaysTrue, L).
To count the number of odd numbers in a list of integers L, we call
counter(IsOdd, L).
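The two predicates are described above but not written out, so the following sketch supplies one plausible version of each and passes them to counter. The bodies of AlwaysTrue and IsOdd, and the primitive definitions standing in for lists.h, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed stand-ins for lists.h; the real definitions are not shown. */
typedef int list_item_type;
typedef struct node { list_item_type item; struct node *next; } *list_type;
typedef int BOOL;
#define TRUE 1
#define FALSE 0
#define the_empty_list ((list_type)NULL)

BOOL empty_list(list_type L) { return (L == the_empty_list) ? TRUE : FALSE; }

list_item_type first(list_type L)
{
    assert(L != the_empty_list);   /* first of an empty list is an error */
    return L->item;
}

list_type rest(list_type L) { return empty_list(L) ? the_empty_list : L->next; }

list_type cons(list_item_type X, list_type L)
{
    list_type n = malloc(sizeof *n);   /* never freed in this sketch */
    if (n == NULL)
        abort();
    n->item = X;
    n->next = L;
    return n;
}

typedef BOOL (*predicate)(list_item_type);

/* counter as defined in the text */
int counter(predicate P, list_type L)
{
    if (empty_list(L) == TRUE)
        return 0;
    if (P(first(L)) == TRUE)
        return 1 + counter(P, rest(L));
    return counter(P, rest(L));
}

/* Possible bodies for the two predicates (assumed, not from the notes). */
BOOL AlwaysTrue(int x)
{
    (void)x;            /* the argument is deliberately ignored */
    return TRUE;
}

BOOL IsOdd(int x)
{
    return (x % 2 != 0) ? TRUE : FALSE;   /* also correct for negative x */
}
```

For the list <1 2 3 4 5>, counter(AlwaysTrue, L) gives 5 and counter(IsOdd, L) gives 3. Note that the predicates must return exactly TRUE, because counter compares the result with TRUE rather than testing it for nonzero.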
    /* filter: the list of those items on L that satisfy predicate P */
    list_type filter(predicate P, list_type L)
    {
        if (empty_list(L) == TRUE)
            return the_empty_list;
        if (P(first(L)) == TRUE)
            return cons(first(L), filter(P, rest(L)));
        return filter(P, rest(L));
    }
For example, suppose L is the list <1 2 3 4 5 6>. Then,
filter(IsOdd, L) returns the list <1 3 5>.
    /* transform: the list of the results of applying T to each item on L */
    list_type transform(transformer T, list_type L)
    {
        if (empty_list(L) == TRUE)
            return the_empty_list;
        return cons(T(first(L)), transform(T, rest(L)));
    }
For example, suppose we have the transformer
    int Square(int x) { return x * x; }
and the list L equal to <1 2 3>. Then transform(Square, L) returns
the list <1 4 9>.
It is possible to get even greater functionality by composing list
cliches. As a simple example, we wish to produce a list of the
squares of the odd integers on the list L equal to
<1 2 3 4 5 6 7 8>. The following composition does the trick,
returning the list <1 9 25 49>.
transform(Square, filter(IsOdd, L))
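The composition can be tried out in a self-contained sketch. The primitive definitions standing in for lists.h and the body of IsOdd are assumptions made for illustration; filter, transform, and Square follow the definitions given in the text.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed stand-ins for lists.h; the real definitions are not shown. */
typedef int list_item_type;
typedef struct node { list_item_type item; struct node *next; } *list_type;
typedef int BOOL;
#define TRUE 1
#define FALSE 0
#define the_empty_list ((list_type)NULL)

BOOL empty_list(list_type L) { return (L == the_empty_list) ? TRUE : FALSE; }

list_item_type first(list_type L)
{
    assert(L != the_empty_list);   /* first of an empty list is an error */
    return L->item;
}

list_type rest(list_type L) { return empty_list(L) ? the_empty_list : L->next; }

list_type cons(list_item_type X, list_type L)
{
    list_type n = malloc(sizeof *n);   /* never freed in this sketch */
    if (n == NULL)
        abort();
    n->item = X;
    n->next = L;
    return n;
}

typedef BOOL (*predicate)(list_item_type);
typedef list_item_type (*transformer)(list_item_type);

list_type filter(predicate P, list_type L)
{
    if (empty_list(L) == TRUE)
        return the_empty_list;
    if (P(first(L)) == TRUE)
        return cons(first(L), filter(P, rest(L)));
    return filter(P, rest(L));
}

list_type transform(transformer T, list_type L)
{
    if (empty_list(L) == TRUE)
        return the_empty_list;
    return cons(T(first(L)), transform(T, rest(L)));
}

BOOL IsOdd(int x) { return (x % 2 != 0) ? TRUE : FALSE; }   /* assumed body */
int Square(int x) { return x * x; }

/* The composition from the text, wrapped in a hypothetical helper. */
list_type squares_of_odds(list_type L)
{
    return transform(Square, filter(IsOdd, L));
}
```

The inner filter call runs first and produces <1 3 5 7>; transform then squares each of those items. Neither call modifies L.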
length
given in Section 3.1.
Write a function last_item that returns the last item on
a list. This differs from the function last (Section 3.4) that
returns a list containing the last item. What error condition must be
avoided in last_item?
Write a function bool_member that acts like the function
member of Section 3.5, but returns TRUE or FALSE instead of a list.
The reverse function in Section 3.6 uses append. Write a reverse
function that does not.
Hint: use an auxiliary parameter in which you construct the reversed
list. The base case returns this parameter's value (it starts off as
the_empty_list).
Write a function
    void fprint_list(list_type L, char *format, FILE *stream)
that acts like the print_list function of Section 3.7, but prints
its output on stream.
Use counter of Section 4.1 to count the number of negative
integers on a list of integers. Be sure to write the appropriate
predicate.
Use filter of Section 4.2 to produce a list of the negative
integers on a given list.
Use transform of Section 4.3 to produce a list of the negatives
of the integers on a given list.
Write a function
    list_type merge(list_type L1, list_type L2)
that takes two sorted lists (increasing order) and returns the sorted
list composed of the elements of lists L1 and L2. You
may assume that all items on the lists are unique (no duplicate values).
Write a function
    list_type delete_first_n_if(list_type L, predicate p, int n);
that returns a list composed of those items that come after the
nth item on list L and satisfy predicate p. Assume that the
first item on a list is number 1. In case n is greater than
the length of list L, the_empty_list is to be returned.
For example,
    delete_first_n_if(<1 2 3 4 5>, IsOdd, 2) ==> <3 5>
    delete_first_n_if(<1 2 3 4 5>, IsEven, 2) ==> <4>