CMSC-202 Computer Science II for Majors
Fall 1997 3 October 1997 Lists

Basics

A list is a sequence of homogeneous items on which only sequential access can be done. It is homogeneous because every item on a list must have the same type. It is sequential because a list item can be reached only by traversing the list from the "head."

Compare and contrast these properties with those of an array in C. Arrays and lists are both homogeneous. Unlike a list, an array allows immediate (random) access to any item. An array is fixed in size; a list has no size limit. Your choice of whether to use a list or an array for a given task should be guided by these properties. If you know how much data you will be accessing over the life of the program, an array is likely to be the best choice. If you do not know, or if the amount might change significantly during program execution, a list may be best.

Since lists are homogeneous, there must be some way of declaring the type of the element a list may contain. There must also be some way of declaring a variable to be a list. We will be defining the types list_type and list_item_type to meet these needs.

We adopt a notation for lists that uses <> as delimiters. For example, the list containing the items a, b, and c would be denoted <a b c>. It is permissible to have a list that contains no items. It is called an empty list and is denoted <>. There is a constant of type list_type that denotes the empty list. Not surprisingly, this constant is named the_empty_list.

The first element on a list is called the first or head of the list. The remaining elements of the list are members of a list called the rest or tail of the list. Note the recursive nature of this fact: a list of n objects is composed of the first object "tacked" onto the head of a list of the remaining n - 1 objects. It follows from this that a list that contains just one object must be composed of the object "tacked" onto the head of the_empty_list.

List Operations

It turns out that all imaginable list operations can be performed as combinations of four primitive operations along with the constant the_empty_list. Here are the operations:

list_item_type first(list_type L) returns the first item on list L. It is an error to invoke first on an empty list.
Example: given that list M is <3 5 7>, first(M) returns 3.
list_type rest(list_type L) returns the list composed of the sequence of items in L after the first item. It is not an error to invoke rest on an empty list. By definition, rest(the_empty_list) is the_empty_list.
Example: given that list M is <3 5 7>, rest(M) returns the list <5 7>, rest(rest(M)) returns the list <7>, and rest(rest(rest(M))) returns the list <>.
list_type cons(list_item_type X, list_type L) returns the list composed of item X followed by the items in list L.
Example: cons(7, the_empty_list) returns the list <7>
Example: cons(3, cons(5, cons(7, the_empty_list))) returns the list <3 5 7>
BOOL empty_list(list_type L) returns TRUE if list L is empty, FALSE otherwise.
Example: given that list M is <3 5 7>, empty_list(M) returns FALSE.
Example: empty_list(the_empty_list) returns TRUE.
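
The notes do not show how lists.h implements these primitives. Purely as an illustration, here is one way they might be realized over a chain of linked cells. The struct layout, the use of int for list_item_type, the BOOL definitions, and the null-pointer representation of the_empty_list are all assumptions made for this sketch; they are not taken from lib202.a.

```c
#include <assert.h>
#include <stdlib.h>

/* ASSUMED definitions: the real ones come from lists.h. */
typedef int BOOL;
#define TRUE 1
#define FALSE 0

typedef int list_item_type;                 /* assumed: a list of ints */
typedef struct node {
  list_item_type item;
  struct node *next;
} *list_type;

#define the_empty_list ((list_type)0)       /* assumed: empty list is a null pointer */

/* TRUE if L has no items. */
BOOL empty_list(list_type L)
{
  return (L == the_empty_list) ? TRUE : FALSE;
}

/* First item of L; invoking it on an empty list is an error. */
list_item_type first(list_type L)
{
  assert(L != the_empty_list);
  return L->item;
}

/* Everything after the first item; rest of the empty list is the empty list. */
list_type rest(list_type L)
{
  if (L == the_empty_list)
    return the_empty_list;
  return L->next;
}

/* New list: X followed by the items of L.  L itself is not modified. */
list_type cons(list_item_type X, list_type L)
{
  list_type cell = malloc(sizeof *cell);    /* never freed in this sketch */
  assert(cell != NULL);
  cell->item = X;
  cell->next = L;
  return cell;
}
```

With these stand-ins, cons(3, cons(5, cons(7, the_empty_list))) builds <3 5 7>, and first and rest behave as in the examples above.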

The lists.h interface file for these primitive list operations is in the directory anastasi/Public/202/Includes. The implementation of the operations is in the lib202.a archive in the directory anastasi/Public/202/Libraries.

Here's an important observation about the cons list operation. It constructs a new list composed of a list item and the items in another list. It does not modify the other list in any way. For example, in the following code fragment,

list_type L1, L2;

/* construct L1 as <5 7> */
L1 = cons(5, cons(7, the_empty_list));

/* construct L2 as <3 5 7> on top of L1 */
L2 = cons(3, L1);

L1 is not changed when L2 is constructed. The cons operation is "non-destructive."

List Functions Using the Primitives

In this section we present a variety of C functions that use the primitive operations to do more advanced operations. Most (all?) of these functions are recursive -- not a surprise given the recursive nature of lists. Remember that recursive functions have one or more base cases and one or more recursive calls. The recursive calls must be "smaller" than the original call. Notice that the base case(s) for lists frequently test for the list being empty (that's as small as a list can get) and that the recursive calls frequently are made on the rest of the list (for a non-empty L, rest(L) is guaranteed to be smaller than L).

Length

The length of a list is the number of items in the list. It can be defined as

int length(list_type L)
{
  if (empty_list(L) == TRUE)
    return 0;
  return 1 + length(rest(L));
}

Copy_List

The copy_list function returns a copy of a list.

list_type copy_list(list_type L)
{
  if (empty_list(L) == TRUE)
    return the_empty_list;
  return cons(first(L), copy_list(rest(L)));
}

Note that every item in L takes part in a cons operation. The list returned by copy_list is a brand-new list.

Append

Loosely speaking, the append operation constructs a list composed of the items in one list followed by the items in a second list. As will be seen shortly, the reality is slightly different. Here's a definition of append:

list_type append(list_type L1, list_type L2)
{
  if (empty_list(L1) == TRUE)
    return L2;
  return cons(first(L1), append(rest(L1), L2));
}

For example, if L1 is <1 2> and L2 is <3 4>, then append(L1, L2) returns the list <1 2 3 4>.

Examine the definition of append carefully. Note that only the items in L1 take part in the construction operation cons. L2 is just along for the ride and for being the list onto which the various cons operations "tack" their items. In the example above, there would be only two cons operations performed.
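
The claim that append performs one cons per item of L1, and none for L2, can be checked by instrumenting cons. The sketch below does this with stand-in primitives. The struct layout, the int item type, and the cons_calls counter are assumptions for illustration, not the lib202.a code. The append function is copied from the definition above.

```c
#include <assert.h>
#include <stdlib.h>

/* ASSUMED stand-ins for the lists.h primitives. */
typedef int BOOL;
#define TRUE 1
#define FALSE 0
typedef int list_item_type;
typedef struct node { list_item_type item; struct node *next; } *list_type;
#define the_empty_list ((list_type)0)

static int cons_calls = 0;   /* hypothetical counter, for illustration only */

BOOL empty_list(list_type L) { return (L == the_empty_list) ? TRUE : FALSE; }
list_item_type first(list_type L) { assert(L != the_empty_list); return L->item; }
list_type rest(list_type L) { return (L == the_empty_list) ? L : L->next; }

list_type cons(list_item_type X, list_type L)
{
  list_type cell = malloc(sizeof *cell);   /* never freed in this sketch */
  assert(cell != NULL);
  cell->item = X;
  cell->next = L;
  cons_calls++;                            /* record every construction */
  return cell;
}

/* append, as defined in these notes */
list_type append(list_type L1, list_type L2)
{
  if (empty_list(L1) == TRUE)
    return L2;
  return cons(first(L1), append(rest(L1), L2));
}

/* Number of cons calls made while appending <1 2> and <3 4>. */
int cons_calls_in_append_demo(void)
{
  list_type L1 = cons(1, cons(2, the_empty_list));
  list_type L2 = cons(3, cons(4, the_empty_list));
  int before = cons_calls;
  append(L1, L2);
  return cons_calls - before;
}
```

Under these assumptions the demo function reports 2, and the tail of the appended list is the very same L2 that was passed in rather than a copy of it.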

Last

The last function returns a list composed of the last item on a given list. For example, if M is <3 5 7>, then last(M) returns <7>.

list_type last(list_type L)
{
  if (empty_list(L) == TRUE)
    return the_empty_list;
  if (empty_list(rest(L)) == TRUE)   /* look-ahead */
    return L;
  return last(rest(L));
}

This function uses a trick called "look-ahead" to determine that the recursion has reached the last item on the list. You look ahead in the list before making the recursive leap. Once you've made the leap, it's too late to go back.

Member

The member function returns the list beginning with the first item in list L that is equal to X.

list_type member(list_item_type X, list_type L)
{
  if (empty_list(L) == TRUE)
    return the_empty_list;
  if (X == first(L))
    return L;
  return member(X, rest(L));
}

For example, if M is <3 5 7>, member(5, M) returns the list <5 7>.

Reverse

The reverse function returns a list composed of the items in L taken in reverse order. For example, if M is <3 5 7>, then reverse(M) returns the list <7 5 3>.

"Wishful thinking" helps a lot in coming up with this function. You think "if only I could reverse the rest of this list, I would know the answer: I would make a list of the first item and append it onto the reversed rest-of-list." You make a list of the first item because append takes two lists. In the example above, wishful thinking would say "I know how to reverse the list <5 7>, it's just <7 5>; therefore, to reverse the whole list <3 5 7>, I would append(<7 5>, <3>)."

list_type reverse(list_type L)
{
  if (empty_list(L) == TRUE)
    return L;
  return append(reverse(rest(L)), cons(first(L), the_empty_list));
}

As it turns out, this works, but is very inefficient. Look at all those append operations being done. Each append must traverse (and re-cons) every item of its first argument, the reversed rest-of-list, so the total work grows roughly as the square of the list length. There's a way to do it without using append (see the Exercises).
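
To make the cost concrete, the sketch below counts the cons operations that this reverse performs. The primitives are stand-ins and the cons_calls counter is hypothetical instrumentation; both are assumptions for illustration, not the lib202.a code. The append and reverse functions are copied from the definitions in these notes.

```c
#include <assert.h>
#include <stdlib.h>

/* ASSUMED stand-ins for the lists.h primitives. */
typedef int BOOL;
#define TRUE 1
#define FALSE 0
typedef int list_item_type;
typedef struct node { list_item_type item; struct node *next; } *list_type;
#define the_empty_list ((list_type)0)

static long cons_calls = 0;  /* hypothetical instrumentation counter */

BOOL empty_list(list_type L) { return (L == the_empty_list) ? TRUE : FALSE; }
list_item_type first(list_type L) { assert(L != the_empty_list); return L->item; }
list_type rest(list_type L) { return (L == the_empty_list) ? L : L->next; }

list_type cons(list_item_type X, list_type L)
{
  list_type cell = malloc(sizeof *cell);   /* never freed in this sketch */
  assert(cell != NULL);
  cell->item = X;
  cell->next = L;
  cons_calls++;
  return cell;
}

list_type append(list_type L1, list_type L2)
{
  if (empty_list(L1) == TRUE)
    return L2;
  return cons(first(L1), append(rest(L1), L2));
}

list_type reverse(list_type L)
{
  if (empty_list(L) == TRUE)
    return L;
  return append(reverse(rest(L)), cons(first(L), the_empty_list));
}

/* Build <1 2 ... n>, reverse it, and report how many cons calls reverse made. */
long cons_calls_for_reverse(int n)
{
  list_type L = the_empty_list;
  int i;
  for (i = n; i >= 1; i--)
    L = cons(i, L);
  cons_calls = 0;            /* ignore the calls used to build L */
  reverse(L);
  return cons_calls;
}
```

Reversing n items this way costs one cons for the single-item list plus n - 1 inside append, on top of the cost of reversing the rest, which adds up to n(n+1)/2. With these stand-ins, a 3-item list takes 6 cons calls and a 10-item list takes 55.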

Printing Lists

The function print_list prints L in the notation we have been using (items between <> brackets). It also takes a format string that dictates the format under which each item is to be printed. We assume that printing is to be to stdout (but, see the Exercises). Being very fussy, we want all but the last item in the list to be followed by a space (we want to print <3 5 7>, not <3 5 7 >). To print the integer list L, we would call print_list(L,"%d"). The overall structure of the function is

void print_list(list_type L, char * format)
{
  printf("<");
  print_list_aux(L, format);  /* print the items */
  printf(">");
}

and the auxiliary function is

void print_list_aux(list_type L, char * format)
{
  if (empty_list(L) == TRUE)
    return;
  printf(format, first(L));
  if (empty_list(rest(L)) == FALSE)  /* look ahead */
    printf(" ");    /* space character */
  print_list_aux(rest(L), format);
  return;
}

List Cliches

Functional parameters provide a powerful abstraction tool. In this section, we show how functional parameters can be used in functions that provide general list operations. First a few definitions:

a predicate is a Boolean function that tests a property of its argument. For example, the character function islower returns TRUE if its character argument is lower case. For list items, we can define the predicate type as
```
    typedef BOOL (*predicate)(list_item_type);
```
a transformer is a function that returns a modified value of the same type as its argument. For example, the math function sqrt takes a double and returns its square root as a double. For list items, we can define the transformer type as
```
    typedef list_item_type (*transformer)(list_item_type);
```

Some list operations occur so often that they are called cliches. Here are some of them:

The counter returns the number of list items on a given list that satisfy a given predicate.
The filter returns a list of all items on a given list that satisfy a given predicate.
The transform returns a list of the items on a given list transformed by a given transformer.

The Counter Cliche

int counter(predicate P, list_type L)
{
  if (empty_list(L) == TRUE)
    return 0;
  if (P(first(L)) == TRUE)
    return 1 + counter(P, rest(L));
  return counter(P, rest(L));
}

For example, suppose we have some predicates:

BOOL AlwaysTrue(int x) that always returns TRUE, regardless of the value of x.
BOOL IsOdd(int x) that returns TRUE if x is an odd number, FALSE otherwise.

Then, we can define the length(L) operation as counter(AlwaysTrue, L). To count the number of odd numbers in a list of integers L, we call counter(IsOdd, L).
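
A possible rendering of these two predicates, together with counter, is sketched below. The primitives are stand-ins (struct layout, int item type, and BOOL definitions are assumptions, not the lists.h code), and the bodies of AlwaysTrue and IsOdd are one plausible way to write them.

```c
#include <assert.h>
#include <stdlib.h>

/* ASSUMED stand-ins for the lists.h primitives. */
typedef int BOOL;
#define TRUE 1
#define FALSE 0
typedef int list_item_type;
typedef struct node { list_item_type item; struct node *next; } *list_type;
#define the_empty_list ((list_type)0)

BOOL empty_list(list_type L) { return (L == the_empty_list) ? TRUE : FALSE; }
list_item_type first(list_type L) { assert(L != the_empty_list); return L->item; }
list_type rest(list_type L) { return (L == the_empty_list) ? L : L->next; }

list_type cons(list_item_type X, list_type L)
{
  list_type cell = malloc(sizeof *cell);   /* never freed in this sketch */
  assert(cell != NULL);
  cell->item = X;
  cell->next = L;
  return cell;
}

typedef BOOL (*predicate)(list_item_type);

/* counter, as defined in these notes */
int counter(predicate P, list_type L)
{
  if (empty_list(L) == TRUE)
    return 0;
  if (P(first(L)) == TRUE)
    return 1 + counter(P, rest(L));
  return counter(P, rest(L));
}

/* Accepts every item, so counting with it yields the list length. */
BOOL AlwaysTrue(int x)
{
  (void)x;                   /* the value is deliberately ignored */
  return TRUE;
}

/* TRUE for odd x, including negative odd values. */
BOOL IsOdd(int x)
{
  return (x % 2 != 0) ? TRUE : FALSE;
}
```

For the list <1 2 3 4 5 6>, counter(AlwaysTrue, L) gives 6 and counter(IsOdd, L) gives 3. Testing x % 2 != 0 rather than x % 2 == 1 matters because the remainder of a negative odd number is -1 in C.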

The Filter Cliche

list_type filter(predicate P, list_type L)
{
  if (empty_list(L) == TRUE)
    return the_empty_list;
  if (P(first(L)) == TRUE)
    return cons(first(L), filter(P, rest(L)));
  return filter(P, rest(L));
}

For example, suppose L is the list <1 2 3 4 5 6>. Then, filter(IsOdd, L) returns the list <1 3 5>.

The Transform Cliche

list_type transform(transformer T, list_type L)
{
  if (empty_list(L) == TRUE)
   return the_empty_list;
  return cons(T(first(L)), transform(T, rest(L)));
}

For example, suppose we have the transformer

  int Square(int x)
   {
     return x * x;
   }

and the list L equal to <1 2 3>. Then transform(Square, L) returns the list <1 4 9>.

Composition of List Cliches

It is possible to get even greater functionality by composing list cliches. As a simple example, we wish to produce a list of the squares of the odd integers on the list L equal to <1 2 3 4 5 6 7 8>. The following composition does the trick, returning the list <1 9 25 49>.

  transform(Square, filter(IsOdd, L))
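
The composition can be tried end to end with the sketch below. As before, the primitives are stand-ins whose layout and item type are assumptions, and squares_of_odds is a hypothetical wrapper name introduced only for this example. The filter and transform functions are copied from the definitions in these notes.

```c
#include <assert.h>
#include <stdlib.h>

/* ASSUMED stand-ins for the lists.h primitives. */
typedef int BOOL;
#define TRUE 1
#define FALSE 0
typedef int list_item_type;
typedef struct node { list_item_type item; struct node *next; } *list_type;
#define the_empty_list ((list_type)0)

BOOL empty_list(list_type L) { return (L == the_empty_list) ? TRUE : FALSE; }
list_item_type first(list_type L) { assert(L != the_empty_list); return L->item; }
list_type rest(list_type L) { return (L == the_empty_list) ? L : L->next; }

list_type cons(list_item_type X, list_type L)
{
  list_type cell = malloc(sizeof *cell);   /* never freed in this sketch */
  assert(cell != NULL);
  cell->item = X;
  cell->next = L;
  return cell;
}

typedef BOOL (*predicate)(list_item_type);
typedef list_item_type (*transformer)(list_item_type);

list_type filter(predicate P, list_type L)
{
  if (empty_list(L) == TRUE)
    return the_empty_list;
  if (P(first(L)) == TRUE)
    return cons(first(L), filter(P, rest(L)));
  return filter(P, rest(L));
}

list_type transform(transformer T, list_type L)
{
  if (empty_list(L) == TRUE)
    return the_empty_list;
  return cons(T(first(L)), transform(T, rest(L)));
}

BOOL IsOdd(int x) { return (x % 2 != 0) ? TRUE : FALSE; }
int Square(int x) { return x * x; }

/* Hypothetical wrapper: squares of the odd items of L, in their original order. */
list_type squares_of_odds(list_type L)
{
  return transform(Square, filter(IsOdd, L));
}
```

The inner filter runs first and produces <1 3 5 7>; transform then squares each of those items. Neither call modifies L, since both build their results with cons.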

Exercises

Write a tail-recursive version of the function length given in the Length section.
Write a function last_item that returns the last item on a list. This differs from the function last (see the Last section), which returns a list containing the last item. What error condition must be avoided in last_item?
Write a function bool_member that acts like the function member of the Member section, but returns TRUE or FALSE instead of a list.
The reverse function in the Reverse section uses append. Write a reverse function that does not. Hint: use an auxiliary parameter in which you construct the reversed list. The base case returns this parameter's value (it starts off as the_empty_list).
Write the function void fprint_list(list_type L, char * format, FILE * stream) that acts like the print_list function of the Printing Lists section, but prints its output on stream.
Write a function that uses the counter of the Counter Cliche section to count the number of negative integers on a list of integers. Be sure to write the appropriate predicate.
Write a function that uses the filter of the Filter Cliche section to produce a list of the negative integers on a given list.
Write a function that uses the transform of the Transform Cliche section to produce a list of the negatives of the integers on a given list.
Write a recursive C function
```
   list_type merge(list_type L1, list_type L2)
```
that takes two sorted lists (increasing order) and returns the sorted list composed of the elements of lists L1 and L2. You may assume that all items on the lists are unique (no duplicate values).
Write a recursive C function
```
   list_type delete_first_n_if(list_type L, predicate p, int n);
```
that returns a list composed of those items after the nth item on L that satisfy predicate p. Assume that the first item on a list is number 1. If n is greater than the length of list L, the_empty_list is to be returned. For example,
```
   delete_first_n_if(<1 2 3 4 5>, IsOdd, 2)  ==> <3 5>
   delete_first_n_if(<1 2 3 4 5>, IsEven, 2) ==> <4>
```

Thomas A. Anastasio
Thu Oct 2 22:39:40 EDT 1997