Project 1

Assigned Monday, Sept 19, 2011
Due 11:59pm, Tuesday Oct 4, 2011
Points 100
Updates In case you missed it, this message was sent to all students

The ChunkList project description's reference to a ChunkListNode inner class implies that you are required to write your own LinkedList code in which each node of that LinkedList is a ChunkListNode which in turn contains an ArrayList<T>. You are certainly welcome to take that approach using the author'\ s code from the course slides as a starting point if you wish.

However, after some discussion we have decided to allow an implementation of the ChunkList that uses existing Java classes -- LinkedList and ArrayList. In this implementation, a ChunkList contains a data member that is a LinkedList<ArrayList<T>> (a LinkedList of ArrayLists) and no ChunkListNode class is required. This approach allows and requires extensive use of LinkedList, ArrayList, and LinkedListIterator methods. You should become very familiar with these methods, they can do a lot of the work for you. The code for the inner ChunkListIterator class, the insert( ) method and remove( ) method is easier with this implementation, but still far from trivial. In our implementation the insert( ) method is about 60 lines of code, although that includes some duplication. The remove( ) method and the iterator implementation are about the same.

You would be well advised to focus on the insert( ) method and the iterator first. The insert( ) method and the iterator working together allow you to implement the print, union, intersection, and difference commands very quickly. A method to support the NODES command that prints out the internal structure of the ChunkList (how many nodes, the contents of each node, etc.) is very helpful for debugging.

This isn't an easy project. Please feel free to see your instructor or TA for help, ask questions in class, or post to the blackboard.

Objectives

To gain experience
  1. modifying someone else's code.
  2. combining multiple data structures.
  3. implementing and using iterators in an application.
  4. designing Java classes.

Background

In class we have discussed both the LinkedList and ArrayList implementions of the List ADT, their respective pros and cons and their performance for the basic dictionary operations. In this project you will implment a List which is a hybrid of a LinkedList and ArrayList which we will call a "ChunkList".

A ChunkList has the following characteristics

  1. A ChunkList stores objects in sorted order.
  2. Duplicate objects are not allowed.
  3. A ChunkList contains nodes like a LinkedList.
  4. Each node contains multiple objects (the "chunk").
  5. The maximun number of objects in a node's chunk is specified in the ChunkList constructor and cannot change.
  6. Each node's chunk must be at least half full.
    I.e. if the maximum chunk size is 7, then each node's chunk must contain between 4 and 7 objects.
    An exception to this rule is allowed if the ChunkList contains less than or equal to the maximum chunk size objects (and therefore has just one node).


Description

In this project you will implment a ChunkList class as described above and write a small application that will read a file containg commands that exercise your class. Since the ChunkList is part LinkedList, your ChunkList class should contain an inner class for a ChunkList node. Also, since the ChunkList is an implementation of a List, your ChunkList class should also be Iterable and contain an inner class for a ChunkList iterator. See below for an extra credit opportunity.

Although we will only be creating ChunkLists which contain Integers, all of your classes must be generics using approriately bound parameters. You are welcome to use the linked list code from the course slides as a starting point for your project.

The application will instantiate an array of ChunkLists, then read the specified file and perform the commands that are read.

The Command Line

Project 1 will be invoked with a command line that consists of three arguments. The first argument will be an integer specifying the number of ChunkLists in the array. The second argument will be an integer specifying the maximun number of elements stored in each node (the chunk size). The third argument will be the name of the full path of a file that contains a set of commands that must be performed on the ChunkLists. Following is an example of the command line:
  ant -Dargs="20 5 /afs/umbc.edu/users/f/r/frey/pub/cs341f11/Proj1/test1.txt" run
Note that you must check command line arguments to ensure that they are valid, e.g. there are three arguments and the command file can be opened. Exit your program if any of these tests fail.

The Command File

Commands in the file specify a sequence of operations to be performed on ChunkLists. Each line in the file represents one command. Blank lines may appear anywhere in the file and should be ignored. Lines whose first character is the pound character ( # ) are comments and should also be ignored. Otherwise, you can assume that any line containing a command is syntactically well-formed. We make this assumption so you don't have to spend lots of time making the command file parser bullet proof.

As noted above, your application creates an array of ChunkLists. The arguments of each command include the index (or indicies) in this array of the ChunkList(s) on which the command is to be performed. You can assume that the array indicies are valid. You can also assume that other parameters in each command are the correct type. The command file format follows. All command names are in uppercase.

Note that for UNION, INTERSECTION and DIFFERENCE, the "result index" may be the same as one of the operand indicies.

ChunkList Operations

One important requirement of the ChunkList is that each node's chunk contains no more than the predefined maximum number of values and must be at least half-full (unless there is only one node). For example, if the maximum number of values for a node's chunk is 5, then every node's chunk must contain 3, 4, or 5 values (again except for the case in which there is only one node). This requirement means that inserting a new value into a node that is alreay full will result in a that node being split to create a new node. Conversely, deleting a value may cause a node's chunk to be less than half-full. In this case, values may be moved from one node to another or nodes may be merged so that all nodes are once again at least half-full.

See the insertion diagrams for examples that show how inserting a new value is performed in different cases. Also see the deletion diagrams for examples that show how deleting a value may affect the structure of the ChunkList.

ChunkList Iterator

To properly support a ChunkListIterator, the ChunkList class must implement the Iterable<T> interface to provide the iterator( ) method that creates ChunkListIterator objects for the user. See the Java API Iterable<T> interface documentation for more details.

The ChunkListIterator class must implement the Iterator<T> interface which requires the methods below. The ChunkListIterator traverses the ChunkList from the first element to the last..

See the Java Iterator<T> interface documentation for more details.

Basic Project Requirements

  1. All commands and their parameters must be echoed as part of your program output.
  2. Your program must exhibit appropriate object-oriented design. In particular the ChunkList class must be reusable and contain no application-specific methods.
  3. Good OO design also exhibits a high degree of abstraction. The ChunkList iterator is one such abstraction that hides the nature of the ChunkList from the user. User code must use ChunkList iterators and have no access to the inner workings of the ChunkList.
  4. For purposes of this project, intersection, union, difference, printing and reverse printing should be considerd as application specific operations.

Extra Credit

The required ChunkListIterator class must implement the Iterator<T> interface. However, since the ChunkList implements a List, it is more appropriate that the ChunkListIterator implement the ListIterator<T> interface. A ListIterator is bi-directional and includes the methods previous( ) and hasPrevious( ) to support backwards traversal of the ChunkList.

For 10 points of extra credit

Sample Output

This partial sample output is provided as an example of an acceptable output format only, showing the minimum required data. It is not the result of executing any program.
linux2[1]  ant -Dargs="20 5 /afs/umbc.edu/users/f/r/frey/pub/cs341f11/Proj1/test1.txt" run
INSERT 0 2

INSERT 0 3

PRINT 0
ChunkList0: 2, 3

INSERT 1 10

INSERT 1 4

PRINT 1
ChunkList1: 4, 10

UNION 2 1 0

PRINT 2
ChunkList2: 2, 3, 4, 10

INTERSECTION 3 0 1

PRINT 3
ChunkList3: Empty

INSERT 0 5

INSERT 0 11

INSERT 0 8

INSERT 0 22

NODES 0
ChunkList0 contains 2 node(s)
Node 1: 3 elements
Node 2: 3 elements

Grading and Academic Integrity

Project grading is described in the Project Policy handout.

Cheating in any form will not be tolerated. Please re-read the Project Policy handout for further details on honesty in doing projects for this course.

Testing of your program will fall into the following categories.

  1. Basic Tests include (but are not limited to) the following
    1. Inserting a few (less than or equal to chunk size) values in sorted order into a single ChunkList.
    2. Inserting more than chunk size (but less than 2 * chunk size) values into a single ChunkList.
    3. Inserting a few values, then removing a value from a single ChunkList.
    4. Finding a value in a ChunkList.
    5. Intersection, Union, and Difference on two small ChunkLists.
  2. More Complex Tests include (but are not limited to) the following
    1. Inserting many unsorted values into multiple ChunkLists.
    2. Inserting many sorted values into multiple ChunkLists.
    3. Union, Intersection, and Difference with identical ChunkLists.
    4. Union, Intersection, and Difference of a ChunkList with itself.
    5. Intersection, Union, and Difference in which the resulting ChunkList is one of the operands.
    6. Removing all elements from a ChunkList.
    7. Intersection, Union, and Difference in which one (or both) of the operands is an empty ChunkList.
  3. Atypical Tests include (but are not limited to) the following
    1. Printing an empty ChunkList.
    2. Attempting to remove a non-existent element.
    3. Attempting to insert a duplicate element.
    4. Attempting to find a non-existent element.
    5. A non-existent command file.
  4. Stress Tests include (but are not limited to) the following
    1. Union, Intersection, and Difference of all ChunkLists.
    2. Inserting a very large number of elements into a ChunkList.

Questions

For this project, 10% of your project grade is based on your answers to questions about the ChunkList. Edit the file P1Questions.txt that was included in your initial project repository to insert your answers to the questions. Be sure to submit your answers by comitting your file to your CVS repository.

Coding Standards and Documentation

For this project, 10% of your project grade will allocated to how well you follow the course coding standards and the degree to which classes and methods are documentd. Classes and methods should be commented using javadoc style comments so that the command ant doc can be used to extract them. Only javadoc style comments will be reviewed to detemine the degree to which your project meets the documentation requirement.

Files To Be Submitted

Files are submitted by being committed to your CVS repository. See the directions for using CVS to submit your code and the CVS utilities for verifying that your code was submitted successfully. Submit only your .java files and your .txt file with the answers to the project questions.

Remember, the due date is firm. Submittals made after midnight of the due date will result in the use of grace days. A maximum of two grace days are allowed for each project. Do not submit any files after that time.