## CSci 2101 Data Structures: Problem set 5

Due Wednesday, May 4 at 8pm.

### Problem 1. AVL Trees (15 points)

Fill in the missing code in this implementation of AVL trees. Make sure to test it well and submit all your test cases. Print out the tree in in-order and pre-order to see the positions of all elements. Explain the results of the tests and show that the tree is indeed an AVL tree after each insert. You may submit this part of the problem on paper (draw the resulting tree after each insert).

### Problem 2. Hash tables (6 points)

Given this implementation of hash tables, add code to compute the number of collisions that happen when elements are inserted. Note that you shouldn't include collisions for searches.

Print out the total number of collisions that happen if you are inserting random elements between 0 and 999 into the hash tables that use the three different methods of collision resolution. Specifically, you need to compute the average number of collisions per inserted element if the table is (approximately) 1/4 full, 1/2 full, 3/4 full, and completely full. Note that the quadratic probing uses a table of 64 elements, and the other two tables have 67 elements.

The results for number of collisions should be printed to a file `statistics.txt`.

### Problem 3: Counting Sort (15 points)

Counting sort is an algorithm for sorting elements with integers keys, where all keys are in a known interval. The algorithm is described in CLRS Section 8.2 (pp. 168). For this problem we assume that the keys are integers from 1 to 20 (inclusive) and two elements may have the same key value.

We assume that objects sorted by the algorithm implement the following abstract class `KeyOneTwenty`:

``````
public abstract class KeyOneTwenty {

// public method that returns a key only if
// it's between 1 and 20
public final int Key() {
int key = _Key();
if (key < 1 || key > 20) {
System.out.println("invalid key " + key);
System.exit(0);
}
return key;
}

// private class-specific method
protected abstract int _Key();
}
``````
Your task is to implement the counting sort algorithm as a method
``````
public static void countingSort(KeyOneTwenty [] items)
``````
You also need to write a class `StringKey` which extends `KeyOneTwenty`. The class has one private variable `str` of type String. The `_Key()` method of the class should return the length of the string.

Test your counting sort algorithm by creating an array of StringKeys in main and calling the method countingSort on that array.

### Problem 4: Adding exceptions to Counting Sort (6 points)

Modify your counting sort solution as follows:
• Write a class `InvalidKeyValueException` which is an extension of `Exception` (not `RuntimeException`!).
• Change the class `KeyOneTwenty` so that the Key method throws the `InvalidKeyValueException` when the key is invalid (instead of exiting the program via System.exit(0)). No error message should be printed.
• Modify the counting sort method so that the program doesn't exit when invalid keys are encountered. Instead these elements are ignored during sorting, and the rest of the elements are sorted as usual. A message "Invalid key ..." (where ... is replaced by the actual incorrect key value). At the end of the sorting method print out the number of elements with invalid keys.

Check your program to make sure that only the `InvalidKeyValueException` is caught. The rest of the exceptions should behave as usual (i.e. print the error message and terminate the program).

Submit all of your test cases.

### Problem 5: Adjacency matrix implementation of a graph (15 points)

In this program you have to write an implementation of a graph using an adjacency matrix. Your implementation must work exactly the same way as the adjacency list implementation. Specifically, if I replace the class Graph in the testing program by your implementation, the output of the program must be exactly the same. However, you should use a 2-dimensional array for adjacency matrix instead of classes Vertex and Edge.
You need to implement Breadth-First Search (BFS) and Depth-First Search (DFS) on your graph. You also need to compare the running time of BFS and, separately, DFS for the two implementations.

More specifically, you need to do the following:

1. Replace the array of vertices in the Graph class by a two-dimensional array of integers to represent the edges. You will NOT need the classes Vertex and Edge in your implementation at all, Graph is the only class in this implementation.
2. Change all the methods of the new Graph class to work with the new data representation. Do not change any of the method parameters or the types of return values.

You should start by commenting out the old method code, put some dummy return values for all non-void methods, such as an empty string for toString(), and make sure that the program compiles and runs. Then start writing code for the methods one by one, compile and test often.

3. Test your code for an undirected and a directed graph.
4. Compare the two implementations on random graphs of 1000 vertices and 10000 edges:
• Create a graph of 1000 vertices (note: use the IntQueue implementation that can hold at least 1000 elements)
• Use a random number generator to generate random starting and ending points of the graph. The resulting graph doesn't have to be connected, edges may be repeated, and self-loops are OK.
• The method `currentTimeMillis()` of the System class returns the current time in milliseconds (as a long integer). Get the reading of the time before and after a call to BFS() and DFS(), subtract the readings, and print out the results. For instance, to find out how long a BFS() method takes, we can write this:
``````
long time1 = System.currentTimeMillis();
g.BFS();
long time2 = System.currentTimeMillis();
System.out.println("The method took " + (time2 - time1) + " milliseconds");
``````
Important: print statements take up a lot of running time. Comment out all print statements when you are running your tests. Print out the times only at the end of the program.
• Measure the running times for the two implementations, compute the averages of 5 runs for each (on paper or in a program), and compare the results. Which implementation is more efficient?
• Extra credit 1. Do the results depend on whether the graphs are sparse or dense? Investigate the differences. Submit your tests, the results, and the conclusions.
• Extra credit 2. Do the times depend on how efficient your queue implementation is? Does the comparison between the two implementations depend on the efficiency of the queue? Explain your reasoning. You don't have to run tests for this question, but you can if you'd like.

This is a problem set from CSci 2101 course.

The views and opinions expressed in this page are strictly those of the page author. The contents of this page have not been reviewed or approved by the University of Minnesota.