# Data structure problems

There are a large number of possible data structures that are generally useful. These include lists (providing sequential access to a collection of values), vectors (providing random access to a collection of values), maps (associative arrays, providing efficent access to an item specified by a key), sets (unordered collections of unique values), and so on. The problems here are intended as a way of focussing attention on the range of applications that the data structures may be used in, and as a gauge for checking that any proposals are capable of dealing with a range of uses. Suggestions for additional problems are welcome!

1. Generate a list of the ten commonest words in a file. This requires a mapping of string keys to integer values, and the ability to extract the contents into a list of (string,integer) pairs which can be sorted based on the integer component of the pair.

2. Given a tree representing a hierarchy of windows in a GUI, build a list of "pointers" to selected tree nodes (e.g. all buttons in the hierarchy) so that it can be traversed to selectively update that particular set of windows (e.g. by changing the font used for the text of each button). This requires the ability to build arbitrary tree structures and to keep "pointers" to individual nodes outside the tree itself.

3. Implement an efficient sorting algorithm (e.g. quicksort) for a collection of values. This requires that the collection should allow for adequately efficient sorting without the need for a special "sort" primitive. The algorithm should be applicable to a variety of data structures (all proposed data structures?) without any changes. This requires some common protocol for accessing items in different data structures.

4. Generate the set of all possible permutations of a list of values. This requires the ability to recursively permute sublists of the original list.

5. Build a directed graph to represent an non-deterministic finite state automaton (NFA). This requires that the number of edges leading from a particular node can be made arbitrarily large, and the number of edges leading to a particular node can also be made arbitrarily large. Note that matching using an NFA requires sets of states to be manipulated.

6. Implement a stack using another more general data structure (e.g. a list), and use this together with a directed graph representing a maze to find a route through the maze.

Some related design issues:

• Should controlled types be used? If so, automatic storage reclamation as part of finalization is possible, but generic packages using controlled types can only be instatiated at library level.

• Should it be possible to design algorithms which can be applied equally well to any data structure (although there may be a performance difference between particular data structures)?

• Should structures such as stacks, queues, deques and so on be provided, or should a library focus on types such as lists which can be used to build such specialised types?

• Should common algorithms like those in the C++ STL be provided (partition, merge, and so on)?

• What additional data types (other than data collections) would be useful, such as reference-counted objects?