July 18, 2011, 7:35 p.m.

posted by tailcall

## Implementing SetWhen we used the methods of There are six concrete implementations of
## Implementations of the Set interfaceWe will discuss ## HashSetThis class is the most commonly used implementation of An element's position in a hash table is calculated by a hash function
of its contents. Hash functions are designed to give, as far as possible, an even spread of results (hash codes) over the element values that might be stored. For example, here is code like that used in the int hash = 0; for (char ch : str.toCharArray()) { hash = hash * 31 + ch; } Traditionally, hash tables obtain an index from the hash code by taking the remainder after division by the table length. The Collections Framework classes actually use bit masking rather than division. Since that means it is the pattern of bits at the low end of the hash code that is significant, prime numbers (such as 31, here) are used in calculating the hash code because multiplying by primes will not tend to shift information away from the low endas would multiplying by a power of 2, for example. A moment's thought will show that, unless your table has more locations than there are values that might be stored in it, sometimes two distinct values must hash to the same location in the hash table. For instance, no ## A hash table with chained overflowSet<Character> s1 = new HashSet<Character>(8); s1.add('a'); s1.add('b'); s1.add('j'); The index values of the table elements have been calculated by using the bottom three bits (for a table of length 8) of the hash code of each element. In this implementation, a As long as there are no collisions, the cost of inserting or retrieving an element is constant. As the hash table fills, collisions become more likely; assuming a good hash function, the probability of a collision in a lightly loaded table is proportional to its load, defined as the number of elements in the table divided by its capacity (the number of buckets). If a collision does take place, a linked list has to be created and subsequently traversed, adding an extra cost to insertion proportional to the number of elements in the list. If the size of the hash table is fixed, performance will worsen as more elements are added and the load increases. To prevent this from happening, the table size is increased by rehashingcopying to a new and larger tablewhen the load reaches a specified threshold (its load factor). Iterating over a hash table requires each bucket to be examined to see whether it is occupied and therefore costs a time proportional to the capacity of the hash table plus the number of elements it contains. Since the iterator examines each bucket in turn, the order in which elements are returned depends on their hash codes, so there is no guarantee as to the order in which the elements will be returned. The hash table shown in Figure yields its elements in order of descending table index and forward traversal of the linked lists. Printing it produces the following output: [j, b, a] Later in this section we will look at The chief attraction of a hash table implementation for sets is the (ideally) constant-time performance for the basic operations of
HashSet(int initialCapacity) HashSet(int initialCapacity, float loadFactor) Both of these constructors create an empty set but allow some control over the size of the underlying table, creating one with a length of the next-largest power of 2 after the supplied capacity. Most of the hash tablebased implementations in the Collections Framework have similar constructors, although Joshua Bloch, the original designer of the Framework, has told us that new classes will no longer usually have configuration parameters like the load factor; they are not generally useful, and they limit the possibilities of improving implementations at a later date.
## LinkedHashSetThis class inherits from ## A linked hash tableSet<Character> s2 = new LinkedHashSet<Character>(8); Collections.addAll(s2, 'a', 'b', 'j'); // iterators of a LinkedHashSet return their elements in proper order: assert s2.toString().equals("[a, b, j]"); The linked structure also has a useful consequence in terms of improved performance
for iteration: The constructors
for ## CopyOnWriteArraySetIn functional terms, When would you want to use a set with these characteristics? In fairly specialized cases; one that is quite common is in the implementation of the Subject-Observer
design pattern (see Section 9.5), which requires events to be notified to a set of observers. This set must not be modified during the process of notification; with locking set implementations, read and write operations share the overhead necessary to ensure this, whereas with Iterators for Since there are no configuration parameters for ## EnumSetThis class exists to take advantage of the efficient implementations that are possible when the number of possible elements is fixed and a unique index can be assigned to each. These two conditions hold for a set of elements of the same
<E extends Enum<E>> EnumSet<E> of(E first, E... rest) // create a set initially containing the specified elements <E extends Enum<E>> EnumSet<E> range(E from, E to) // create a set initially containing all of the elements in // the range defined by the two specified endpoints <E extends Enum<E>> EnumSet<E> allOf(Class<E> elementType) // create a set initially containing all elements in elementType <E extends Enum<E>> EnumSet<E> noneOf(Class<E> elementType) // create a set of elementType, initially empty An Common cases for <E extends Enum<E>> EnumSet<E> of(E e) <E extends Enum<E>> EnumSet<E> of(E e1, E e2) <E extends Enum<E>> EnumSet<E> of(E e1, E e2, E e3) <E extends Enum<E>> EnumSet<E> of(E e1, E e2, E e3, E e4) <E extends Enum<E>> EnumSet<E> of(E e1, E e2, E e3, E e4, E e5) The third set of methods allows the creation of an <E extends Enum<E>> EnumSet<E> copyOf(EnumSet<E> s) // create an EnumSet with the same element type as s, and // with the same elements <E extends Enum<E>> EnumSet<E> copyOf(Collection<E> c) // create an EnumSet from the elements of c, which must contain // at least one element <E extends Enum<E>> EnumSet<E> complementOf(EnumSet<E> s) // create an EnumSet with the same element type as s, // containing the elements not in s The collection supplied as the argument to the second version of In use, |

- Comment