July 22, 2011, 10:44 a.m.
posted by tailcall
Comparing Set Implementations
The figure shows the comparative performance of the different Set implementations. When you are choosing an implementation, efficiency is of course only one of the factors you should take into account. Some of these implementations are specialized for specific situations; for example, EnumSet should always (and only) be used to represent sets of enum constants. Similarly, CopyOnWriteArraySet should be used only where the set will remain relatively small, read operations greatly outnumber writes, thread safety is required, and read-only iterators are acceptable.
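To make the two specialized cases concrete, here is a minimal sketch. The Day enum and the class and method names are hypothetical, invented only for illustration. It shows EnumSet holding enum constants and a CopyOnWriteArraySet iterator refusing to modify the set, which is what "read-only iterators" means in practice.

```java
import java.util.EnumSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

public class SpecializedSets {
    // Hypothetical enum, used only for this illustration.
    enum Day { MON, TUE, WED, THU, FRI, SAT, SUN }

    // EnumSet is the natural representation for a set of enum constants.
    static Set<Day> weekend() {
        return EnumSet.of(Day.SAT, Day.SUN);
    }

    // complementOf returns every constant of the enum not in the argument.
    static Set<Day> weekdays() {
        return EnumSet.complementOf(EnumSet.of(Day.SAT, Day.SUN));
    }

    // CopyOnWriteArraySet iterators work on a snapshot and do not support remove().
    static boolean iteratorIsReadOnly() {
        Set<String> listeners = new CopyOnWriteArraySet<>();
        listeners.add("listener-1");
        Iterator<String> it = listeners.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(weekend());            // [SAT, SUN]
        System.out.println(weekdays());           // [MON, TUE, WED, THU, FRI]
        System.out.println(iteratorIsReadOnly()); // true
    }
}
```

EnumSet iterates in the declaration order of the enum constants, which is why the printed sets follow the order in which Day is declared.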
That leaves the general-purpose implementations: HashSet, LinkedHashSet, TreeSet, and ConcurrentSkipListSet. The first three are not thread-safe, so they can be used in multi-threaded code only in conjunction with client-side locking or when wrapped in Collections.synchronizedSet (see Section 17.3.1). For single-threaded applications where there is no requirement for the set to be sorted, your choice is between HashSet and LinkedHashSet. If your application will be frequently iterating over the set, or if you require iteration in insertion order, LinkedHashSet is the implementation of choice.
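As a rough sketch of the two points above (class and method names are hypothetical, and the lambdas assume Java 8 or later), the following shows a LinkedHashSet iterating in insertion order and a HashSet shared between threads through the Collections.synchronizedSet wrapper. Note that the wrapper synchronizes individual calls only; iterating over it still needs an explicit lock on the wrapper, as its documentation requires.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class GeneralPurposeSets {
    // LinkedHashSet iterates in insertion order; re-adding an element does not move it.
    static List<String> insertionOrder() {
        Set<String> fruit = new LinkedHashSet<>();
        fruit.add("pear");
        fruit.add("apple");
        fruit.add("fig");
        fruit.add("apple"); // duplicate, ignored
        return new ArrayList<>(fruit);
    }

    // Two threads add disjoint ranges to a HashSet wrapped by synchronizedSet.
    static int concurrentAdds() {
        Set<Integer> shared = Collections.synchronizedSet(new HashSet<Integer>());
        Thread a = new Thread(() -> { for (int i = 0; i < 1000; i++) shared.add(i); });
        Thread b = new Thread(() -> { for (int i = 1000; i < 2000; i++) shared.add(i); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int count = 0;
        // Iteration is not atomic, so the caller must hold the wrapper's lock.
        synchronized (shared) {
            for (Integer ignored : shared) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(insertionOrder()); // [pear, apple, fig]
        System.out.println(concurrentAdds()); // 2000
    }
}
```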
Finally, if you require the set to sort its elements, the choice is between TreeSet and ConcurrentSkipListSet. In a multi-threaded environment, ConcurrentSkipListSet is the only sensible choice. Even in single-threaded code, ConcurrentSkipListSet may not perform significantly worse for small set sizes. For larger sets, however, or for applications with frequent element deletions, TreeSet will perform better if your application doesn't require thread safety.
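Because TreeSet and ConcurrentSkipListSet both implement NavigableSet, the decision can be confined to a single place in the code. The sketch below uses a hypothetical factory method (newSortedSet is not a library call) to illustrate this; the rest of the code works against the interface and behaves the same with either implementation.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListSet;

public class SortedSets {
    // Hypothetical factory: choose the implementation from the threading requirement.
    static NavigableSet<Integer> newSortedSet(boolean threadSafe) {
        if (threadSafe) {
            return new ConcurrentSkipListSet<Integer>();
        }
        return new TreeSet<Integer>();
    }

    // Adds a few unsorted values; either implementation keeps them in natural order.
    static NavigableSet<Integer> fill(NavigableSet<Integer> set) {
        set.addAll(Arrays.asList(42, 7, 19, 3));
        return set;
    }

    public static void main(String[] args) {
        NavigableSet<Integer> single = fill(newSortedSet(false));
        NavigableSet<Integer> shared = fill(newSortedSet(true));
        System.out.println(single);             // [3, 7, 19, 42]
        System.out.println(shared);             // [3, 7, 19, 42]
        System.out.println(single.headSet(19)); // [3, 7]
        System.out.println(shared.first());     // 3
    }
}
```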