May 10, 2011, 6:40 a.m.
posted by tekkero
Introducing Collection Class Interfaces
The previous section examined the most common collection classes and the methods that are specific to them. This section delves into the collection-related interfaces in order to understand the common capabilities of all collection classes. Understanding the interfaces implemented by a collection class gives you a quick overview of a collection's capabilities. Knowing which interfaces a collection class implements is a quick filtering mechanism for choosing the correct collection.
Figure shows the hierarchy of interfaces that make up the collection classes.
7. Generic Collection Interface Hierarchy
You use these interfaces to establish capabilities such as iterating over the collection using a foreach loop, indexing into a collection, and determining the total number of elements in the collection. This section examines these interfaces, starting at the bottom of Figure and moving upward.
IList<T> versus IDictionary<TKey, TValue>
When selecting a collection class, the first two interfaces to look for are IList<T> and IDictionary<TKey, TValue>. These interfaces determine whether the collection type is focused on retrieval via index or retrieval via key. If the type of collection you are using should be key-centric, use a collection class that implements the IDictionary<TKey, TValue> interface. Alternatively, the IList<T> interface provides support for element retrieval via index. In other words, although both of these interfaces require that the indexer be implemented, the implementations are fundamentally different. In the case of IList<T>, the parameter passed to the array operator corresponds to the index of the element being retrieved, the nth element in the list. In the case of the IDictionary<TKey, TValue> interface, the parameter corresponds to the key of a previously inserted element. When assigning using the key, a new item will be inserted if one doesn't already exist for the specified key.
Before I discuss the next interface in Figure, I need to discuss an interface that does not appear in the diagram but is nonetheless key to both IList<T> and IDictionary<TKey, TValue>. The IComparable<T> interface is crucial for any sorting operation by classes implementing these interfaces. For example, if the List<T>.Sort() method is called, you need a means to compare objects to determine their order. One way to do this is via the IComparable<T> interface. This interface has one method, CompareTo(). It returns an integer indicating whether the element passed is greater than, less than, or equal to the current element. For this to work the key data type needs to implement IComparable<T>.
Both IList<T> and IDictionary<TKey, TValue> are derived from ICollection<T>. Furthermore, even if a collection does not implement either IList<T> or IDictionary<TKey, TValue> it will almost certainly support ICollection<T>. This interface is derived from IEnumerable<T> and includes two members: Count and CopyTo().
Iterating Using a foreach Loop
Chapter 3 showed how to use a foreach statement to iterate over an array of elements. The syntax is simple and avoids the complication of knowing how many elements there are. The runtime does not directly support the foreach statement, however. Instead, the C# compiler transforms the code as described in this section.
foreach with Arrays
Consider Listing 12.9.
Listing 12.9. foreach with Arrays
From this code, the C# compiler creates a CIL equivalent of the for loop, as shown in Listing 12.10.
Listing 12.10. Compiled Implementation of foreach with Arrays
In this example, note that foreach relies on support for the Length property and the index operator (). With the Length property, the C# compiler can use the for statement to iterate through each element in the array.
foreach with IEnumerable<T>
Although the code shown in Listing 12.10 works well on arrays where the length is fixed and the index operator is always supported, not all types of collections have a known number of elements. Furthermore, many of the collection classes, including the stack, queue, and dictionary classes, do not support retrieving elements by index. Therefore, a more general approach of iterating over collections of elements is needed. The iterator pattern provides this capability. Assuming you can determine the next element and when there are no more elements, knowing the count and supporting retrieval of elements by index is unnecessary.
The System.Collections.Generic.IEnumerator<T> and nongeneric System.Collections.IEnumerator interfaces (see Listing 12.12) are designed to enable the iterator pattern for iterating over collections of elements, rather than the length-index pattern shown in Listing 12.10. A class diagram of their relationships appears in Figure.
Figure. IEnumerator<T> and IEnumerator Interfaces
IEnumerator, which IEnumerator<T> derives from, includes three members. The first is bool MoveNext(). Using this method, you can move from one element within the collection to the next while at the same time detecting when you have enumerated through every item. The second member, a read-only property called Current, returns the element currently in process. Current is overloaded in IEnumerator<T>, providing a type-specific implementation of it. With these two members on the collection class, it is possible to iterate over the collection simply using a while loop, as demonstrated in Listing 12.11. (The last member, Reset(), will reset the enumeration.)
Listing 12.11. Iterating over a Collection Using while
In Listing 12.11, the MoveNext() method returns false when it moves past the end of the collection. This replaces the need to count elements while looping.
This example shows the gist of the C# compiler output, but it doesn't actually compile because it omits two important details about the implementation: interleaving and error handling.
State Is Shared. The problem with Listing 12.11 is that if there are two interleaving loops over the same collection, one foreach inside another, the collection must maintain a state indicator of the current element so that when MoveNext() is called, the next element can be determined. The problem is that one interleaving loop can affect the other. (The same is true of loops executed by multiple threads.)
To overcome this problem, the collection classes do not support IEnumerator<T> and IEnumerator interfaces directly. As shown in Figure, there is a second interface, called IEnumerable<T> (IEnumerable is the nongeneric version), whose only method is GetEnumerator(). The purpose of this method is to return an object that supports IEnumerator<T>. Instead of the collection class maintaining the state, a different class, usually a nested class so that it has access to the internals of the collection, will support the IEnumerator<T> interface and will keep the state of the iteration loop. Using this pattern, the C# equivalent of a foreach loop will look like the code shown in Listing 12.12.
A Separate Enumerator Maintaining State during an Iteration
(Notice that if a nongeneric collection is used the enumerator will be of type IEnumerator rather than IEnumerator<T>, and an additional cast will occur when retrieving the enumerator.Current property.)
Error Handling. Since the classes that implement the IEnumerator<T> interface maintain the state, sometimes the state needs cleaning up after all iterations have completed. To achieve this, the IEnumerator<T> interface derives from IDisposable. Enumerators that implement IEnumerator do not necessarily implement IDisposable, but if they do, Dispose() will be called as well. This enables the calling of Dispose() after the foreach loop exits. The C# equivalent of the final CIL code, therefore, looks like Listing 12.13.
Listing 12.13. Compiled Result of foreach on Collections
Notice that because the IDisposable interface is supported by IEnumerator<T>, you can simplify the C# code with the using keyword, as shown in Listing 12.14.
Listing 12.14. Error Handling and Resource Cleanup with using
Do Not Modify Collections During foreach Iteration
Chapter 3 showed that the compiler prevents assignment of the foreach variable identifier (number). As is demonstrated in Listing 12.14, an assignment of number would not be a change to the collection element itself, so instead of mistakenly assuming that, the C# compiler prevents such an assignment altogether.
In addition, the element count within a collection cannot be modified during the execution of a foreach loop. If, for example, you called stack.Push(42) inside the foreach loop, it would be ambiguous whether the iterator should ignore or incorporate the change to stackin other words, whether iterator should iterate over the newly added item or ignore it and assume the same state as when it was instantiated.
Because of this ambiguity, an exception of type System.InvalidOperationException is thrown if the collection is modified within a foreach loop, reporting that the collection was modified after the enumerator was instantiated.