April 12, 2011, 8:54 a.m.
posted by sensei
Whether you are searching Active Directory using filters or with SQL, there are some important guidelines to follow that can help reduce load on the domain controllers, increase performance of your scripts and applications, and reduce the amount of traffic generated on the network. It is also important to socialize these concepts with others as much as possible. It takes only a couple of badly written search filters in a heavily used application to severely impact the performance of your domain controllers!
Understanding how to write efficient search criteria is the first important step to optimizing searches. By understanding a few key points, you can greatly improve the performance of your searches. It is also important to reuse data retrieved from searches or connections to Active Directory as much as possible. Microsoft has provided a paper on creating more efficient Active Directory-enabled applications. This paper has a lot of detailed information concerning efficient queries; you can find the paper at the URL:
The following list describes several key points to remember about searching:
objectClass Versus objectCategory
It is very important to understand the differences between objectClass and objectCategory and how they should be used during searches. The first point to be aware of from a searching standpoint is that objectClass is not indexed by default. Early concerns about the indexing code during the Windows 2000 alpha and beta periods suggested that only unique, single-valued attributes should be indexed. Even though the indexing process was corrected to allow for efficient indexing and retrieval of multi-valued attributes, objectClass indexing was not implemented in the OEM product. It is possible to index this attribute, however, and several large companies have found it quite desirable from a performance standpoint. See the Chapter 4 discussion on searchFlags for more information on this.
Another major difference is that objectClass is a multi-valued attribute that contains the objectClass hierarchy for an instantiated object. For example, a user object has the following values as part of its objectClass attribute:
top
person
organizationalPerson
user
That is because the user class inherits from the organizationalPerson class, which inherits from the person class, which inherits from the top class. When a class inherits from another, the attributes of the inherited class (also known as the parent class) are available for the inheriting class to use. A class can inherit attributes from abstract and structural classes, which would show up in the objectClass attribute for an instantiated object, but auxiliary classes that get statically associated with a particular class do not. Statically associated auxiliary classes allow for a grouping of attributes to be associated with one or more classes in a similar manner to just adding attributes directly to a class's definition and this association can be determined by looking at the schema. Windows Server 2003 Active Directory in forest functional mode allows for dynamically associated auxiliary classes; these classes do show up in the objectClass attribute for the instantiated object because you can dynamically associate auxiliary classes on an object by object basis.
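The inheritance chain just described can be modelled in a few lines of Python. This is only an illustrative sketch, not how Active Directory stores its schema: the `SUBCLASS_OF` table and the `object_class_values` function are hypothetical names, and only the classes mentioned in the text are included.

```python
# Parent of each class, following the inheritance chain described above.
# "computer" is included because it inherits from "user".
# This table is a hand-written assumption, not read from a real schema.
SUBCLASS_OF = {
    "top": None,
    "person": "top",
    "organizationalPerson": "person",
    "user": "organizationalPerson",
    "computer": "user",
}


def object_class_values(structural_class):
    """Return the objectClass values an instance of the class would carry."""
    chain = []
    current = structural_class
    while current is not None:
        chain.append(current)
        current = SUBCLASS_OF[current]
    return list(reversed(chain))  # most general class first


print(object_class_values("user"))
# ['top', 'person', 'organizationalPerson', 'user']
print(object_class_values("computer"))
# ['top', 'person', 'organizationalPerson', 'user', 'computer']
```

Note that `user` appears in the values for a computer object, which matters as soon as you search on objectClass alone.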
The objectCategory attribute, on the other hand, is a single-valued indexed attribute that specifies a classification for a type of object. It is intended to be an easy way to query for a certain "category" of objects, such as "Person." As an example, both user and contact objects have an objectCategory of Person, so a search for (objectCategory=Person) can return both user and contact objects.
For the reasons listed previously, queries should use objectCategory or a combination of objectClass and objectCategory as part of the search filter or SQL. The primary reason for not using just objectClass is that it is not indexed and is multi-valued, which does not make for an efficient query. The other classic problem with using only objectClass is that you can end up with more object types than you were expecting. This is a common problem with using (objectClass=user). You would think you'd only get user objects back using that filter, but you can also get computer objects, since the computer class inherits from the user class (which makes user one of the values of the objectClass attribute on every computer object). And even though it would be efficient to use only objectCategory because it is indexed, it falls into the same trap as objectClass, because objects other than the ones you are targeting may be returned (e.g., both user objects and contact objects). It is for these reasons that you should try to use a combination of objectClass and objectCategory in your searches.
Several examples are included next to illustrate what using various combinations of objectClass and objectCategory can return:
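As a rough illustration, the following Python sketch models three directory entries in memory and shows what each combination would match. The entries, their names, and the `search` helper are all hypothetical; real objectCategory values are distinguished names of schema classes, shortened here to the names you would type in a filter.

```python
# Hypothetical sample entries standing in for directory objects.
ENTRIES = [
    {"cn": "alice",
     "objectClass": ["top", "person", "organizationalPerson", "user"],
     "objectCategory": "person"},
    {"cn": "WKSTN01",
     "objectClass": ["top", "person", "organizationalPerson", "user", "computer"],
     "objectCategory": "computer"},
    {"cn": "bob-contact",
     "objectClass": ["top", "person", "organizationalPerson", "contact"],
     "objectCategory": "person"},
]


def search(object_class=None, object_category=None):
    """Return the cn of every entry matching all the criteria that are given."""
    hits = []
    for entry in ENTRIES:
        if object_class is not None and object_class not in entry["objectClass"]:
            continue
        if (object_category is not None
                and entry["objectCategory"].lower() != object_category.lower()):
            continue
        hits.append(entry["cn"])
    return hits


# (objectClass=user): also matches the computer object
print(search(object_class="user"))              # ['alice', 'WKSTN01']
# (objectCategory=person): also matches the contact object
print(search(object_category="person"))         # ['alice', 'bob-contact']
# (&(objectCategory=person)(objectClass=user)): user objects only
print(search(object_class="user", object_category="person"))  # ['alice']
```

Only the combined form returns user objects alone, which is why the combination is the safer default.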
Filtering an Existing Resultset
An optimization technique that can be used when you need to perform a lot of queries is to instead perform one large query and repeatedly filter the resultset to get the subset of entries you want. It is possible to select particular items from a resultset by using the Recordset::Filter property method. Once the Recordset::Filter property has been set, you can access only the items in the resultset that match the filter. Properties such as Recordset::RecordCount then return only the number of items that match the filter. If you then set the filter back to an empty string, the whole resultset is available again. Because filtering a resultset relies on data that is present in the resultset, you can filter only on the fields in the Fields collection and their values. For example, if you only specify to return the givenName and sn attributes in a query, you can use only those attributes to filter the resultset later. If you do not return cn as a field, there is no way to filter on it later.
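The restriction to returned fields can be sketched with plain Python data. This is a model of the behavior described above rather than ADO itself: `ROWS` stands in for a resultset that requested only givenName and sn, and `filter_rows` is a hypothetical helper with made-up sample names.

```python
# Rows as they might come back from a query that requested only givenName and sn.
ROWS = [
    {"givenName": "Vicky", "sn": "Launders"},
    {"givenName": "Keith", "sn": "Cooper"},
    {"givenName": "Victor", "sn": "Lee"},
]


def filter_rows(rows, field, predicate):
    """Return rows whose field satisfies predicate; reject fields not retrieved."""
    if any(field not in row for row in rows):
        raise KeyError(f"{field!r} was not returned by the query")
    return [row for row in rows if predicate(row[field])]


starts_with_v = filter_rows(ROWS, "givenName", lambda value: value.startswith("V"))
print(len(starts_with_v))  # 2: the count reflects only the filtered rows
print(len(ROWS))           # 3: the full resultset is still available

try:
    filter_rows(ROWS, "cn", lambda value: value.startswith("V"))
except KeyError as err:
    print("cannot filter:", err)  # cn was never retrieved
```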
Being able to filter an existing resultset is a useful tool, but only in certain situations. In our experience, it is especially useful in three situations:
Let's consider a contrived example where use of the Recordset::Filter makes some sense. Let's say we want to count how many usernames begin with each of the 26 letters of the alphabet. The most intuitive method is probably to execute 26 ADO searches and record the Recordset::RecordCount property for each. However, this will hit Active Directory with 26 separate searches. Now let's expand the requirement and say we need these totals recorded continually in a file every minute or so. By now, you may be unwilling to keep hitting Active Directory with this sort of traffic every minute. The other alternative is to execute a single search for all users and loop through the resultset using Recordset::MoveNext, updating an array of 26 counts as we go. This hits Active Directory only once, but it iterates through every item. This process is fast for a moderate number of users, but for a really large number of users, it is much slower. If your resultset returns, say, 20,000 users in a single search, you need to use Recordset::Filter.
To solve the problem, we can write a piece of code that executes one search and then sets 26 separate filters, recording the Recordset::RecordCount value at each stage. Figure contains the sample code, from which the values are written to the C:\out.txt file.
Using recordset filters to reduce the load on Active Directory
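The shape of that approach, one search followed by 26 filters, can be sketched in Python. The `FakeRecordset` class below is a hypothetical in-memory stand-in for an ADO Recordset, not the real object: its `filter_prefix` attribute plays the role of setting Recordset::Filter to a string such as cn LIKE 'a*', and `record_count` plays the role of Recordset::RecordCount. The names in the sample data are invented, and writing the totals to a file is left out.

```python
import string


class FakeRecordset:
    """Hypothetical stand-in for an ADO Recordset holding cn values in memory."""

    def __init__(self, names):
        self._all = list(names)   # fetched once, as if by a single search
        self.filter_prefix = ""   # "" means no filter, like Filter = ""

    @property
    def record_count(self):
        # Count only the records that match the current filter.
        prefix = self.filter_prefix.lower()
        return sum(1 for name in self._all if name.lower().startswith(prefix))


def count_by_initial(recordset):
    """Set 26 filters in turn and record the count for each letter."""
    counts = {}
    for letter in string.ascii_lowercase:
        recordset.filter_prefix = letter
        counts[letter] = recordset.record_count
    recordset.filter_prefix = ""  # restore the full resultset
    return counts


rs = FakeRecordset(["alice", "Adam", "bob", "vicky", "Victor", "zed"])
counts = count_by_initial(rs)
print(counts["a"], counts["b"], counts["v"], counts["c"])  # 2 1 2 0
print(rs.record_count)                                     # 6
```

The directory is queried once when the recordset is built; every later count is answered from the data already held by the client.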
The filter property must be set using a SQL-like query string, not an LDAP search filter. The recordset filter notation is fairly simple to use. The string can be an empty string (""), which removes the current filter; a criteria string; or an array of bookmarks. Bookmarks will be explained in more detail shortly.
Using a criteria string
The criteria string can take a number of different forms, which basically can be broken down to:
Field-name operator value-to-check
Here are some simple examples:
Name = 'vicky'   'Checks for exact equivalence (=)
size < 10   'Checks for less-than (<)
size > 10   'Checks for greater-than (>)
size >= 5   'Checks greater-than-or-equal-to (>=)
size <= 20   'Checks less-than-or-equal-to (<=)
size <> 10   'Checks for not-equal-to (<>)
Dates are simple to check if you surround them with pound signs (#):
Date = #12/12/99#
You also can use the keyword LIKE:
cn LIKE 'a*'   'Checks for all cn's beginning with "a"
cn LIKE 'ca%'   'Checks for all three-letter cn's beginning with "ca"
cn LIKE '*eithCoo*'
You can also use AND and OR:
size > 10 AND size < 20
cn LIKE 'a*' OR cn LIKE 'b*'
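However, there is a strict rule to follow if you want to group a criteria string containing OR with another string using AND: you cannot put the OR clauses in parentheses and then join that group to another clause with AND. Instead, you must repeat the AND condition inside each OR branch. This is sloppy, and Microsoft should look at fixing it in a later release: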
(cn LIKE 'a*' OR cn LIKE 'b*') AND (size <> 10)   'This is WRONG!
(cn LIKE 'a*' AND size <> 10) OR (cn LIKE 'b*' AND size <> 10)   'This is CORRECT!
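The two forms are logically equivalent, so rewriting a grouped OR into the expanded form does not change which records match. The Python sketch below checks this on a handful of made-up rows; the row data and the function names `grouped` and `expanded` are illustrative assumptions, and Python's `startswith` stands in for the LIKE prefix match.

```python
# Hypothetical rows with the two fields used in the criteria strings above.
ROWS = [
    {"cn": "alice", "size": 10},
    {"cn": "adam", "size": 12},
    {"cn": "bob", "size": 10},
    {"cn": "bella", "size": 7},
    {"cn": "carol", "size": 12},
]


def grouped(row):
    # (cn LIKE 'a*' OR cn LIKE 'b*') AND (size <> 10): the form to avoid
    return (row["cn"].startswith("a") or row["cn"].startswith("b")) and row["size"] != 10


def expanded(row):
    # (cn LIKE 'a*' AND size <> 10) OR (cn LIKE 'b*' AND size <> 10)
    return ((row["cn"].startswith("a") and row["size"] != 10)
            or (row["cn"].startswith("b") and row["size"] != 10))


# Both predicates select exactly the same rows.
assert [r for r in ROWS if grouped(r)] == [r for r in ROWS if expanded(r)]
print([r["cn"] for r in ROWS if expanded(r)])  # ['adam', 'bella']
```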
That should be enough to get you started.
Using bookmarks
Each object in a resultset has a bookmark associated with it. You can always obtain the bookmark for the current record and store it for later use by retrieving the value of Recordset::Bookmark. After recording the bookmark, you can instantly jump to that record in the resultset at any time by writing the recorded value back to the bookmark property. For example:
'Record the bookmark for the current record
objBookMark = objRS.Bookmark
'Do something
'Now return the current record to the record indicated by the bookmark
objRS.Bookmark = objBookMark
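The same save-and-restore pattern can be modelled in Python to make the behavior concrete. The `Cursor` class is a hypothetical stand-in for a recordset; it uses a plain integer position as its bookmark, whereas a real ADO bookmark should be treated as an opaque value.

```python
class Cursor:
    """Hypothetical in-memory cursor; the bookmark is simply the row position."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._position = 0

    @property
    def current(self):
        return self._rows[self._position]

    def move_next(self):
        self._position += 1

    @property
    def bookmark(self):
        return self._position

    @bookmark.setter
    def bookmark(self, value):
        self._position = value


cursor = Cursor(["alice", "bob", "carol", "dave"])
cursor.move_next()
saved = cursor.bookmark   # record the bookmark for the current record ("bob")
cursor.move_next()
cursor.move_next()        # do something else; the cursor is now on "dave"
cursor.bookmark = saved   # jump straight back to the bookmarked record
print(cursor.current)     # bob
```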