July 21, 2011, 10:18 p.m.
posted by vdv
Detailed Replication Transaction Descriptions
Now that we have a general idea of how replication works, let's examine details of the replication transactions themselves. This information helps to diagnose replication problems. It also helps to make critical architectural decisions such as where to place specific domain controllers and how to select the proper inter-site polling frequency.
Multiple-master replication raises several challenges:
Active Directory deals with the challenges of multi-master replication by embedding replication control information into each property. This information is called property metadata. The metadata is saved along with the property's primary value each time the property is modified. The values are written as an atomic transaction, meaning that if one value isn't written, none of them are written.
You can view the property metadata for an object in one of several ways. The simplest is to use the REPADMIN utility. This utility is available on all domain controllers. You'll need to know the distinguished name (DN) of the object you want to view. For instance, the DN for the Sites object in the Company.com domain would be: cn=Sites,cn=Configuration,dc=Company,dc=com. Here is the REPADMIN syntax and a sample listing. The property in bold was updated after the object was initially created:
repadmin /showmeta cn=sites,cn=configuration,dc=company,dc=com

10 entries.
Loc.USN  Originating DSA  Org.USN  Org.Time/Date        Ver  Attribute
=======  ===============  =======  ===================  ===  =========
1165     Phoenix\DC-01    1165     2002-02-23 17:48.40  1    objectClass
1165     Phoenix\DC-01    1165     2002-02-23 17:48.40  1    cn
2765     Phoenix\DC-02    2843     2002-02-24 18:14.10  3    description
1165     Phoenix\DC-01    1165     2002-02-23 17:48.40  1    instanceType
1165     Phoenix\DC-01    1165     2002-02-23 17:48.40  1    whenCreated
1165     Phoenix\DC-01    1165     2002-02-23 17:48.40  1    showInAdvancedViewOnly
1165     Phoenix\DC-01    1165     2002-02-23 17:48.40  1    nTSecurityDescriptor
1165     Phoenix\DC-01    1165     2002-02-23 17:48.40  1    name
1165     Phoenix\DC-01    1165     2002-02-23 17:48.40  1    systemFlags
1165     Phoenix\DC-01    1165     2002-02-23 17:48.40  1    objectCategory
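When a listing like this is long, it can help to pick out the modified attributes programmatically. The following is a minimal Python sketch, not a feature of REPADMIN. The names `MetaRow` and `parse_showmeta_row` are hypothetical, and the parser assumes every data row has exactly seven whitespace-separated fields, as in the sample above. A DSA name containing spaces would break it.

```python
from typing import NamedTuple


class MetaRow(NamedTuple):
    local_usn: int
    originating_dsa: str
    originating_usn: int
    timestamp: str
    version: int
    attribute: str


def parse_showmeta_row(line: str) -> MetaRow:
    # Assumed column order: Loc.USN, Originating DSA, Org.USN, date, time, Ver, Attribute
    loc_usn, dsa, org_usn, day, clock, ver, attr = line.split()
    return MetaRow(int(loc_usn), dsa, int(org_usn), f"{day} {clock}", int(ver), attr)


# Three rows copied from the sample listing (raw string keeps the backslashes).
SAMPLE = r"""
1165 Phoenix\DC-01 1165 2002-02-23 17:48.40 1 objectClass
2765 Phoenix\DC-02 2843 2002-02-24 18:14.10 3 description
1165 Phoenix\DC-01 1165 2002-02-23 17:48.40 1 name
"""

rows = [parse_showmeta_row(line) for line in SAMPLE.strip().splitlines()]

# Attributes whose version is above 1 were changed after the object was created.
modified = [row.attribute for row in rows if row.version > 1]
print(modified)
```

For the three sample rows, only description has a version above 1, which matches the property the text singles out as updated after creation.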
The Property Version Number (PVN), listed in the Ver column of the REPADMIN output, is the key value that determines whether two replicas are internally consistent. The entire replication system is geared to ensure that properties with the same PVN have the same primary value. This is how Active Directory resolves the data integrity challenge. The next few sections detail how property metadata values help control redundant replication requests, prevent circulating updates, and manage collisions.
Example Replication Transactions
The next few examples use a three-node replication ring in a single domain and single site as shown in Figure. The Global Catalog status does not matter because there is only one domain.
The example traces a change to a property for a user object with the common name cn=Al Bondigas. Here are a few of the property metadata values for the cn=Al_Bondigas object. The listing was taken by running REPADMIN on domain controller DC-01. The USN and PVN values for the Title property, shown in bold, are different because it was updated after Al's user object was created:
Loc. USN  Originating DSA  Org.USN  Org.Time/Date        PVN  Attribute
========  ===============  =======  ===================  ===  =========
1416      Atlanta\DC-01    1416     2002-02-01 01:20.50  1    objectClass
1416      Atlanta\DC-01    1416     2002-02-01 01:20.50  1    cn
1416      Atlanta\DC-01    1416     2002-02-01 01:20.50  1    description
14D3      Atlanta\DC-01    14D3     2002-02-02 12:14.31  2    title
1416      Atlanta\DC-01    1416     2002-02-01 01:20.50  1    department
Use of Update Sequence Number (USN)
The Local USN value prevents redundant replication requests. The system uses the Local USN as a high water mark to filter out all but the most current updates between replication partners. Procedure 7.2 shows how it works.
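The high-water-mark idea can be illustrated with a short Python sketch. This is a simplified model under stated assumptions, not the actual replication protocol: the `Update` type, the `pull` function, and the sample USN values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Update:
    local_usn: int   # USN assigned by the domain controller that holds this copy
    attribute: str
    value: str


def pull(source_updates: List[Update], high_water_mark: int) -> Tuple[List[Update], int]:
    """Return updates newer than the requester's mark, plus the new mark."""
    batch = [u for u in source_updates if u.local_usn > high_water_mark]
    new_mark = max((u.local_usn for u in batch), default=high_water_mark)
    return batch, new_mark


# Hypothetical change log on the replication partner.
partner_log = [
    Update(1414, "department", "Sales"),
    Update(1416, "description", "New hire"),
    Update(1453, "title", "Manager"),
]

# The requester has already seen everything up to USN 1416.
batch, mark = pull(partner_log, high_water_mark=1416)
print([u.attribute for u in batch], mark)

# A second request with the updated mark returns nothing new.
batch2, mark2 = pull(partner_log, high_water_mark=mark)
print(batch2, mark2)
```

The requester stores one mark per replication partner, so a request only ever asks for changes that partner has recorded since the last successful pull.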
If you get impatient working through process traces, I really don't blame you. Here are the important points to remember so far:
Use of Up-To-Dateness Vector
The circular nature of directory replication makes it possible for updates to propagate back to domain controllers that have already received the update. Unchecked, these updates keep circulating and circulating, the packets getting bigger and bigger, until the entire network hemorrhages in an Ebola virus of unchecked replication traffic.
The property metadata includes the identity of the originating server. When a domain controller receives a replication packet, it takes note of the originating server and the USN assigned by that server and stores this information in the UTD Vector table. The UTD Vector table contains the GUID and high USN for every domain controller that has ever originated an update.
A domain controller includes its copy of the UTD Vector along with the high water mark USN in each replication request. In effect, it says, "Give me the most recent updates and also don't bother giving me any updates that I've gotten from another source." Procedure 7.3 is an example of how this works.
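A rough Python sketch of that filtering follows. It is a simplified model, not the wire protocol: the `Update` type and `filter_updates` function are hypothetical, and plain server names stand in for the GUIDs that the real UTD Vector table stores.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Update:
    local_usn: int         # USN on the partner being asked
    originating_dsa: str   # server where the change was first made
    originating_usn: int   # USN assigned by the originating server
    attribute: str


def filter_updates(updates: List[Update], high_water_mark: int,
                   utd_vector: Dict[str, int]) -> List[Update]:
    """Keep updates that pass both the high water mark and the UTD Vector."""
    wanted = []
    for u in updates:
        if u.local_usn <= high_water_mark:
            continue  # already pulled from this partner
        if u.originating_usn <= utd_vector.get(u.originating_dsa, 0):
            continue  # already received from another source
        wanted.append(u)
    return wanted


# DC-02 holds two changes that are new relative to DC-03's high water mark.
dc02_log = [
    Update(3200, "DC-01", 1453, "title"),      # originated on DC-01
    Update(3201, "DC-02", 3201, "telephone"),  # originated on DC-02 itself
]

# DC-03 already received DC-01's change (USN 1453) directly from DC-01.
dc03_utd_vector = {"DC-01": 1453, "DC-02": 3100}

needed = filter_updates(dc02_log, high_water_mark=3100, utd_vector=dc03_utd_vector)
print([u.attribute for u in needed])
```

Without the second check, the title change would travel back to a server that already has it, which is the circulating update the UTD Vector is meant to stop.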
The UTD Vector is a critical component in preventing replication storms. Remember these important points:
Replication Collision Handling
So far, we've seen how property metadata ensures data consistency and controls unnecessary replication. We now need to address the final challenge of multi-master replication: how to deal with conflicting changes made on different domain controllers. These are called collisions. Several situations can cause a collision:
Deleted Object Handling
Objects deleted from Active Directory are not immediately removed from the database. This is because the system relies on replication to inform replication partners of changes, and it cannot very well replicate the absence of an object.
Following the object deletion, the domain controller notifies its replication partners. The partners pull a replication packet with an update that, essentially, changes the distinguished name of the object to move it to the Deleted Objects container. The replication partners perform this object move and strip the attributes.
The Deleted Objects container is not revealed in the Active Directory management consoles. Nor does it appear in the ADSI Edit tool. See the sidebar, "Viewing Deleted Objects," for a way to browse the deleted objects in a given naming context.
When the tombstone interval expires for a given object, the object is eligible for complete removal from the database. The removal is performed by the garbage collection process. Garbage collection runs every 12 hours. When garbage collection runs, it removes any expired tombstones then packs the database and re-indexes. Without doing this periodically, performance would degrade.
A 60-day tombstone interval may seem like a long time, but it is long for a purpose. It is essential that the presence of deleted objects not cause database corruption following a tape restore of Active Directory.
Consider what would happen if the tombstone interval were 1 day instead of 60. Let's say a domain controller fails. You are unable to find a viable tape newer than three days old—not an uncommon occurrence. You decide to use the tape because you know that the domain controller will update the older Active Directory copy by pulling changes from its replication partners.
Unfortunately, the tape is older than our hypothetical 1-day tombstone interval. This means that objects deleted since the tape backup may have been expunged from the directory database on other domain controllers. When you restore from the older tape, the objects would be in their original containers and there would be no information to the contrary in the databases of the replication partners. This would leave the objects in their original locations, corrupting the directory because the restored replica would now differ from its peers.
The 60-day interval gives you lots of leeway in doing a tape restore; but always keep in mind that any copy of Active Directory, whether it comes from tape or a disk image or wherever, becomes useless 61 days after it was obtained.
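The age check is simple date arithmetic. Here is a small Python sketch, assuming the 60-day interval described above; the function name `backup_is_usable` and the sample dates are hypothetical.

```python
from datetime import date, timedelta

TOMBSTONE_INTERVAL_DAYS = 60  # interval described in the text


def backup_is_usable(backup_taken: date, today: date,
                     tombstone_days: int = TOMBSTONE_INTERVAL_DAYS) -> bool:
    """A copy of the directory is safe to restore only while it is no older
    than the tombstone interval."""
    return (today - backup_taken) <= timedelta(days=tombstone_days)


today = date(2002, 3, 1)

# A three-day-old tape is fine with the 60-day interval...
print(backup_is_usable(date(2002, 2, 26), today))

# ...but not with the hypothetical 1-day interval from the example.
print(backup_is_usable(date(2002, 2, 26), today, tombstone_days=1))
```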
The Help Desk technician and the system administrator are logged on to different domain controllers. They make their changes during the same replication interval. This means that the modified attributes have the same PVN but contain different information.
Within the next 15 seconds, the changes begin to circulate around to the other domain controllers. Because the updates have the same PVN, the other domain controllers must decide which one to retain.
They start by using the timestamp applied by the originating domain controller as a tiebreaker. If one update were saved a few seconds after the other, it would be retained.
If the updates were saved at exactly the same time, the domain controllers apply an ultimate tiebreaker: a comparison of the GUID of the originating domain controller. The highest GUID wins. Sure, it's arbitrary, but at least it's consistent—something like parental discipline.
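The tiebreaker order can be sketched in Python as a single tuple comparison: PVN first, then timestamp, then the GUID of the originating domain controller. This is an illustration only. The `Stamp` type and sample GUIDs are hypothetical, and comparing GUIDs as Python `UUID` values is an assumption about ordering, not a statement of how the directory engine compares them.

```python
from dataclasses import dataclass
from datetime import datetime
from uuid import UUID


@dataclass(frozen=True)
class Stamp:
    pvn: int                 # property version number
    timestamp: datetime      # time applied by the originating domain controller
    originating_guid: UUID   # GUID of the originating domain controller
    value: str


def winner(a: Stamp, b: Stamp) -> Stamp:
    """Pick the update to retain: higher PVN, then later timestamp, then higher GUID."""
    return max(a, b, key=lambda s: (s.pvn, s.timestamp, s.originating_guid))


guid_low = UUID("00000000-0000-0000-0000-000000000001")
guid_high = UUID("00000000-0000-0000-0000-000000000002")
t0 = datetime(2002, 2, 2, 12, 14, 31)
t1 = datetime(2002, 2, 2, 12, 14, 35)

# Same PVN, different save times: the later update is retained.
help_desk = Stamp(2, t0, guid_high, "Help Desk value")
sysadmin = Stamp(2, t1, guid_low, "Administrator value")
print(winner(help_desk, sysadmin).value)

# Same PVN and same time: the higher GUID wins.
tied_a = Stamp(2, t0, guid_low, "from low GUID")
tied_b = Stamp(2, t0, guid_high, "from high GUID")
print(winner(tied_a, tied_b).value)
```

Because every domain controller applies the same ordering to the same metadata, they all retain the same value without having to coordinate.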
Identical Distinguished Names
Here's another situation. A new user joins the company. Your internal procedures require that an HR representative with special directory permissions create the user object in Active Directory and set the correct parameters. This particular user is a VIP, though, and someone expedites the process by calling an Operations administrator directly.
The HR representative and the Operations administrator both create a user object with the same name in the same container during the same replication interval. Objects cannot have the same distinguished names in Active Directory. When the new objects begin to replicate around the network, the other domain controllers are faced with a decision. Which object should be retained and which should be discarded?
There's a possibility that information about a user could be lost if one object simply overwrites another, so both objects must be retained. One object is renamed to give it a different distinguished name. Each domain controller must rename the same object the same way so that the directory replicas remain consistent.
The tiebreaker once again is timestamp followed by GUID. If the objects were created at different times, the domain controllers retain the name on the object with the later timestamp. The object with the earlier timestamp is renamed. If the timestamps match, the object created on the server with the higher GUID retains its name.
The losing object gets a new name using this process:
The resultant name would look like this:
The only warning you get about this action is an entry in the Event log. When you discover the problem, you must examine both objects to decide which one to keep and which to delete.
Here's another situation that can cause a collision. You make a change to an object's property while, at the same time, another administrator on another domain controller moves the object to a different container. When the property update arrives at the domain controller where the object was moved, the directory engine has a dilemma. The distinguished name of the object has changed, but the DN associated with the property for that object has not.
Active Directory resolves this problem easily because the GUID of the object does not change when the DN changes. The directory just does a lookup for the object GUID, finds the new DN, and updates the property in the correct location.
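A toy Python model shows why keying on the GUID makes this work. The `Directory` class, its methods, and the sample names are hypothetical and bear no relation to the real database layout.

```python
from uuid import UUID


class Directory:
    """Toy replica that indexes objects by GUID rather than by DN."""

    def __init__(self):
        self.dn_by_guid = {}
        self.attrs_by_guid = {}

    def add(self, guid: UUID, dn: str, **attrs) -> None:
        self.dn_by_guid[guid] = dn
        self.attrs_by_guid[guid] = dict(attrs)

    def move(self, guid: UUID, new_dn: str) -> None:
        # A move changes the DN; the GUID stays the same.
        self.dn_by_guid[guid] = new_dn

    def apply_update(self, guid: UUID, attribute: str, value: str) -> str:
        # Look the object up by GUID, so a stale DN in the sender's view does not matter.
        self.attrs_by_guid[guid][attribute] = value
        return self.dn_by_guid[guid]


replica = Directory()
user_guid = UUID("11111111-2222-3333-4444-555555555555")
replica.add(user_guid, "cn=Al Bondigas,ou=Sales,dc=company,dc=com", title="Clerk")

# Another administrator moves the object on this replica...
replica.move(user_guid, "cn=Al Bondigas,ou=Marketing,dc=company,dc=com")

# ...and then a property update made against the old DN arrives.
landed_at = replica.apply_update(user_guid, "title", "Manager")
print(landed_at)
print(replica.attrs_by_guid[user_guid]["title"])
```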
The next collision situation could occur in this fashion. A Facilities staffer makes a change to the Telephone attribute for a user. At the same time, the user's manager fires the user and insists that a network administrator delete the user's object. (In a true production environment, you would disable a user account for a period of time rather than delete it. You don't want to lose the user's Security Identifier (SID) until you're sure the user will never return.)
Recall that when an object is deleted, the object is moved to the Deleted Objects container. It is also stripped of all but a few properties. If an update for one of those stripped properties arrives after the object has been deleted, the system ignores the update.
For instance, let's say that a friend of yours, Andy, calls you up to say that he's been promoted to manager of the company's branch office. "Do what you folks do to get me their computer stuff in the branch office, okay?" he says to you. What he wants is to be put in the Branch OU so he can get their desktop configuration and group policies. You're happy to do it and you move the object.
At the same time, the CEO walks into another administrator's office and says, "We're destaffing the branch office. Do whatever you computer people do to remove their network access. They won't be back." The administrator obliges the CEO by deleting the entire Branch OU.
An object move changes several properties of an object, including the object's distinguished name. If the updates to Andy's object arrive at a domain controller after the Branch OU has been deleted, Active Directory has a problem. Ordinarily, the system would move Andy's object to the target OU in its new location, but in this case that is the Deleted Objects container and the system doesn't know if you really wanted to delete the object. So instead, it moves the object to a hidden container called Lost & Found.
Unlike the Deleted Objects container, Lost & Found can be viewed from the AD Users and Computers console by enabling the Advanced view options. The user object remains enabled so the user can still log on. You may be unaware that there's a problem unless the user complains that the desktop doesn't show the expected result from inheriting group policies.
Windows Time Service
Proper time synchronization is critical for railroads, airports, blind dates, and Active Directory domains. Proper collision management requires consistent timestamps, so it's important that domain controller clocks stay in sync with each other. Also, the Kerberos authentication system uses timestamps to ensure that bad guys don't hijack authentication tickets and replay them at a later time.
The Windows Time Service (WTS) uses a standard implementation of the Simple Network Time Protocol (SNTP) as promulgated in RFC 2030, "Simple Network Time Protocol (SNTP) Version 4 for IPv4, IPv6 and OSI." Every Windows 2000 and Windows Server 2003 domain controller is an SNTP time server.
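To give a feel for what an SNTP exchange looks like on the wire, here is a minimal Python sketch based on the RFC 2030 packet layout. It is not how the Windows Time Service is implemented. The function names are hypothetical, the `query` helper needs UDP port 123 to be reachable and is defined but not called here, and the sketch ignores round-trip delay correction.

```python
import socket
import struct
from datetime import datetime, timezone

NTP_UNIX_OFFSET = 2_208_988_800  # seconds from 1900-01-01 to 1970-01-01


def build_request(version: int = 4) -> bytes:
    # First byte packs LI (2 bits, 0), VN (3 bits), and Mode (3 bits, 3 = client).
    first = (0 << 6) | (version << 3) | 3
    return bytes([first]) + bytes(47)  # rest of the 48-byte header is zero


def parse_transmit_time(packet: bytes) -> datetime:
    # The Transmit Timestamp occupies bytes 40-47: 32-bit seconds, 32-bit fraction.
    if len(packet) < 48:
        raise ValueError("SNTP packet must be at least 48 bytes")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    unix_time = seconds - NTP_UNIX_OFFSET + fraction / 2**32
    return datetime.fromtimestamp(unix_time, tz=timezone.utc)


def query(server: str, timeout: float = 5.0) -> datetime:
    # Sends one request to UDP port 123 and returns the server's transmit time.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        reply, _ = sock.recvfrom(512)
    return parse_transmit_time(reply)


# Offline check with a synthetic reply whose transmit time is the Unix epoch.
fake_reply = bytes(40) + struct.pack("!II", NTP_UNIX_OFFSET, 0)
print(parse_transmit_time(fake_reply))
```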
SNTP uses a hierarchical approach to distributing time updates. The PDC Emulator acts as the time standard for a domain. (See Chapter 8, "Designing Windows Server 2003 Domains," for a description of Flexible Single Master Operations (FSMO) roles such as the PDC Emulator.) In a forest, the PDC Emulators in each domain sync their clocks with the PDC Emulator in the root domain. The root domain is the first domain created in the forest.
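The hierarchy can be summarized as a small lookup, sketched here in Python. The `Role` enum, the `time_source` function, and the child domain name are hypothetical. The member-computer case reflects the default behavior of syncing with the logon server, and the root PDC Emulator case assumes an outside time standard has been configured.

```python
from enum import Enum


class Role(Enum):
    MEMBER = "member computer"
    DOMAIN_CONTROLLER = "domain controller"
    PDC_EMULATOR = "PDC Emulator"


def time_source(role: Role, domain: str, root_domain: str) -> str:
    """Describe where a machine with this role looks for its time."""
    if role is Role.MEMBER:
        return f"logon server in {domain}"
    if role is Role.DOMAIN_CONTROLLER:
        return f"PDC Emulator of {domain}"
    if domain != root_domain:
        return f"PDC Emulator of {root_domain}"
    # Top of the hierarchy: assumed to point at an outside standard.
    return "external time standard, if one is configured"


root = "company.com"
child = "branch.company.com"  # hypothetical child domain

print(time_source(Role.MEMBER, child, root))
print(time_source(Role.DOMAIN_CONTROLLER, child, root))
print(time_source(Role.PDC_EMULATOR, child, root))
print(time_source(Role.PDC_EMULATOR, root, root))
```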
NET TIME and W32TM
For example, if you want to know the time source for a particular client, type
w32tm -source -v
This prints out a verbose listing that shows how WTS on the server discovered its time sources and what sources it discovered. By default, modern Windows clients use a domain controller for a time standard.
You can also use net time /querysntp to show the time source for a client, but this will show time.windows.net, a time server on the Internet maintained by Microsoft. Member computers ignore this SNTP time source entry in favor of their logon server. Domain controllers ignore this SNTP time source entry in favor of the PDC Emulator in their domain. The PDC Emulators in each domain look to the PDC Emulator in the root domain.
net time /setsntp:<time_standard>
The most commonly used time servers in the U.S. are maintained by the National Institute of Standards and Technology (NIST). Go to www.boulder.nist.gov/timefreq/service/time-servers.html for a list of the servers and their IP addresses. You'll need to open UDP port 123 through your firewall to use SNTP.
You can use this same procedure if you have a standalone server or desktop at home that you want to keep synchronized with an Internet time standard. If you want to resynchronize a client, type
This does a one-time resync to the NTP time standard for the client. You can also type
net time /domain:<domain_name> /set
Both of these options require you to have Change Local Time permission at the client. You can verify that you have this permission using the WHOAMI utility from the Resource Kit. Enter whoami /all to see your permissions and the groups you belong to by name and SID. Ordinary users are not granted permissions to change the local system time.
If you have laptop users who experience time synchronization problems because they are off the network for long periods of time, you can give them permission to change the local time with a group policy. The policy is located in Computer Configuration | Windows Settings | Security Settings | Local Policies | User Rights Assignment | Change The System Time. Create a security group for your laptop users and add that group to the policy.
You should not manually set the time on a domain controller. If you do, it puts incorrect timestamps on the updates it makes to Active Directory properties until it syncs again with the PDC Emulator. This can cause integrity problems when the objects are replicated. Always change time by synchronizing the domain controller with the time standard server. The easiest way to do this is by using the NET TIME command as follows:
net time \\local_computer_name /set
The NET TIME command cannot be used to connect to a non-Windows server because it uses SMB (Server Message Block) to handle the transaction. If you want to set a server to an outside time source, use the following syntax (the example shows the time standard server from the National Institute of Standards and Technology):
net time /setsntp:nist1.datum.com
The net time /setsntp option updates the Registry with the names or IP addresses of SNTP time servers. The target time server is checked when the server boots. If you use net time or net time /set, the /setsntp switch has no effect.