April 13, 2011, 7 a.m.
posted by vdv
Functional Description of the Windows Server 2003 Boot Process
Computers are like airplanes: they are most prone to accidents during takeoffs and landings. This topic contains a detailed analysis of the Windows Server 2003 boot process for use in troubleshooting upgrade and setup problems.
What follows is the IA32 boot process. The IA64 boot process is much simpler because EFI locates and launches the secondary bootstrap loader, Ia64ldr.efi, using information stored in Non-Volatile RAM (NVRAM). If you have an IA64 system, skip to the "System Kernel" topic later in this chapter.
When you turn on any IA32 computer, the system starts out with a Power On Self Test (POST). Specific actions during the POST vary from system to system. A short beep at the end indicates a successful completion. For IA64 computers, the POST includes a full system configuration check done by the Extensible Firmware Interface (EFI).
The final step in the POST for IA32 machines is a handoff to an INT13 routine in BIOS that checks for a bootable device. For IA64 machines, the EFI finds the bootstrap loader using information in the boot menu.
Depending on your BIOS settings, the boot routine usually starts at the A: drive followed by the drive determined by the system BIOS to be the boot drive. This is the master drive on the primary IDE interface, if one exists, followed by the SCSI drive designated as the boot drive in the SCSI BIOS. You cannot boot to a SCSI drive if a bootable IDE drive is present. The system loads the Master Boot Record (MBR) from this drive.
There is executable code in the MBR that is just smart enough to scan the partition table at the end of the MBR to find the sector/offset of the active boot partition. If the code cannot find suitable entries in the partition table, it displays the error Invalid Partition Table. If the code finds a partition table but cannot locate the start of the active partition, it displays either the error Error Loading Operating System or the error Missing Operating System.
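The partition table scan described above can be sketched in a few lines of Python. This is a simplified illustration, not the actual MBR code: it assumes the conventional MBR layout (four 16-byte entries starting at byte 446, an active flag of 0x80, and a 0x55AA marker in the last two bytes), and the function names are hypothetical.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # four 16-byte entries sit just before the end marker
ENTRY_SIZE = 16
ACTIVE_FLAG = 0x80             # boot indicator value for the active partition


def find_active_partition(mbr: bytes):
    """Return (entry_index, start_sector, sector_count) for the active partition."""
    if len(mbr) != SECTOR_SIZE or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        entry = mbr[offset:offset + ENTRY_SIZE]
        boot_flag = entry[0]
        partition_type = entry[4]
        # Starting sector and length are little-endian 32-bit values at bytes 8..15.
        start_sector, sector_count = struct.unpack_from("<II", entry, 8)
        if boot_flag == ACTIVE_FLAG and partition_type != 0:
            return index, start_sector, sector_count
    # Simplification: the real MBR code distinguishes several failure cases.
    raise LookupError("Invalid Partition Table")


def make_sample_mbr(active_index=1, start_sector=63, sector_count=204800):
    """Build an in-memory MBR with one active entry (made-up test data)."""
    mbr = bytearray(SECTOR_SIZE)
    offset = PARTITION_TABLE_OFFSET + active_index * ENTRY_SIZE
    mbr[offset] = ACTIVE_FLAG
    mbr[offset + 4] = 0x07     # arbitrary non-zero partition type
    struct.pack_into("<II", mbr, offset + 8, start_sector, sector_count)
    mbr[510:512] = b"\x55\xaa"
    return bytes(mbr)


print(find_active_partition(make_sample_mbr()))
```

The sketch works on a 512-byte buffer in memory, so it can be tried against a saved copy of a sector without touching a live disk.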
When the executable code in the MBR finds the start of the active partition, it loads the first sector (512 bytes) into memory at location 0x7C00. This is called the partition boot sector, or more commonly just the boot sector. The boot sector contains executable code designed to find and load a secondary bootstrap loader. The executable code in the boot sector cannot read a file system, so the secondary bootstrap loader must be at the root of the boot drive.
On a DOS machine, the secondary bootstrap loader is IO.SYS. On IA32 versions of Windows Server 2003 and XP, Windows 2000, and NT, the secondary bootstrap loader is Ntldr. If Ntldr is missing or will not load, the boot sector code displays error messages such as A disk read error has occurred or Ntldr is missing or Ntldr is compressed. In all these events, the message includes the instructions Press Ctrl+Alt+Del to restart.
When Ntldr executes, it initializes the video hardware and puts the screen in 80x25 mode with a black background. It then switches the processor to Protected mode to support 32-bit memory addressing and initializes miniature versions of the NTFS and FAT file system drivers contained in the Ntldr code itself. These file system drivers permit Ntldr to see enough of the drive to load the remaining Windows Server 2003 system files.
Ntldr now locates a file called Boot.ini, which contains the Windows Server 2003 boot menu. See the "Working with Boot.ini" section later in this chapter for details about the entries in Boot.ini. If Ntldr cannot find Boot.ini, it displays the error Windows Server 2003 could not start because the following file is missing or corrupt: Boot.ini. If Boot.ini is present but does not contain valid entries, any number of errors can appear. Generally, they indicate a problem with the default ARC path.
If there is only one entry in the Boot.ini file, Windows Server 2003 automatically uses that entry with no delay time. If there are multiple entries, the delay time is set to 30 seconds by default.
If you install Windows Server 2003 in addition to an existing operating system such as Windows 9x or DOS, the boot sector from the previous operating system is saved in a file called Bootsect.dos. The alternate operating system becomes a selection in the boot menu.
If you select the alternate operating system from the boot menu, Ntldr loads the contents of Bootsect.dos into memory at location 0x7C00, shifts back to Real mode, and turns control over to the executable code in the boot sector image.
You can quickly change the boot menu delay time in both IA32 and IA64 systems using the new BOOTCFG utility in Windows Server 2003. The syntax is bootcfg /timeout # where # is the number of seconds. You can set the delay time to 0 to avoid the menu entirely.
On IA32 systems, you can manually edit the Boot.ini file to set the timeout value to -1 to disable the countdown timer completely. In this configuration, the system will sit at the boot menu until you make a selection. You can do the same on IA64 systems using the Boot Manager Maintenance utility.
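The boot menu timing rules described above can be summarized in a short Python sketch. The Boot.ini text is a made-up sample in the usual two-section layout, the function names are hypothetical, and the logic only mirrors the behavior stated in this section; it is not how Ntldr itself is written.

```python
SAMPLE_BOOT_INI = r"""
[boot loader]
timeout=30
default=multi(0)disk(0)rdisk(0)partition(2)\Windows
[operating systems]
multi(0)disk(0)rdisk(0)partition(2)\Windows="Windows Server 2003" /fastdetect
C:\="MS-DOS"
"""


def parse_boot_ini(text):
    """Split Boot.ini text into {section name: [(key, value), ...]}."""
    sections = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(";"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1].lower()
            sections[current] = []
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            sections[current].append((key.strip(), value.strip()))
    return sections


def menu_behavior(text):
    """Describe what the boot menu does for the given Boot.ini text."""
    sections = parse_boot_ini(text)
    loader = {key.lower(): value for key, value in sections.get("boot loader", [])}
    entries = sections.get("operating systems", [])
    timeout = int(loader.get("timeout", 30))   # 30 seconds is the stated default
    if len(entries) <= 1 or timeout == 0:
        return "boot default immediately"
    if timeout < 0:
        return "wait for a selection"
    return f"count down {timeout} seconds"


print(menu_behavior(SAMPLE_BOOT_INI))
```

Changing timeout=30 to timeout=0 or timeout=-1 in the sample exercises the other two branches.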
Before Ntldr can execute the kernel image, it needs to know something about the hardware. This is the cue for Ntdetect.com, which gathers the same kind of information on an IA32 machine that EFI delivers from firmware on an IA64 machine.
Don't confuse Ntdetect with the Plug and Play Manager. PnP enumeration happens much further along in the boot process. Ntdetect looks for the following hardware configurations: (The CPU type and FPU type are detected later on by Ntoskrnl.exe and Hal.dll.)
If you select a Windows Server 2003 operating system from the boot menu, the secondary bootstrap loader (Ntldr for IA32 and Ia64ldr for IA64) uses that partition to load the operating system kernel, Ntoskrnl.exe, and its associated Hardware Abstraction Layer library, Hal.dll, along with a video driver, Bootvid.dll. The secondary bootstrap loader puts the images of the kernel files in memory but does not execute them quite yet. First, it searches out and loads the service drivers.
If the secondary bootstrap loader does not find Ntoskrnl.exe, it gives the error Windows Server 2003 could not start because the following file is missing or corrupt: \<systemroot>\system32\Ntoskrnl.exe. The most likely cause of this error is an incorrect path in the boot menu entry, although corruption of the file system on the volume can also be the culprit.
Initial Service Drivers
The secondary bootstrap loader opens the System hive in the Registry and checks the Select key to find the CurrentControlSet. It then scans the list of Services keys in the CurrentControlSet looking for devices with a Start value of 0, indicating Service_Boot_Start, and 1, indicating Service_System_Start. It loads these drivers in the order specified by the Group value under Control, Service Group Order.
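The selection and ordering rule above can be illustrated with a small Python sketch. The driver names, group names, and dictionary layout are invented for the example and do not come from a real Registry; the only parts taken from the text are the Start values 0 and 1 and the ordering by group.

```python
SERVICE_BOOT_START = 0
SERVICE_SYSTEM_START = 1


def drivers_to_load(services, group_order):
    """Return driver names with Start 0 or 1, ordered by their group's position."""
    rank = {name.lower(): position for position, name in enumerate(group_order)}
    selected = [
        (name, config)
        for name, config in services.items()
        if config.get("Start") in (SERVICE_BOOT_START, SERVICE_SYSTEM_START)
    ]
    # Drivers whose group is not listed sort after every listed group.
    selected.sort(
        key=lambda item: rank.get(item[1].get("Group", "").lower(), len(rank))
    )
    return [name for name, _ in selected]


# Hypothetical stand-ins for the Services keys and the Service Group Order list.
sample_services = {
    "netdrv": {"Start": 3, "Group": "Network"},
    "fsdrv": {"Start": 1, "Group": "File System"},
    "diskdrv": {"Start": 0, "Group": "Disk"},
    "busdrv": {"Start": 0, "Group": "Bus Extender"},
}
sample_group_order = ["Bus Extender", "Disk", "File System", "Network"]

print(drivers_to_load(sample_services, sample_group_order))
```

In this sample, netdrv is skipped because its Start value is neither 0 nor 1, and the remaining drivers come out in group order rather than in the order the keys were listed.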
At this point, the console shows a Starting Windows Server 2003 message along the bottom of the screen with a progress bar that slides along as the drivers load. When this is complete, the secondary bootstrap loader initializes Ntoskrnl.exe and hands over the driver images now stored in memory.
When Ntoskrnl starts, it initializes Hal.dll and Bootvid.dll. The screen now shifts to graphic mode. Ntoskrnl then initializes the system drivers and uses information from Ntdetect.com (IA32) or EFI (IA64) to create a volatile Hardware hive in the Registry. It then calls on the Session Manager, Smss.exe, to do a little preliminary housekeeping.
Session Manager reads its own key in the System hive under HKLM | System | CurrentControlSet | Control | Session Manager to find entries under BootExecute. By default, this includes AUTOCHK, a boot-time version of CHKDSK. Session Manager also sets up the paging file, Pagefile.sys. After Session Manager finishes its chores, it does the following two things simultaneously:
In NT, users were often mystified by the long delay between entering their credentials at the logon window and getting to the desktop. In Windows 2000, Microsoft withheld the logon window until the service drivers initialized. In Windows Server 2003, to speed up access to the console, the user is permitted to log on even though many of the services are still being initialized. As long as the user credentials are correct, there's a happy ending. Bring up the violin music. Fade to credits.
If Screg cannot start a device or service, it takes action as defined in the associated Registry key. This runs the gamut from putting a simple message in the Event log to crashing the system with a stop error. If the problem is not catastrophic, Screg displays a console message telling you that a problem occurred and that you should check the Event log. This console message does not always appear, so you should always check the log when you start a server. Histories of abnormal starts should be investigated and the problem isolated and corrected.
Working with Boot.ini
The IA32 secondary bootstrap loader, Ntldr, relies on Boot.ini to locate the Windows Server 2003 boot partition and the folder containing the boot files. If you encounter strange behavior during restart following an upgrade, or when adding new mass storage hardware, start your investigations with a look at Boot.ini. For details of the contents of a boot menu entry in an IA64 machine, see Chapter 3, "Adding Hardware." Here is an example of a Boot.ini file for a system that also has a DOS partition:
The long, inscrutable entry under [operating systems] is the Advanced RISC Computing path (ARC path). The ARC path is a pointer that leads Ntldr to a Windows Server 2003 partition. ARC syntax uses this convention:
If the controller type is signature() and the drive is on a SCSI bus with an interface that does not have a BIOS, Ntldr uses the SCSI miniport driver in Ntbootdd.sys to read the drives and find the matching signature.
The disk() entry was used in conjunction with the scsi() controller type to indicate the SCSI ID of the disk where the boot files are located. It is no longer required in Windows Server 2003 because the signature() entry identifies the drive uniquely without the SCSI ID. For controller designations of multi(), the value for disk() is always 0.
rdisk()—Relative Disk Location
The rdisk() entry is used in conjunction with multi() to indicate the relative disk location. For IDE drives, the relative disk location is determined by the master/slave designation. For example, the slave drive on the first IDE controller would have an ARC designator of multi(0)disk(0)rdisk(1).
For SCSI drives, the relative disk location is determined by the device scan performed by the SCSI BIOS during POST. Generally, the scan order matches the SCSI ID order. If the BIOS reports that there are three drives on the bus, for example, the ARC designator for the third drive would be multi(0)disk(0)rdisk(2) regardless of the SCSI ID. If you select a drive other than 0 to be the boot drive, the relative disk locations would change.
For controller designations of signature(), the value for rdisk() is always 0 because the system uses the MBR signature to find the drive.
partition()—Boot Partition Sequence Number
This is the sequential number of the boot partition. Note that this sequence starts with 1, not 0. See the upcoming sidebar, "Automatic Partition Number Changes," for curious aspects to the numbering sequences.
\systemroot—Windows Server 2003 System Directory
The default system directory in Windows Server 2003 is \Windows. That path is used in all examples in this book. A different name can be chosen during Setup. The environment variable %systemroot% displays the name of the system directory. You should have only one system root directory on any given partition. Microsoft strongly discourages having multiple Windows installations in the same partition because there are files in locations other than %systemroot% that are maintained by the operating system.
"menu listing"—Menu Text
The text in quotation marks is displayed as the menu listing in the boot menu. You can change this text if it helps your users to navigate to the right entry; under normal circumstances, however, users couldn't care less and don't want to see the menu anyway.
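The ARC path elements described in the preceding topics can be pulled apart with a regular expression. The following Python sketch is only an illustration: the pattern, the function name, and the sample entry (including its /fastdetect switch) are assumptions made for the example, and the pattern accepts only the multi(), scsi(), and signature() controller types mentioned in this chapter.

```python
import re

ARC_PATTERN = re.compile(
    r"^(?P<controller>multi|scsi|signature)\((?P<controller_value>[0-9a-fA-F]+)\)"
    r"disk\((?P<disk>\d+)\)"
    r"rdisk\((?P<rdisk>\d+)\)"
    r"partition\((?P<partition>\d+)\)"
    r"(?P<systemroot>\\[^=]*)"
    r'(?:="(?P<menu_text>[^"]*)")?'
    r"(?P<switches>.*)$",
    re.IGNORECASE,
)


def parse_arc_entry(line):
    """Break one [operating systems] entry into its ARC path elements."""
    match = ARC_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not an ARC path entry: {line!r}")
    fields = match.groupdict()
    for name in ("disk", "rdisk", "partition"):
        fields[name] = int(fields[name])
    if fields["partition"] < 1:
        raise ValueError("partition() numbering starts at 1, not 0")
    fields["controller"] = fields["controller"].lower()
    fields["switches"] = fields["switches"].split()
    return fields


# Made-up entry: slave drive on the first IDE controller, second partition.
SAMPLE_ENTRY = r'multi(0)disk(0)rdisk(1)partition(2)\Windows="Windows Server 2003" /fastdetect'

for key, value in parse_arc_entry(SAMPLE_ENTRY).items():
    print(f"{key}: {value}")
```

A check like this can catch a partition(0) typo or a missing element before the machine is restarted, although it says nothing about whether the path points at a real drive.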
ARC Path Examples
Assume, for example, that you have a machine configured with several versions of Windows Server 2003 and NT along with a dual-boot to DOS. The drives and partitions are set up as shown in Figure.
Here are the ARC paths for each boot partition:
An additional set of Boot.ini switches is used for headless server operation and kernel-mode debugging. See Chapter 2, "Performing Upgrades and Automated Installations," for headless server operation and Chapter 21, "Recovering from System Failures," for information on kernel-mode debugging.
Special ARC Paths
Ntldr makes INT13 fixed disk function calls to find the boot drive and files at the root of the boot partition. There are a couple of situations where a standard ARC path is insufficient for Ntldr to use INT13 calls.
No Support for INT13 Extensions
The original INT13 specification assumed a certain set of Cylinder/Head/Sector (CHS) limitations. The CHS limitations have been revised as drive technology supported larger and larger capacities. The PC97 specification included a set of INT13 extensions to accommodate larger drive geometries. Windows 2000 and later support these INT13 extensions, so boot partitions can be larger than 7.8GB.
If you have a system that does not support the INT13 extensions and the boot partition is larger than 7.8GB or lies outside the 1024 cylinder limit, Ntldr needs a different way to find the boot drive.
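The 7.8GB figure follows from the geometry limits of the original INT13 interface. The short Python calculation below assumes the commonly cited maximums of 1024 cylinders, 255 heads, and 63 sectors per track with 512-byte sectors; only the 1024-cylinder limit is stated in this chapter, so treat the other values as assumptions. The function names are hypothetical.

```python
MAX_CYLINDERS = 1024       # cylinder limit mentioned in the text
MAX_HEADS = 255            # assumed: commonly cited usable head count
SECTORS_PER_TRACK = 63     # assumed: commonly cited sectors per track
BYTES_PER_SECTOR = 512


def chs_limit_bytes():
    """Largest byte offset reachable through the original CHS addressing."""
    return MAX_CYLINDERS * MAX_HEADS * SECTORS_PER_TRACK * BYTES_PER_SECTOR


def needs_int13_extensions(partition_end_byte):
    """True if any part of the partition lies beyond the CHS limit."""
    return partition_end_byte > chs_limit_bytes()


limit = chs_limit_bytes()
print(limit, "bytes, about", round(limit / 2**30, 1), "GB")
```

Under these assumptions the limit works out to 8,422,686,720 bytes, which is about 7.8GB when a gigabyte is counted as 2^30 bytes.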
No SCSI BIOS
A SCSI controller supports INT13 calls via special interfaces in the controller's BIOS. If the SCSI controller does not have a BIOS, or the BIOS has been disabled, Ntldr needs an alternate means of locating the boot drive. Classic NT solved this problem by loading the SCSI miniport driver packaged in Ntbootdd.sys. Ntldr used the miniport driver to scan the SCSI bus. The ARC path entries scsi()disk() told Ntldr the ordinal number of the SCSI controller and the SCSI ID of the boot disk.
Because Windows Server 2003 uses Plug and Play, it's possible that the ordinal number assigned to the SCSI controller could change. This would keep Ntldr from finding the boot drive using a classic ARC path. For that reason, Ntldr needs a more reliable means for identifying the boot drive than the SCSI ID.
When Ntldr sees a signature() entry in an ARC path, it scans the mass storage devices looking for a drive with an MBR that contains the signature. For IDE drives, it uses standard INT13 calls to do this scan. For SCSI drives, it loads the miniport driver in Ntbootdd.sys. The signature uniquely identifies the boot drive regardless of the SCSI ordinal number.
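The signature scan can be sketched in Python as a search across a list of MBR sectors. This is an illustration only: it assumes the disk signature is the 4-byte little-endian value at offset 0x1B8 of the MBR, it works on in-memory buffers rather than real drives, and the function names and sample signatures are invented.

```python
import struct

DISK_SIGNATURE_OFFSET = 0x1B8   # assumed location of the 4-byte disk signature


def read_signature(mbr):
    """Extract the disk signature from a 512-byte MBR buffer."""
    (signature,) = struct.unpack_from("<I", mbr, DISK_SIGNATURE_OFFSET)
    return signature


def find_drive_by_signature(mbrs, signature_hex):
    """Return the index of the first drive whose MBR carries the signature."""
    wanted = int(signature_hex, 16)
    for index, mbr in enumerate(mbrs):
        if len(mbr) >= 512 and read_signature(mbr) == wanted:
            return index
    return None   # corresponds to the case where the boot drive cannot be located


def make_mbr(signature):
    """Build a blank MBR buffer carrying the given signature (test data)."""
    mbr = bytearray(512)
    struct.pack_into("<I", mbr, DISK_SIGNATURE_OFFSET, signature)
    mbr[510:512] = b"\x55\xaa"
    return bytes(mbr)


sample_drives = [make_mbr(0x11111111), make_mbr(0x8B467C12), make_mbr(0x22222222)]
print(find_drive_by_signature(sample_drives, "8b467c12"))
```

Because the match depends only on the stored signature, the position of the drive in the list does not matter, which is the property that makes signature() robust against changes in controller numbering.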
The signature() method of identifying drives in Boot.ini has its drawbacks. If the disk signature is overwritten by a virus or the MBR is corrupted and must be overwritten, Ntldr cannot locate the drive and will give the error Windows Server 2003 could not start because of a computer disk hardware configuration problem.
If you encounter this kind of problem, you can correct it as shown in Procedure 1.4.