Managing Peer and Path Availability

In any IPSec VPN design, it is important to ensure that both IPSec peers are available to one another, reachable through the IP-enabled infrastructure. An IPSec peering point can be thought of as a tunnel termination point. Each IPSec VPN tunnel endpoint must know how to source the IPSec VPN tunnel locally and how to reach the target termination point (its IPSec peer) from the tunnel source in order to successfully negotiate a Phase 2 SA. This requirement is referred to as peer availability. It is also important to ensure that when peers are unavailable to one another, the SADB is managed properly so that IPSec VPN tunnels are allowed to reconverge in HA environments.


Improper management of peer availability can lead to the existence of stale SAs, prohibiting rapid reconvergence of IPSec VPNs in failover scenarios. The impact of stale SAs on IPSec HA designs is discussed in detail in Chapter 6, "Site-to-Site Local HA Solutions," and Chapter 7, "Site-to-Site Geographic HA Solutions."

Once peer availability has been addressed, it is important to make sure that the encrypted path (the IPSec VPN tunnel) is available to the traffic that must traverse it. In other words, traffic to be encrypted must be routable through the IPSec VPN tunnel. This is referred to as crypto path availability. There are several common instances in which path availability may be impeded, the most common of which is when routing protocol traffic cannot be leaked between cleartext- and ciphertext-routed domains. This scenario effectively blocks the path for traffic that should be encrypted: neither cleartext domain knows how to route to the other, because RP updates are not natively exchanged across the IPSec VPN tunnel or ciphertext-routed domain. We will discuss this particular example in greater detail later in this section.

Peer Availability

In this section, we will explore two features of IPSec and ISAKMP that can be used to manage peer availability: Dead Peer Detection (DPD) and IKE keepalives. Consider the scenario illustrated in Figure, in which the primary peering point of an IPSec VPN gateway becomes unavailable.

Peer Availability with DPD and IKE Keepalives

When an IPSec peer goes offline, the SADB retains the security associations for that peer for the length of the lifetimes negotiated during Phase 1 and 2 SA negotiations. For example, if two IPSec VPN gateways establish IKE and IPSec SAs using the default lifetimes and the IPSec tunnel were to fail, stale IKE and IPSec SAs could remain present in the SADB for up to 86400 seconds (1 day) and 3600 seconds (1 hour), respectively. This behavior, if not managed properly, can lead to confusion when negotiating new SAs with new peers (see Chapter 6, "Site-to-Site Local HA Solutions," and Chapter 7, "Site-to-Site Geographic HA Solutions," for more detail on IPSec High Availability) and inefficient use of memory on the IPSec gateway itself.
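To make the arithmetic above concrete, the following minimal Python sketch computes how long an SA would linger after a peer failure if nothing reaps it early. The SecurityAssociation class and its field names are hypothetical and exist only for illustration; the lifetimes are the defaults cited above.

```python
from dataclasses import dataclass

# Default lifetimes cited in the text, in seconds.
IKE_SA_LIFETIME = 86400    # Phase 1 (IKE) SA: 1 day
IPSEC_SA_LIFETIME = 3600   # Phase 2 (IPSec) SA: 1 hour


@dataclass
class SecurityAssociation:
    """Hypothetical SADB entry; the field names are illustrative only."""
    kind: str            # "IKE" or "IPSec"
    lifetime: int        # negotiated lifetime, in seconds
    established_at: int  # time the SA was negotiated, in seconds

    def stale_seconds(self, peer_failed_at: int) -> int:
        """Seconds the SA lingers after the peer fails if nothing removes it early."""
        expires_at = self.established_at + self.lifetime
        return max(0, expires_at - peer_failed_at)


if __name__ == "__main__":
    ike = SecurityAssociation("IKE", IKE_SA_LIFETIME, established_at=0)
    ipsec = SecurityAssociation("IPSec", IPSEC_SA_LIFETIME, established_at=0)
    # Assume the peer fails 10 minutes after both SAs were negotiated.
    for sa in (ike, ipsec):
        print(sa.kind, sa.stale_seconds(peer_failed_at=600))
```

The worst case occurs when the peer fails immediately after negotiation, in which case the stale period equals the full lifetime.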

Although these SAs would remain in the SADB of the remote IPSec gateway for an excessively long time, there are methods of reaping stale SAs from the SADB more quickly. The default behavior of Cisco IOS is to use on-demand Dead Peer Detection. On-demand DPD expedites the discovery and removal of stale SAs by sending out DPD status query messages when the status of the remote peer is questionable and there is traffic to be forwarded in the crypto path to that peer. Consider the scenario illustrated in Figure, in which IPSec_B1 fails. IPSec_A uses on-demand DPD to expedite the discovery and removal of SAs with the dead peer, IPSec_B1, using the following steps:

1. A failure occurs on IPSec_B1, preventing it from successfully terminating the IPSec VPN tunnel from IPSec_A.

2. IPSec_A receives traffic from the workstations at Site_A and must forward the traffic across the encrypted path to Site_B through its IPSec VPN tunnel to IPSec_B1.

3. IPSec_A forwards a DPD status query message to IPSec_B1 in order to determine the status of the peer.

4. IPSec_B1 is unable to respond to the DPD status query message sent by IPSec_A. Steps 3 and 4 are retried a number of times, corresponding to the specified retry value.

5. IPSec_A waits to receive a response to its DPD status query message sent in Step 3. When it receives no response from IPSec_B1, IPSec_A begins to purge its SADB of the security associations previously negotiated with IPSec_B1.
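The steps above can be sketched as a small Python model. This is a simplification under stated assumptions, not a description of the Cisco IOS implementation: the sadb dictionary, the peer_responds callback, and the max_queries parameter (the total number of queries sent before giving up) are all hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List


def on_demand_dpd(
    sadb: Dict[str, List[str]],
    peer: str,
    traffic_pending: bool,
    peer_responds: Callable[[str], bool],
    max_queries: int = 3,
) -> str:
    """Model one pass of on-demand DPD for a single peer.

    sadb maps a peer name to the SAs held for it (hypothetical structure).
    peer_responds stands in for sending a DPD query and awaiting the reply.
    """
    if peer not in sadb:
        return "no-sa"
    # Step 2: without traffic destined for the peer, no query is sent at all.
    if not traffic_pending:
        return "idle"
    # Steps 3 and 4: query the peer, repeating until the query budget is spent.
    for _ in range(max_queries):
        if peer_responds(peer):
            return "alive"
    # Step 5: no query was answered, so purge the SAs held for this peer.
    del sadb[peer]
    return "purged"


if __name__ == "__main__":
    sadb = {"IPSec_B1": ["IKE SA", "IPSec SA"]}
    never_answers = lambda peer: False  # IPSec_B1 has failed
    print(on_demand_dpd(sadb, "IPSec_B1", False, never_answers))  # no traffic
    print(on_demand_dpd(sadb, "IPSec_B1", True, never_answers))   # traffic present
    print(sadb)
```

Note that the first call leaves the stale SAs in place: with no traffic to forward, the dead peer is never queried.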

One important element of the DPD process described above is the fact that with on-demand DPD, traffic must be forwarded to the remote peer to initiate the discovery of the dead peer (Step 2). If traffic is not sent to that peer, then the DPD status query message will not be forwarded to the remote peer, and the tunnel SAs will remain installed in the local gateway's SADB for the duration of their previously negotiated lifetimes. This could potentially cause issues with reconvergence if, for example, a redundant IPSec tunnel were to be negotiated over the redundant path between Site A and Site B. One way of eliminating stale SAs in such a scenario is to enable the use of IKE keepalives, which provision for the removal of stale SAs from the SADB regardless of whether or not traffic is actively being forwarded to the dead peer. The process of SA removal with IKE keepalives, also depicted in Figure, operates in the following manner:

1. The primary IPSec gateway for Site B goes offline.

2. The IPSec gateway for Site A forwards its first IKE keepalive to the dead peer at Site B.

3. Site A's IPSec gateway does not receive a response to the first IKE keepalive within the configured keepalive interval (default of 10 seconds). Site A's IPSec gateway forwards a second IKE keepalive to the dead peer at Site B.

4. Site A's IPSec gateway does not receive a response to the second IKE keepalive within the configured keepalive interval. Site A's IPSec gateway forwards a third IKE keepalive to the dead peer at Site B.

5. After Site A has waited the duration of the keepalive interval without receiving a response to its third IKE keepalive, it declares that its peer at Site B is dead and removes the security associations with Site B's IPSec gateway from its own local SADB.
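The timing implied by this sequence can be worked out with a short Python sketch. It assumes the three-keepalive sequence and the 10-second default interval described above; the function name and event labels are hypothetical.

```python
from typing import List, Tuple


def keepalive_timeline(
    first_sent_at: int = 0,
    interval: int = 10,
    keepalives: int = 3,
) -> List[Tuple[int, str]]:
    """Return (time, event) pairs for a peer that never answers a keepalive."""
    events = []
    t = first_sent_at
    for n in range(1, keepalives + 1):
        events.append((t, f"send IKE keepalive {n}"))
        t += interval  # wait one full interval for a response
    events.append((t, "declare peer dead and remove its SAs"))
    return events


if __name__ == "__main__":
    for when, what in keepalive_timeline():
        print(f"t={when:>2}s  {what}")
```

Under these assumptions the peer is declared dead 30 seconds after the first unanswered keepalive is sent, compared with up to 86400 seconds if the IKE SA were simply left to expire.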

The two concepts described above are critically important when designing large-scale enterprise IPSec deployments. On-demand DPD and periodic DPD provide two different methods of proactively detecting stale SAs left over from dead IPSec peering sessions. On-demand DPD requires traffic along the encrypted path to initiate DPD status queries and consumes less overhead than periodic DPD (IKE keepalives), while periodic DPD can proactively detect dead peers without the presence of traffic in the encrypted path. Improper management of the IPSec SADB can lead to operational issues in HA environments and platform scalability issues specifically relating to excessive CPU utilization on platforms managing DPD for a large number of SAs. We will discuss the use of DPD to streamline the reconvergence process of an IPSec tunnel in our discussion of IPSec High Availability in Chapter 6, "Site-to-Site Local HA Solutions," and Chapter 7, "Site-to-Site Geographic HA Solutions."

Path Availability

As we have already mentioned in our discussion of the IPSec protocol in Chapter 3, "Basic VPN Topologies and Configuration," IPSec will not encrypt multicast traffic without the use of a prior instantiation of encapsulation using another tunneling protocol such as GRE. In many designs, it may be desirable to separate routing protocol domains while still joining routed domains between the two opposite ends of the IPSec tunnel, as shown in Figure.

IPSec and Multicast RP Updates

In situations where RRI is not an option, this may require the exchange of RP updates across the VPN tunnel. Because most RPs use updates that are forwarded to multicast addresses, RP updates often cannot be exchanged across an IPSec tunnel without encapsulating them in GRE prior to encryption. As we have discussed before, this is commonly referred to as IPSec+GRE. IPSec+GRE tunnels are not only useful in designs where encrypted RP updates are required, but they are critically useful in networks where encrypted multicast application data is required.
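As a rough illustration of why many RP updates need GRE, the following Python sketch classifies update destinations using the standard ipaddress module. The destination addresses listed are the well-known IPv4 multicast groups for each protocol; the needs_gre helper is a hypothetical name, and the unicast neighbor address is an example value from the documentation range.

```python
import ipaddress

# Well-known IPv4 multicast destinations for routing protocol updates.
RP_UPDATE_DESTINATIONS = {
    "OSPF": "224.0.0.5",    # AllSPFRouters
    "RIPv2": "224.0.0.9",
    "EIGRP": "224.0.0.10",
}


def needs_gre(destination: str) -> bool:
    """True if the destination is multicast and so must be GRE-encapsulated
    before encryption, per the constraint described in the text."""
    return ipaddress.ip_address(destination).is_multicast


if __name__ == "__main__":
    for protocol, destination in RP_UPDATE_DESTINATIONS.items():
        print(protocol, destination, needs_gre(destination))
    # A statically defined unicast neighbor (example address) needs no GRE.
    print("unicast neighbor", "192.0.2.1", needs_gre("192.0.2.1"))
```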

One alternative to IPSec+GRE for encrypted RP update exchange using IPSec is to configure the RP to use unicast updates rather than multicast updates. Configuring unicast neighbors in RP processes comes with added configuration, management, and scalability issues. However, in simpler site-to-site IPSec VPN environments, unicast updates can be configured with a minimal increase in management and administration while eliminating the overhead added by GRE headers in an IPSec+GRE scenario.
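The GRE overhead mentioned above can be estimated with a little arithmetic. The sketch below assumes basic GRE over IPv4 with no optional GRE fields (a 4-byte GRE header plus a 20-byte outer IPv4 header); it does not account for the overhead that IPSec itself adds, and the function name is hypothetical.

```python
# Assumed sizes for basic GRE over IPv4 with no optional GRE fields.
GRE_HEADER_BYTES = 4
OUTER_IPV4_HEADER_BYTES = 20
GRE_OVERHEAD_BYTES = GRE_HEADER_BYTES + OUTER_IPV4_HEADER_BYTES


def gre_overhead_ratio(packet_bytes: int) -> float:
    """Bytes added by GRE encapsulation relative to the original packet size."""
    if packet_bytes <= 0:
        raise ValueError("packet_bytes must be positive")
    return GRE_OVERHEAD_BYTES / packet_bytes


if __name__ == "__main__":
    for size in (64, 576, 1400):
        print(f"{size}-byte packet: +{GRE_OVERHEAD_BYTES} bytes "
              f"({gre_overhead_ratio(size):.1%})")
```

The relative cost is highest for small packets, which is one reason unicast updates can be attractive in simple site-to-site designs.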

Unlike most routing protocols, Border Gateway Protocol (BGP) natively uses unicast updates by establishing a TCP session on port 179. Therefore, RP updates using BGP can be sent through the crypto switching path without the use of GRE prior to encryption. Encrypted BGP updates are useful in designs requiring inter-AS connectivity between disparate IGP RP domains. Like unicast RP updates, BGP updates eliminate the packet overhead and encapsulation inherent to IPSec+GRE designs. This is the fourth method of preserving RP information between two routed domains on opposite ends of an IPSec VPN tunnel that we've discussed in this chapter. The complete list of methods is as follows:

  • Reverse Route Injection (including static routes)

  • Multicast routing updates encapsulated in GRE

  • Unicast RP updates using unicast IGP neighbor definition

  • Unicast RP updates with BGP

Depending on the crypto platform selected, packets encapsulated in GRE may not be forwarded in hardware, regardless of whether or not some method of crypto hardware assist is present on the IPSec VPN gateway. It is therefore critically important to evaluate all options of preserving RP consistency across the VPN tunnel so as not to force traffic outside of the fast crypto switching path by unnecessarily using GRE on a platform that does not support the encapsulation of GRE packets in hardware.
