Multi-layer network recovery (restoration escalation)

Modern telecommunication networks generally contain multiple network layers, including the fiber layer, the optical channel layer, the TDM layer, the MPLS layer, and the IP layer. A network failure often occurs in the lowest layer, the fiber layer, e.g., a fiber cut. Such a failure directly affects the connectivity between a pair of nodes in the fiber layer. It also affects all the services in the upper layers that use the cut fiber for transmission. For example, all the lightpath channels that traverse the fiber are interrupted. Likewise, all the TDM tributaries contained within each affected lightpath channel are interrupted as well. In turn, a TDM flow can contain multiple MPLS traffic flows, each of which contains multiple IP flows. All these flows are also interrupted by the fiber cut.
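The way a single fiber cut fans out through the upper layers can be sketched as a small containment tree. This is only an illustration: the `Entity` class, the layer labels, and the toy fan-out numbers (2 lightpaths per fiber, 2 TDM tributaries per lightpath, 2 MPLS flows per tributary, 3 IP flows per MPLS flow) are assumptions made up for the example, not figures from a real network.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A transport entity: a fiber, lightpath, TDM tributary, MPLS flow, or IP flow."""
    name: str
    layer: str
    clients: list["Entity"] = field(default_factory=list)  # entities carried by this one


def affected_by(failed: Entity) -> dict[str, int]:
    """Count every entity interrupted when `failed` goes down, grouped by layer."""
    counts: dict[str, int] = {}
    stack = [failed]
    while stack:
        entity = stack.pop()
        counts[entity.layer] = counts.get(entity.layer, 0) + 1
        stack.extend(entity.clients)  # everything carried by a failed entity fails too
    return counts


def build_toy_fiber() -> Entity:
    """Build a hypothetical fiber with a small, fixed fan-out at every layer."""
    fiber = Entity("fiber-A-B", "fiber")
    for w in range(2):
        lightpath = Entity(f"lp{w}", "optical")
        fiber.clients.append(lightpath)
        for t in range(2):
            tdm = Entity(f"lp{w}/tdm{t}", "tdm")
            lightpath.clients.append(tdm)
            for m in range(2):
                mpls = Entity(f"lp{w}/tdm{t}/mpls{m}", "mpls")
                tdm.clients.append(mpls)
                for i in range(3):
                    mpls.clients.append(Entity(f"{mpls.name}/ip{i}", "ip"))
    return fiber


print(affected_by(build_toy_fiber()))
```

With this toy fan-out, one failed fiber interrupts 2 lightpaths, 4 TDM tributaries, 8 MPLS flows, and 24 IP flows.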

To recover from a failure, each layer has its own protection and restoration mechanisms, and each layer can recover from network failures independently. For example, in the fiber layer, we may find an alternative fiber route to reroute all the lightpath channels affected by the fiber cut. Similarly, in the optical channel layer, for each affected end-to-end optical channel, we may find a link-disjoint end-to-end route on which to establish a new lightpath. The same kind of action can be taken in the other upper layers.
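As a rough illustration of finding a link-disjoint route for one optical channel, the sketch below removes the links of the working path and searches again with a breadth-first search. The four-node topology and the function names are hypothetical, and this "remove and re-search" heuristic is only one simple option; it can miss a disjoint pair that a dedicated algorithm such as Suurballe's would find.

```python
from collections import deque


def shortest_path(adj, src, dst, banned=frozenset()):
    """BFS shortest path by hop count, skipping any undirected link in `banned`."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in parent and frozenset((node, nxt)) not in banned:
                parent[nxt] = node
                queue.append(nxt)
    return None  # no route exists


def link_disjoint_backup(adj, working):
    """Find a route between the working path's end nodes that shares no link with it."""
    banned = {frozenset(link) for link in zip(working, working[1:])}
    return shortest_path(adj, working[0], working[-1], banned)


# Hypothetical four-node ring: A-B-D and A-C-D are the two routes from A to D.
adj = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}
working = shortest_path(adj, "A", "D")
backup = link_disjoint_backup(adj, working)
print(working, backup)
```

In this ring the working path is A-B-D and the backup is A-C-D, so a cut on either A-B or B-D leaves the backup intact.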

The key difference between failure recovery in these layers is that different layers have different recovery complexities and recovery speeds. In general, the lower the layer, the simpler the required recovery action. In the fiber layer, only one alternative fiber route needs to be found and used for recovery. In the optical channel layer, a recovery action must be taken independently for each affected end-to-end lightpath channel, which in total can require more than 80 actions if the fiber carries more than 80 wavelengths. The number of recovery actions multiplies further in the upper layers, such as the TDM and MPLS layers, since these layers contain many more affected service flows with finer capacity granularities.
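The growth in the number of recovery actions can be made concrete with a little arithmetic. In the sketch below, the 80 wavelengths per fiber come from the text, while the 4 TDM tributaries per lightpath and 10 MPLS flows per tributary are assumed values chosen only to show the multiplication.

```python
def recovery_actions(fanout):
    """Return the number of independent recovery actions needed at each layer.

    `fanout` lists (layer, entities carried per entity of the layer below),
    ordered from the lowest layer upward.
    """
    actions = {}
    count = 1
    for layer, per_parent in fanout:
        count *= per_parent  # each lower-layer entity carries this many entities
        actions[layer] = count
    return actions


# 80 wavelengths per fiber is from the text; the TDM and MPLS values are assumptions.
fanout = [("fiber", 1), ("optical", 80), ("tdm", 4), ("mpls", 10)]
for layer, n in recovery_actions(fanout).items():
    print(f"{layer}: {n} recovery action(s)")
```

Under these assumed numbers, one action in the fiber layer becomes 80 in the optical channel layer, 320 in the TDM layer, and 3200 in the MPLS layer.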

Another important issue in multi-layer network recovery is restoration escalation. Because each network layer has its own protection capability, it is sufficient to recover a failure in just one of the layers, provided that enough protection capacity is reserved in that layer. For example, we can perform the recovery in the fiber layer, where we only need to find an alternative fiber route to reroute all the affected optical channels. If the recovery is fast enough, the upper layers would not even notice the recovery process. At the other extreme, we can recover the failure within the IP layer through routing table reconvergence, with the help of a routing protocol such as OSPF. Because thousands of IP flows can be affected by a single fiber cut, the restoration process can place a heavy burden on the IP layer control plane, and recovery within the IP layer is generally much slower.

For failure recovery in a network with multiple layers, we often need to designate one layer to take the major failure recovery actions. All the other layers then assist this major layer only when some failures cannot be fully recovered in it. For example, suppose we assign the optical layer as the major layer for failure recovery. It will recover most of the optical channels that fail due to a fiber cut. If any optical channels are not recovered, we can then employ the failure recovery mechanism in the TDM layer to recover the TDM tributaries carried by the optical channels that the optical layer could not recover.
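The escalation from the optical layer to the TDM layer described here can be sketched as follows. This is a deliberately simplified model: spare capacity is reduced to a count of spare lightpaths and spare tributary slots, and the function name, parameters, and channel names are all hypothetical.

```python
def escalate(failed_channels, optical_spare, tdm_spare):
    """Recover in the optical layer first, then escalate leftovers to the TDM layer.

    failed_channels: dict mapping each failed optical channel to the list of
                     TDM tributaries it carries.
    optical_spare:   number of spare lightpaths available (assumed model).
    tdm_spare:       number of spare tributary slots available (assumed model).
    Returns (channels recovered optically, tributaries recovered in TDM, lost tributaries).
    """
    recovered_optical = []
    escalated = []
    for channel, tributaries in failed_channels.items():
        if optical_spare > 0:
            optical_spare -= 1  # one action restores the whole channel
            recovered_optical.append(channel)
        else:
            escalated.extend(tributaries)  # hand the tributaries to the TDM layer
    recovered_tdm = escalated[:tdm_spare]
    lost = escalated[tdm_spare:]
    return recovered_optical, recovered_tdm, lost


failed = {
    "ch1": ["ch1/t1", "ch1/t2"],
    "ch2": ["ch2/t1", "ch2/t2"],
    "ch3": ["ch3/t1", "ch3/t2"],
}
print(escalate(failed, optical_spare=2, tdm_spare=1))
```

Here the optical layer restores ch1 and ch2 with one action each, the TDM layer restores one tributary of ch3, and the remaining tributary stays down until a higher layer (or more spare capacity) can take over.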

In summary, network failure recovery in different layers has the following key features: i) failure recovery in lower layers is generally faster and simpler than in upper layers; ii) failure recovery in upper layers is generally more efficient in capacity utilization and achieves a higher recovery percentage due to finer traffic flow granularity. To achieve the best network failure recovery, all the network layers should collaborate to provide the fastest recovery and the highest recovery percentage.
