Load sharing is one of those awkward topics that is full of
confusing details. In this post we cover its basics, and later on we
will cover more of it in detail. We chose “CEF and load
sharing” as the post title because of the central role that CEF plays in
load sharing.
In IP routing, the forwarding/switching mechanism that
the router uses is what actually controls the load sharing process
(a data/forwarding plane operation). Having multiple routes in the routing
table does not determine how load sharing is done: you might be left
with poor load sharing, or none at all, even though you have multiple
routes to a certain destination in the routing table.
The routing protocols are responsible for placing multiple paths in the routing table in the first place (a control plane operation). By default all the IGPs can install 4 equal-cost paths, while BGP defaults to only 1 (BGP behaves completely differently from the IGPs; we will cover load sharing with BGP in detail in a later post). To control the maximum number of paths allowed per routing protocol we can use the maximum-paths command. (The maximum was 4 in IOS releases earlier than 11.0, 8 with IOS Release 12.0S based software, 16 with IOS Release 12.3T based software, and 32 with IOS Release 12.2S based software.)
NOTE This post is not meant to
explain CEF operation; we focus only on CEF load sharing. However, we
might write a dedicated inside-out CEF post later.
The most popular forwarding/switching mechanisms on Cisco
routers are process switching (performs per-packet load sharing), fast
switching (performs per-destination load sharing) and CEF (can do both
per-packet and per-destination load sharing, the latter being completely
different from fast switching per-destination load sharing, plus a newer
flavor, per-port load sharing).
NOTE According to Cisco, IPv4 fast switching is removed with the
implementation of the Cisco Express Forwarding infrastructure enhancements for
Cisco IOS 12.2(25)S-based releases and Cisco IOS Release 12.4(20)T. For these
and later Cisco IOS releases, switching paths are Cisco Express Forwarding switched
or process switched. This makes the switching decision easier for future
development of software features. Starting with the implementation of the Cisco
Express Forwarding enhancements and the removal of IPv4 fast switching,
components that do not support Cisco Express Forwarding will work only in
process switched mode.
Load-sharing with CEF
For each destination with multiple equal-cost paths (or
unequal-cost paths in the case of EIGRP using variance, BGP using the BGP
Link Bandwidth feature, or MPLS-TE) the router creates 16
hash buckets, each pointing to one of the available paths.
Load sharing is controlled by the ratio of the number of
buckets pointing to each path (outgoing interface). With equal-cost paths the
buckets are distributed evenly (two equal-cost paths result in 8 buckets
each, three equal-cost paths result in 5 each (yes, one bucket is
left unused), four equal-cost paths result in 4 each, and so on). In
unequal-cost scenarios each path is associated with a different number of
buckets (according to the load sharing ratio).
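The bucket arithmetic above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names allocate_buckets and buckets_per_path, the floor-based rounding, and the interleaved bucket order are mine and are not taken from IOS.

```python
def allocate_buckets(weights, total_buckets=16):
    """Map each hash bucket to a path index.

    weights: one positive integer per path (all equal for equal-cost paths).
    Assumption: each path gets floor(total_buckets * weight / sum) buckets,
    and any leftover buckets are simply left unused.
    """
    total_weight = sum(weights)
    remaining = [total_buckets * w // total_weight for w in weights]
    table = []
    # Hand out buckets to the paths in turn until every share is used up.
    while any(remaining):
        for path, left in enumerate(remaining):
            if left:
                table.append(path)
                remaining[path] -= 1
    return table


def buckets_per_path(weights):
    """Count how many buckets point to each path."""
    table = allocate_buckets(weights)
    return [table.count(path) for path in range(len(weights))]


if __name__ == "__main__":
    for weights in ([1, 1], [1, 1, 1], [1, 1, 1, 1], [3, 1]):
        print(weights, buckets_per_path(weights))
```

Under these assumptions, two, three and four equal weights give 8, 5 and 4 buckets per path, and a 3:1 ratio gives 12 and 4 buckets.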
CEF has three load-sharing options:
- per-destination (per-session):
I prefer to call it per-session, as it is named in the show ip
cef x.x.x.x internal command output, since it is actually based on
both the source and the destination IP addresses in the IP packet rather than
solely the destination: both are hashed into a 4-bit value that is used to
select the outgoing interface. This is the default CEF load sharing option.
It is clear that per-destination load sharing distributes
traffic statistically, so load sharing becomes more
effective as the number of source/destination pairs increases.
Obviously this can leave
one link overloaded while the others are underutilized if a relatively heavy
session flows between one source/destination pair over that link.
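To make the statistical nature of per-session load sharing concrete, here is a small Python sketch. It is not the CEF hash: the SHA-256 based session_hash, the two-path bucket table and the example addresses are all stand-ins I chose for illustration.

```python
import hashlib
import ipaddress

BUCKETS = [0, 1] * 8  # assumed table: 16 buckets shared by two equal-cost paths


def session_hash(src, dst):
    """Stand-in 4-bit hash of a source/destination pair (not the IOS hash)."""
    data = ipaddress.IPv4Address(src).packed + ipaddress.IPv4Address(dst).packed
    return hashlib.sha256(data).digest()[0] & 0xF


def pick_path(src, dst):
    """Look up the outgoing path through the bucket table."""
    return BUCKETS[session_hash(src, dst)]


def path_counts(pairs):
    """Count how many packets (one per listed pair) land on each path."""
    counts = [0, 0]
    for src, dst in pairs:
        counts[pick_path(src, dst)] += 1
    return counts


if __name__ == "__main__":
    # 1000 packets of one heavy session versus 1000 distinct sources.
    one_pair = [("10.0.0.1", "192.0.2.1")] * 1000
    many_pairs = [(f"10.0.{i // 250}.{i % 250 + 1}", "192.0.2.1") for i in range(1000)]
    print("one pair:  ", path_counts(one_pair))
    print("many pairs:", path_counts(many_pairs))
```

The single heavy session always lands on one path, while the many distinct pairs spread over both.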
The hash calculation depends on the algorithm used. The original
algorithm uses only the source and destination IP addresses to compute a
4-bit hash value, giving 16 possible values, and so selects one of the 16
buckets, each of which points to one of the outgoing paths. As a result,
every router in the network running the same algorithm reaches the same
result for the same source/destination pair, which introduced a load sharing
problem called CEF Load-Sharing
Polarization (you can see a good example of this in the Cisco Press book “Cisco
Express Forwarding”). To get around this, the universal algorithm (the
default in current IOS versions) adds a 32-bit router-specific value to the
hash function (called the Fixed ID, which can be set manually; a router
uses its highest loopback IP address as this value when booting). This
seeds the hash function on each router with a unique ID, so that the
same source/destination pair generally hashes into a different 4-bit value on
different routers along the path, which gives better network-wide load
sharing and avoids the polarization issue.
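Polarization is easier to see with two routers in series. The Python sketch below is a toy model under my own assumptions: a SHA-256 based stand-in hash (the real CEF hash functions are not shown in this post), two equal-cost paths on each router, and router B receiving only the flows that router A sent out on its first path. The router IDs 1.1.1.1 and 2.2.2.2 are made up.

```python
import hashlib
import ipaddress

BUCKETS = [0, 1] * 8  # assumed: 16 buckets, two equal-cost paths on every router


def pick_path(src, dst, router_id=None):
    """Return the path chosen for a source/destination pair.

    router_id=None imitates the original algorithm (same result everywhere);
    passing a router_id imitates the universal algorithm by seeding the hash.
    The SHA-256 based hash is a stand-in, not the hash IOS uses.
    """
    data = ipaddress.IPv4Address(src).packed + ipaddress.IPv4Address(dst).packed
    if router_id is not None:
        data = ipaddress.IPv4Address(router_id).packed + data
    return BUCKETS[hashlib.sha256(data).digest()[0] & 0xF]


def downstream_split(flows, id_a=None, id_b=None):
    """Router A forwards its path-0 flows to router B; return B's split."""
    reaching_b = [f for f in flows if pick_path(*f, router_id=id_a) == 0]
    counts = [0, 0]
    for src, dst in reaching_b:
        counts[pick_path(src, dst, router_id=id_b)] += 1
    return counts


if __name__ == "__main__":
    flows = [(f"10.0.0.{i}", "192.0.2.1") for i in range(1, 255)]
    print("original: ", downstream_split(flows))
    print("universal:", downstream_split(flows, "1.1.1.1", "2.2.2.2"))
```

Without a per-router seed, every flow that reaches router B has already hashed to path 0 on router A, so B also sends all of them out on path 0 and its second path stays idle. With different seeds the two decisions are no longer tied together.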
NOTE There is a third
algorithm called the tunnel algorithm. I couldn’t find or understand its
internals, but Cisco states that it is meant to improve load sharing
when tunneling techniques such as MPLS, GRE and L2TP are in use, since
tunneling reduces the traffic pattern to a small number of sessions
(between the tunnel head and tail ends), which introduces another form of
traffic polarization. This algorithm also uses a unique per-router ID to work
around the issue. Again, I can’t find more details about this algorithm, but if
I do I’ll let you know.
- per-packet
Packets are handled in a round-robin fashion, ensuring that
the traffic is balanced over multiple links. However, per-packet load
sharing is not generally recommended, because it commonly results in
out-of-order packets. This hurts TCP throughput (since TCP has to put the
segments back in order) and can cause data loss for UDP applications (since
UDP does not reorder anything). To make things scarier, out-of-order packets
might be interpreted as an attack by firewalls.
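The reordering problem follows directly from round-robin over paths with different delays. A minimal Python sketch, assuming made-up one-way delays and a fixed sending interval (arrival_order is a hypothetical helper, not a router function):

```python
def arrival_order(num_packets, path_delays, send_interval=1):
    """Return packet sequence numbers in the order they arrive.

    Packets are sent round-robin over the paths; path_delays holds the
    one-way delay of each path, in the same time unit as send_interval.
    """
    arrivals = []
    for seq in range(num_packets):
        path = seq % len(path_delays)  # round-robin path selection
        arrival_time = seq * send_interval + path_delays[path]
        arrivals.append((arrival_time, seq))
    # Sort by arrival time (sequence number breaks ties).
    return [seq for _, seq in sorted(arrivals)]


if __name__ == "__main__":
    print(arrival_order(6, [1, 1]))  # equal delays: [0, 1, 2, 3, 4, 5]
    print(arrival_order(6, [1, 5]))  # slower second path: [0, 2, 4, 1, 3, 5]
```

With equal delays the packets arrive in order; as soon as the second path is slower, the receiver sees them interleaved out of sequence.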
The default CEF load sharing mode is per-destination, and we
can change this using the ip load-sharing per-packet interface command
on the outgoing interfaces involved.
NOTE Since load sharing
decisions are made on the outbound interfaces, the choice between
per-packet and per-destination load sharing must be configured on the outbound
interfaces.
- per-port (per-flow)
This is the most suitable option (introduced in IOS release
12.4(11)T) for networks with a small number of sources/destinations where
most of the traffic flows between hosts that use different port numbers,
as is commonly seen with Real-time Transport Protocol (RTP) streams. It simply
adds the layer 4 source port, destination port, or both to the CEF hash
function. This option is
enabled with the ip cef load-sharing algorithm include-ports command in
global configuration mode.
The most common scenario where this option is the only
effective solution is a subnet of hosts NATed to a single IP address, with
a router that has multiple paths sitting on the way to their traffic
destination. The per-destination option is obviously useless here if all the
hosts are communicating with a single destination, since there is only ever
one source/destination pair; including the layer 4 ports in
the hash function is what improves load sharing in this case.
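The NAT scenario can be sketched in Python as follows. As before, this is a sketch under assumptions: the SHA-256 based hash stands in for the real CEF hash, the bucket table assumes two equal-cost paths, and the addresses and port numbers are invented for the example.

```python
import hashlib
import ipaddress
import struct

BUCKETS = [0, 1] * 8  # assumed: 16 buckets, two equal-cost paths


def pick_path(src, dst, sport=None, dport=None):
    """Pick a path from the addresses, optionally mixing in layer 4 ports.

    Stand-in hash (SHA-256), not the function IOS uses.
    """
    data = ipaddress.IPv4Address(src).packed + ipaddress.IPv4Address(dst).packed
    if sport is not None and dport is not None:
        data += struct.pack("!HH", sport, dport)
    return BUCKETS[hashlib.sha256(data).digest()[0] & 0xF]


def split(flows, include_ports):
    """Count sessions per path, with or without ports in the hash."""
    counts = [0, 0]
    for src, dst, sport, dport in flows:
        if include_ports:
            counts[pick_path(src, dst, sport, dport)] += 1
        else:
            counts[pick_path(src, dst)] += 1
    return counts


if __name__ == "__main__":
    # 200 sessions from hosts hidden behind one NAT address to one server.
    flows = [("203.0.113.10", "198.51.100.20", 20000 + i, 443) for i in range(200)]
    print("addresses only:", split(flows, include_ports=False))
    print("with ports:    ", split(flows, include_ports=True))
```

With addresses only, all 200 sessions share one source/destination pair and end up on a single path; once the ports are part of the hash input the sessions spread across both paths.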
I hope that I’ve been informative.