Sunday 17 May 2015

Internet Access from an MPLS VPN Using a Global Routing Table


Introduction

The purpose of this document is to demonstrate the sample configuration used to access the Internet from a Multiprotocol Label Switching (MPLS)-based VPN using a global routing table.
Some network scenarios require Internet access from an MPLS-based VPN while VPN connectivity among corporate sites is maintained. This sample configuration focuses on providing Internet access from the VPN routing and forwarding (VRF) instance that contains the default route to the Internet gateway router (IGW).

Prerequisites

Requirements

A basic understanding of MPLS forwarding and MPLS VPN is required to fully understand the contents of this document.

Components Used

The information in this document is based on the software and hardware versions below.
  • Cisco IOS® Software Release 12.1(3)T. Release 12.0(5)T includes the MPLS VPN feature
  • Any Cisco router from the 3600 series or later, such as the Cisco 3660 or 7206
The information presented in this document was created from devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If you are working in a live network, ensure that you understand the potential impact of any command before using it.

Background Theory

In this example configuration, these policies were in place:
  • A router with connectivity to the Internet is attached to the MPLS network. It may or may not inject Border Gateway Protocol (BGP) routes into the global routing table.
    Note: PE routers understand BGP. Routers such as the Gigabit Switch Router (GSR) (which performs as a Provider Core router) do not run BGP at all.
  • There is no requirement for a VRF to have a full routing table from the Internet (global BGP table), so a static default route is put in a VRF pointing to the global next hop address of the IGW.
  • A VPN customer uses a registered unique address range that is routable in the global Internet routing table. The method of access discussed in this document is not recommended where customers have only private addresses in their network.

Conventions

These acronyms are used in this document:
  • CE - Customer Edge router
  • PE - Provider Edge router
  • P - Provider core router
For more information on document conventions, refer to Cisco Technical Tips Conventions.

Configure

  • You can refer to the Network Diagram for an illustration of this configuration. In this example, CE 1 and CE 2 are in the same VPN. They are configured under the customer1 VRF, since there is no requirement for a VRF to have a full routing table from the Internet (as per the policies in the Background Theory section of this document).
  • A static default route is configured in the customer1 VRF on PE 1 pointing to the IGW. With this static default route in place, packets that do not match any other route in the customer1 VRF are sent to the IGW.
Note: The Internet gateway next hop 192.168.67.1 (the IP address of the Internet gateway's s8/0 interface) is not part of the customer1 VRF, so the static default route configured under the customer1 VRF needs the global keyword. The global keyword specifies that the next hop address of the static route is resolved within the global routing table, not within the customer1 VRF.
The following is an example of the static route.
ip route vrf customer1 0.0.0.0 0.0.0.0 192.168.67.1 global
Having a static route with a global keyword in the customer1 VRF ensures that all packets destined to the Internet are routed to the Internet gateway and subsequently to the Internet.
Note: The default route in PE 1 is configured to point to the serial interface IP address of the Internet gateway (192.168.67.1) and not to the loopback address (10.1.1.6). This avoids blackholing traffic in the event of a connectivity failure between the Internet gateway and the Internet (R7). If the default route pointed to the loopback address of the Internet gateway and the connection between the Internet gateway and R7 broke, all the packets would continue to be routed to the Internet gateway. This happens because the loopback address remains up (unlike 192.168.67.1, which is withdrawn from the global routing table when interface s8/0 goes down), so the default route continues to exist in the routing table.
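The behaviour described in the two notes above can be sketched in Python. This is only an illustrative model, not how IOS implements forwarding: the function names and the dictionary layout are hypothetical, and the table contents are taken from the PE 1 outputs shown in the Verify section. It shows a VRF lookup that falls through to the default route, resolves the next hop in the global table, and finds no usable route once 192.168.67.0/30 is withdrawn.

```python
import ipaddress


def net(prefix):
    """Shorthand for building a network object from a prefix string."""
    return ipaddress.ip_network(prefix)


def longest_match(table, destination):
    """Return (prefix, entry) for the longest prefix in table containing destination."""
    address = ipaddress.ip_address(destination)
    candidates = [(prefix, entry) for prefix, entry in table.items() if address in prefix]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0].prefixlen)


def vrf_forward(vrf_table, global_table, destination):
    """Look up destination in the VRF; resolve 'global' next hops in the global table."""
    match = longest_match(vrf_table, destination)
    if match is None:
        return None
    prefix, entry = match
    if not entry["global"]:
        # Next hop stays inside the VPN; its resolution is out of scope for this sketch.
        return {"matched": str(prefix), "next_hop": entry["next_hop"], "resolved_via": None}
    resolved = longest_match(global_table, entry["next_hop"])
    if resolved is None:
        # The global next hop is unreachable, so the static route is unusable.
        return None
    return {
        "matched": str(prefix),
        "next_hop": entry["next_hop"],
        "resolved_via": resolved[1]["next_hop"],
    }


# customer1 VRF on PE 1 (values from the show ip route vrf customer1 output).
CUSTOMER1 = {
    net("11.11.11.0/24"): {"next_hop": "192.168.10.1", "global": False},
    net("22.22.22.0/24"): {"next_hop": "10.1.1.4", "global": False},
    net("0.0.0.0/0"): {"next_hop": "192.168.67.1", "global": True},
}

# Global table on PE 1: the OSPF route toward the IGW serial subnet.
GLOBAL = {net("192.168.67.0/30"): {"next_hop": "10.10.23.3", "global": False}}

if __name__ == "__main__":
    print(vrf_forward(CUSTOMER1, GLOBAL, "99.99.99.1"))
    print(vrf_forward(CUSTOMER1, {}, "99.99.99.1"))
```

Passing an empty global table models the case where interface s8/0 on the Internet gateway goes down and 192.168.67.0/30 disappears from OSPF.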
The next step is to ensure that packets coming back from the Internet to the CE 1 network 11.11.11.0/24 are routed from the Internet gateway to PE 1 and on to CE 1 through the MPLS core. This is achieved by configuring, in the global routing table on PE 1, a static route for the CE 1 network pointing to the Serial 8/0 interface. This route is redistributed into Open Shortest Path First (OSPF) so that the Internet gateway has it in its global routing table. This allows the Internet gateway to route all packets coming from the Internet to PE 1, and to the final destination beyond CE 1.
The following example is the ip route command used in configuration on PE 1.
ip route 11.11.11.0 255.255.255.0 Serial8/0 192.168.10.1
Note: The above static route configured in the global routing table is in addition to the static route configured within the customer1 VRF, which is used for VPN Network Layer Reachability Information (NLRI). On PE 1, it is configured as shown below.
ip route vrf customer1 11.11.11.0 255.255.255.0 192.168.10.1
Note: To find additional information on the commands used in this document, use the Command Lookup Tool (registered customers only).

Network Diagram

This document uses the network setup shown in the diagram below.
internet_access_mpls_vpn.gif

Configurations

This document uses the configurations shown below.
CE 1
version 12.2
!
hostname CE-1
!
ip subnet-zero
!
interface Loopback0
 ip address 10.1.1.1 255.255.255.255
!
interface Loopback2
 ip address 11.11.11.1 255.255.255.0
!
interface Serial8/0
 ip address 192.168.10.1 255.255.255.252

 !--- The interface is connected to PE 1.

 !
ip classless
ip route 0.0.0.0 0.0.0.0 192.168.10.2

!--- This is the default route to route all packets to PE 1.

!

PE 1
version 12.2
!
hostname PE-1
!
ip subnet-zero
!
ip vrf customer1

!--- This configured VRF customer1.

 rd 100:1

!--- This configured the route distinguisher for VRF.
 
 route-target export 1:1
 route-target import 1:1

!--- This configured the export and import policies into VRF.
 
!
ip cef

!--- This enabled Cisco Express Forwarding (CEF) switching.

!
interface Loopback0
 ip address 10.1.1.2 255.255.255.255
!       
interface Ethernet0/0

!--- It is connected to P router.

 ip address 10.10.23.2 255.255.255.0
 tag-switching ip

!--- MPLS switching is enabled.
 
!
interface Serial8/0
! Connected to CE-1
 ip vrf forwarding customer1

!--- Route forwarding based on customer1 VRF is enabled.
 
 ip address 192.168.10.2 255.255.255.252
!
router ospf 1
 log-adjacency-changes
 redistribute static subnets
 network 0.0.0.0 255.255.255.255 area 0
!
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.1.1.4 remote-as 100

!--- Neighbor relationship with PE 2 is established.

 neighbor 10.1.1.4 update-source Loopback0
 neighbor 10.1.1.4 next-hop-self
 no auto-summary
!
 address-family ipv4 vrf customer1

!--- The address-family configuration mode specifies IPv4 unicast
!--- address prefixes for customer1 VRF.

 no auto-summary
 no synchronization
 network 11.11.11.0 mask 255.255.255.0

!--- CE 1 network 11.11.11.0/24 to PE 2 is announced.

 network 192.168.10.0 mask 255.255.255.252
 exit-address-family
!
 address-family vpnv4

!--- This is the address-family VPNV4 configuration mode for
!--- configuring BGP sessions.

 neighbor 10.1.1.4 activate
 neighbor 10.1.1.4 send-community extended
 no auto-summary
 exit-address-family
!
ip classless
ip route 11.11.11.0 255.255.255.0 Serial8/0 192.168.10.1

!--- The static route in the global routing table is pointing to 
!--- the interface connected to CE 1.

ip route vrf customer1 0.0.0.0 0.0.0.0 192.168.67.1 global

!--- The static default route under customer1 VRF, routing packets
!--- outside of the VPN to the Internet gateway.

! routes
ip route vrf customer1 11.11.11.0 255.255.255.0 192.168.10.1

!--- The static route for network 11.11.11.0/24 (CE 1 network) under
!--- customer1 VRF ensures the reachability of CE 1 network from the
!--- other VPN sites.


P
version 12.2
!
hostname P
!
ip subnet-zero
!
ip cef

!--- CEF switching is enabled.

!
interface Loopback0
 ip address 10.1.1.3 255.255.255.255
!
interface Ethernet0/0

!--- This is connected to PE 1.

 ip address 10.10.23.3 255.255.255.0
 tag-switching ip

!--- MPLS switching is enabled.

!
interface Ethernet1/0

!--- This is connected to PE 2.

 ip address 10.10.34.3 255.255.255.0
 tag-switching ip
!
interface Ethernet2/0

!--- This is connected to the Internet gateway.

 ip address 10.10.36.3 255.255.255.0
 tag-switching ip
!
router ospf 1
 log-adjacency-changes
 network 0.0.0.0 255.255.255.255 area 0

IGW
version 12.2
!
hostname IGW
!
ip subnet-zero
!
ip cef

!--- This enabled CEF switching.

!
interface Loopback0
 ip address 10.1.1.6 255.255.255.255
!
interface Ethernet2/0

!--- This is connected to P router.

 ip address 10.10.36.6 255.255.255.0
 tag-switching ip
!
interface Serial8/0

!--- This is connected to Internet R7.

 ip address 192.168.67.1 255.255.255.252
!
router ospf 1
 log-adjacency-changes
 network 0.0.0.0 255.255.255.255 area 0
!
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 network 11.11.11.0 mask 255.255.255.0
 network 22.22.22.0 mask 255.255.255.0
 neighbor 192.168.67.2 remote-as 200
 no auto-summary

PE 2
version 12.2
!
hostname PE-2
!
ip subnet-zero
!
ip vrf customer1

!--- Customer1 VRF is configured.

 rd 100:1

!--- Route Distinguisher for VRF is configured.
 
 route-target export 1:1 
 route-target import 1:1 

!--- This configured the import and export policies for customer1 
!--- VRF.

!
ip cef 

!--- This enabled CEF switching.
 
! 
interface Loopback0
 ip address 10.1.1.4 255.255.255.255
!
interface Ethernet1/0

!--- Connected to P router.

 ip address 10.10.34.4 255.255.255.0
 tag-switching ip

!--- MPLS switching is enabled.

!
interface Serial9/0

!--- Connected to CE 2 router.

 ip vrf forwarding customer1

!--- This enables VRF forwarding on the interface.

 ip address 192.168.20.1 255.255.255.252
!
router ospf 1
 log-adjacency-changes
 redistribute static subnets
 network 0.0.0.0 255.255.255.255 area 0
!
router bgp 100
 no synchronization
 bgp log-neighbor-changes
 neighbor 10.1.1.2 remote-as 100
 neighbor 10.1.1.2 update-source Loopback0
 neighbor 10.1.1.2 next-hop-self
 no auto-summary
!
 address-family ipv4 vrf customer1

!--- This is the address-family IPv4 configuration of customer1 VRF.

 no auto-summary
 no synchronization
 network 22.22.22.0 mask 255.255.255.0

!--- This announces the CE 2 network to PE 1.

 exit-address-family
!
 address-family vpnv4

!--- This is the address-family VPNV4 configuration for BGP Sessions 
!--- with PE 1.

 neighbor 10.1.1.2 activate
 neighbor 10.1.1.2 send-community extended
 no auto-summary
 exit-address-family
!
ip classless
ip route 22.22.22.0 255.255.255.0 Serial9/0 192.168.20.2

!--- This is the static route for network 22.22.22.0/24 in the global
!--- routing table pointing to the interface connected to CE 2.

ip route vrf customer1 0.0.0.0 0.0.0.0 192.168.67.1 global

!--- This is the static default route for customer1 VRF
!--- for destinations outside the VPN.

ip route vrf customer1 22.22.22.0 255.255.255.0 192.168.20.2 

!--- This is the static route within customer1 VRF for CE 2 
!--- network for VPN connectivity.


CE 2
version 12.2
!
hostname CE-2
!
ip subnet-zero
!
interface Loopback0
 ip address 22.22.22.22 255.255.255.0
!
interface Serial9/0

!--- This is connected to PE 2.

 ip address 192.168.20.2 255.255.255.252
!
ip classless
ip route 0.0.0.0 0.0.0.0 192.168.20.1

!--- This is the default route pointing to PE 2.


Verify

This section provides information you can use to confirm your configuration is working properly.

VPN Connectivity Between CE 1 and CE 2

To verify the VPN connectivity between CE 1 and CE 2, CE 1 should be able to reach CE 2's network 22.22.22.0/24 and the other way around. To check this, verify the route to network 22.22.22.0/24 in the customer1 VRF at PE 1.
Certain show commands are supported by the Output Interpreter Tool (registered customers only), which allows you to view an analysis of show command output.
  1. The show ip route vrf customer1 command confirms the route to network 22.22.22.0/24 learned from 10.1.1.4 (PE 2's loopback address), as shown in the output below.
    PE-1# show ip route vrf customer1
    Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
           D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
           N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
           E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
           i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
           * - candidate default, U - per-user static route, o - ODR
           P - periodic downloaded static route
    
    Gateway of last resort is 192.168.67.1 to network 0.0.0.0
    
    192.168.10.0/30 is subnetted, 1 subnets
    C       192.168.10.0 is directly connected, Serial8/0
           22.0.0.0/24 is subnetted, 1 subnets
    B       22.22.22.0 [200/0] via 10.1.1.4, 01:00:50
           11.0.0.0/24 is subnetted, 1 subnets
    S      11.11.11.0 [1/0] via 192.168.10.1
    S*   0.0.0.0/0 [1/0] via 192.168.67.1
  2. Similarly, at PE 2, the route to network 11.11.11.0/24 in the customer1 VRF is shown in the example below.
    PE-2# show ip route vrf customer1
    Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
           D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
           N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
           E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
           i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
           * - candidate default, U - per-user static route, o - ODR
           P - periodic downloaded static route 
    
    Gateway of last resort is 192.168.67.1 to network 0.0.0.0
    
    192.168.10.0/30 is subnetted, 1 subnets
    B       192.168.10.0 [200/0] via 10.1.1.2, 01:00:09
         22.0.0.0/24 is subnetted, 1 subnets
    S       22.22.22.0 [1/0] via 192.168.20.2
         192.168.20.0/30 is subnetted, 1 subnets
    C       192.168.20.0 is directly connected, Serial9/0
         11.0.0.0/24 is subnetted, 1 subnets
    B       11.11.11.0 [200/0] via 10.1.1.2, 01:00:09
    S*   0.0.0.0/0 [1/0] via 192.168.67.1
  3. Now check the connectivity between CE 1 and CE 2 by pinging a host 22.22.22.22 on CE 2 using the source IP address of 11.11.11.1 from CE 1.
    CE-1# ping
    Protocol [ip]:
    Target IP address: 22.22.22.22
    Repeat count [5]:
    Datagram size [100]:
    Timeout in seconds [2]:
    Extended commands [n]: y
    Source address or interface: 11.11.11.1
    Type of service [0]:
    Set DF bit in IP header? [no]:
    Validate reply data? [no]:
    Data pattern [0xABCD]:
    Loose, Strict, Record, Timestamp, Verbose[none]:
    Sweep range of sizes [n]:
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 22.22.22.22, timeout is 2 seconds:
    !!!!!
    
    Success rate is 100 percent (5/5), round-trip min/avg/max = 20/20/20 ms

Connectivity to the Internet from CE 1

Follow the steps below to verify connectivity to the Internet from CE 1.
  1. All packets destined to the Internet or VPN from CE 1 will route using a default route configured in CE 1 pointing to PE 1, as shown below.
    CE-1# show ip route 0.0.0.0
    Routing entry for 0.0.0.0/0, supernet
      Known via "static", distance 1, metric 0, candidate default path
      Routing Descriptor Blocks:
      * 192.168.10.2
    Route metric is 0, traffic share count is 1
  2. Packets coming into PE 1 interface s8/0 get routed using the customer1 VRF routing table. PE 1 has a default route in the customer1 VRF pointing to the IGW IP address 192.168.67.1, as shown below in the output for the show ip route vrf customer1 on PE 1.
    PE-1# show ip route vrf customer1
    Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
           D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
           N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
           E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
           i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, ia - IS-IS inter area
           * - candidate default, U - per-user static route, o - ODR
           P - periodic downloaded static route
    
    Gateway of last resort is 192.168.67.1 to network 0.0.0.0
    
         192.168.10.0/30 is subnetted, 1 subnets
    C       192.168.10.0 is directly connected, Serial8/0
         22.0.0.0/24 is subnetted, 1 subnets
    B       22.22.22.0 [200/0] via 10.1.1.4, 01:21:11
         11.0.0.0/24 is subnetted, 1 subnets
    S       11.11.11.0 [1/0] via 192.168.10.1
    S*   0.0.0.0/0 [1/0] via 192.168.67.1
    
  3. Because the default route on PE 1 is configured with a global keyword, it looks for next hop 192.168.67.1 in its global routing table and routes to the IGW, as shown below.
    PE-1# show ip route 192.168.67.1
    Routing entry for 192.168.67.0/30
      Known via "ospf 1", distance 110, metric 84, type intra area
      Last update from 10.10.23.3 on Ethernet0/0, 00:21:54 ago
      Routing Descriptor Blocks:
      * 10.10.23.3, from 10.1.1.6, 00:21:54 ago, via Ethernet0/0
     Route metric is 84, traffic share count is 1
  4. The packets reaching IGW get routed over to the Internet based on the BGP routes it learned from R7. In this case, you can look at the BGP route learned from R7 to demonstrate the connectivity to the Internet. Shown below is the BGP route (network 99.99.99.0/24) learned from R7 in the IGW routing table.
    IGW# show ip route 99.99.99.0
    Routing entry for 99.99.99.0/24
      Known via "bgp 100", distance 20, metric 0
      Tag 200, type external
      Last update from 192.168.67.2 01:37:25 ago
      Routing Descriptor Blocks:
      * 192.168.67.2, from 192.168.67.2, 01:37:25 ago
          Route metric is 0, traffic share count is 1
          AS Hops 1
    The packets that originated from CE-1 get routed to the Internet.
  5. For packets coming back from the Internet destined to CE 1 network 11.11.11.0/24, the IGW should have a route pointing to PE 1 in its global routing table. To provide it, a static route pointing to the s8/0 interface (which connects PE 1 to CE 1) is configured in PE 1's global routing table and redistributed into OSPF. This ensures that the IGW has a route in its global routing table pointing to PE 1. The OSPF-learned route on the IGW and the static route on PE 1 are shown below.
    IGW# show ip route 11.11.11.0
    Routing entry for 11.11.11.0/24
      Known via "ospf 1", distance 110, metric 20, type extern 2, forward metric 20
      Last update from 10.10.36.3 on Ethernet2/0, 00:34:34 ago
      Routing Descriptor Blocks:
      * 10.10.36.3, from 10.1.1.2, 00:34:34 ago, via Ethernet2/0
          Route metric is 20, traffic share count is 1
    
    PE-1# show ip route 11.11.11.0
    Routing entry for 11.11.11.0/24
      Known via "static", distance 1, metric 0
      Redistributing via ospf 1
      Advertised by ospf 1 subnets
      Routing Descriptor Blocks:
      * 192.168.10.1, via Serial8/0
          Route metric is 0, traffic share count is 1
  6. Now check the connectivity to the Internet from CE 1 by pinging the R7 IP address 99.99.99.1 with the CE 1 source address of 11.11.11.1.
    CE-1# ping
    Protocol [ip]:
    Target IP address: 99.99.99.1
    Repeat count [5]:
    Datagram size [100]:
    Timeout in seconds [2]:
    Extended commands [n]: y
    Source address or interface: 11.11.11.1
    Type of service [0]:
    Set DF bit in IP header? [no]:
    Validate reply data? [no]:
    Data pattern [0xABCD]:
    Loose, Strict, Record, Timestamp, Verbose[none]:
    Sweep range of sizes [n]:
    Type escape sequence to abort.
    Sending 5, 100-byte ICMP Echos to 99.99.99.1, timeout is 2 seconds:
    !!!!!
    Success rate is 100 percent (5/5), round-trip min/avg/max = 20/24/32 ms
    CE-1#

BGP TTL security

Understanding BGP TTL Security

By default, IOS sends BGP messages to EBGP neighbors with an IP time-to-live (TTL) of 1. (This can be adjusted with ebgp-multihop attached to the desired neighbor or peer group under BGP configuration.) Sending BGP messages with a TTL of one requires that the peer be directly connected, or the packets will expire in transit. Likewise, a BGP router will only accept incoming BGP messages with a TTL of 1 (or whatever value is specified by ebgp-multihop), which can help mitigate spoofing attacks.
However, there is an inherent vulnerability to this approach: it is trivial for a remote attacker to adjust the TTL of sent packets so that they appear to originate from a directly-connected peer.
ttl-security1.png
By spoofing legitimate-looking packets toward a BGP router at high volume, a denial of service (DoS) attack may be accomplished.
A very simple solution to this, as discussed in RFC 5082, is to invert the direction in which the TTL is counted. The maximum value of the 8-bit TTL field in an IP packet is 255; instead of accepting only packets with a TTL set to 1, we can accept only packets with a TTL of 255 to ensure the originator really is exactly one hop away. This is accomplished on IOS with the TTL security feature, by appending ttl-security hops <count> to the BGP peer statement.
ttl-security2.png
Only BGP messages with an IP TTL greater than or equal to 255 minus the specified hop count will be accepted. TTL security and EBGP multihop are mutually exclusive; ebgp-multihop is no longer needed when TTL security is in use.
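A minimal Python sketch of the acceptance rule above may help. The function names are hypothetical, and the model assumes only that each router in the path decrements the TTL by one. It shows why a remote sender can arrange for a packet to arrive with a TTL of 1, yet cannot make it arrive with a TTL at or above 255 minus the configured hop count.

```python
MAX_TTL = 255  # largest value that fits in the 8-bit IP TTL field


def ttl_on_arrival(initial_ttl, routers_crossed):
    """TTL seen by the receiver after each router in the path decrements it once."""
    return initial_ttl - routers_crossed


def accepted_by_ttl_security(received_ttl, hops):
    """Rule from the text: accept only if TTL >= 255 minus the configured hop count."""
    return received_ttl >= MAX_TTL - hops


if __name__ == "__main__":
    # An attacker three routers away picks an initial TTL of 4 so the packet arrives with 1.
    print(ttl_on_arrival(4, 3))
    # A directly connected peer sends with TTL 255 and passes a hop count of 1.
    print(accepted_by_ttl_security(ttl_on_arrival(255, 0), hops=1))
    # The same attacker cannot start above 255, so the packet arrives with 252 at best.
    print(accepted_by_ttl_security(ttl_on_arrival(255, 3), hops=1))
```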

Route reflector

BGP route reflectors
Network topology:


Configuration:

Client1 and Client3 routers are clients of RR1 router while Client2 and Client4 are the clients of RR2 router. There is a single IBGP session between each router.

Client1 advertises 11.1.1.1/32 route to RR1 router and Client2 advertises 12.1.1.1/32 route to RR2 router.

RR1 and RR2 Configuration

RR1 router:

router bgp 100
 neighbor 2.2.2.2 remote-as 100
 neighbor 2.2.2.2 update-source Loopback 0
 neighbor 2.2.2.2 description RR2 router
 neighbor 172.16.1.2 remote-as 100
 neighbor 172.16.1.2 route-reflector-client
 neighbor 172.16.1.2 description Client1

!

RR2 router:

router bgp 100
 neighbor 1.1.1.1 remote-as 100
 neighbor 1.1.1.1 update-source Loopback 0
 neighbor 1.1.1.1 description RR1 router
 neighbor 172.16.2.2 remote-as 100
 neighbor 172.16.2.2 route-reflector-client
 neighbor 172.16.2.2 description Client2

!

Client1 and Client2 Configuration

Client1 router:

router bgp 100
 neighbor 172.16.1.1 remote-as 100
 neighbor 172.16.1.1 description RR1
 network 11.1.1.1 mask 255.255.255.255
!


Client2 router:

router bgp 100
 neighbor 172.16.2.1 remote-as 100
 neighbor 172.16.2.1 description RR2
 network 12.1.1.1 mask 255.255.255.255
!


Monitoring route reflection:
An RR reflecting the route received from a RR-Client adds-
  1. Originator ID- a 4-byte BGP attribute that is created by the RR. This attribute carries the Router ID of the originator of the route in the local AS. If the update comes back to the originator, it ignores the update.
  2. Cluster List- A Cluster List is a list of Cluster IDs that an update has traversed. When a route reflector reflects a route, it prepends its local Cluster ID (by default, its Router ID) to the list. If a route reflector receives a route whose Cluster List contains the local Cluster ID, it ignores the update.
The following output shows that RR1 receives the 11.1.1.1/32 prefix from its RR-client Client1.

11.1.1.1/32 on RR1

!--------- Client1 router advertises 11.1.1.1/32 to RR1 router

Client1# show ip bgp 11.1.1.1
BGP routing table entry for 11.1.1.1/32, version 2
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to update-groups:
     1
  Local
    0.0.0.0 from 0.0.0.0 (172.16.1.2)
      Origin IGP, metric 0, localpref 100, weight 32768, valid, sourced, local, best

Client1# show ip bgp update-group
BGP version 4 update-group 1, internal, Address Family: IPv4 Unicast
  BGP Update version : 5/0, messages 0
  Update messages formatted 1, replicated 0
  Number of NLRIs in the update sent: max 1, min 1
  Minimum time between advertisement runs is 0 seconds
  Has 1 member (* indicates the members currently being sent updates):
   172.16.1.1


!---- RR1 receives 11.1.1.1/32 from its client Client1 router

RR1# show ip bgp 11.1.1.1
BGP routing table entry for 11.1.1.1/32, version 2
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to update-groups:
     2
  Local, (Received from a RR-client)
    172.16.1.2 from 172.16.1.2 (172.16.1.2)
      Origin IGP, metric 0, localpref 100, valid, internal, best

RR1 router advertises this "best" route to all its IBGP peers, including the peer it received the route from.

RR1 advertises 11.1.1.1/32

RR1# show ip bgp update-group
BGP version 4 update-group 1, internal, Address Family: IPv4 Unicast
  BGP Update version : 5/0, messages 0
  Route-Reflector Client
  Update messages formatted 3, replicated 0
  Number of NLRIs in the update sent: max 1, min 0
  Minimum time between advertisement runs is 0 seconds
  Has 1 member (* indicates the members currently being sent updates):
   172.16.1.2

BGP version 4 update-group 2, internal, Address Family: IPv4 Unicast
  BGP Update version : 5/0, messages 0
  Update messages formatted 1, replicated 0
  Number of NLRIs in the update sent: max 1, min 1
  Minimum time between advertisement runs is 0 seconds
  Has 1 member (* indicates the members currently being sent updates):
   2.2.2.2

RR2 receives 11.1.1.1/32 prefix from RR1 router. RR1 adds the Originator ID attribute (Router ID of Client1 router 172.16.1.2) and Cluster List attribute (Router ID of RR1 router 1.1.1.1).

RR2 receives 11.1.1.1/32

RR2# show ip bgp 11.1.1.1
BGP routing table entry for 11.1.1.1/32, version 2
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to update-groups:
     2
  Local
    172.16.1.2 (metric 65) from 1.1.1.1 (1.1.1.1)
      Origin IGP, metric 0, localpref 100, valid, internal, best
      Originator: 172.16.1.2, Cluster list: 1.1.1.1

When RR2 receives the prefix 11.1.1.1/32, it accepts the prefix and advertises it to Client2 after prepending its Router ID (2.2.2.2) to the Cluster List attribute. The Originator ID attribute remains unchanged.

Client2 receives 11.1.1.1/32

Client2# show ip bgp 11.1.1.1
BGP routing table entry for 11.1.1.1/32, version 5
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Not advertised to any peer
  Local
    172.16.1.2 (metric 129) from 172.16.2.1 (2.2.2.2)
      Origin IGP, metric 0, localpref 100, valid, internal, best
      Originator: 172.16.1.2, Cluster list: 2.2.2.2, 1.1.1.1
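The attribute handling seen in these outputs can be traced with a short Python sketch. It is a simplified model with hypothetical function names, not the router's implementation: a reflector sets the Originator ID only if it is not already present, puts its own Cluster ID at the front of the Cluster List, and a receiver drops updates that carry its own Router ID or Cluster ID.

```python
def reflect(route, cluster_id, learned_from_router_id):
    """Return a copy of route with the attributes a route reflector adds (simplified)."""
    reflected = dict(route)
    # The Originator ID is created by the first reflector and never changed afterwards.
    reflected.setdefault("originator_id", learned_from_router_id)
    # The reflector's Cluster ID goes in front of any existing Cluster List.
    reflected["cluster_list"] = [cluster_id] + list(route.get("cluster_list", []))
    return reflected


def accepts(route, router_id, cluster_id=None):
    """Loop check: ignore updates carrying our own Router ID or Cluster ID."""
    if route.get("originator_id") == router_id:
        return False
    if cluster_id is not None and cluster_id in route.get("cluster_list", []):
        return False
    return True


if __name__ == "__main__":
    from_client1 = {"prefix": "11.1.1.1/32", "next_hop": "172.16.1.2"}
    at_rr2 = reflect(from_client1, cluster_id="1.1.1.1", learned_from_router_id="172.16.1.2")
    at_client2 = reflect(at_rr2, cluster_id="2.2.2.2", learned_from_router_id="1.1.1.1")
    print(at_rr2)
    print(at_client2)
    # If the update ever came back to Client1 or RR1, it would be ignored.
    print(accepts(at_client2, router_id="172.16.1.2"))
    print(accepts(at_rr2, router_id="1.1.1.1", cluster_id="1.1.1.1"))
```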


Summary:

BGP route reflector rules-
  1. An IBGP prefix received from a RR-client is advertised to all peers including the RR-client that advertised the prefix.
  2. Routes originated by the router and routes received from EBGP neighbors and selected as "best" routes are advertised to all internal and external BGP peers.
  3. A prefix received from an RR non-client is advertised to all EBGP peers and IBGP RR-clients.
  4. Routes received from RR-clients and selected as "best" routes are advertised to all internal and external BGP peers.
  5. The RR adds the Originator ID to the routes received from its RR-clients.
  6. The RR prepends its own Router ID to the Cluster List attribute.
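As a rough illustration, the advertisement rules in this summary can be written as a small Python function. The function name and the peer labels are hypothetical, and the sketch follows the summary literally (including advertising a client's route back to that client) rather than modelling every BGP detail. The peer list is RR1's view of the topology above.

```python
def recipients(source_kind, peers):
    """Return the peers a route reflector advertises a best route to.

    source_kind is one of "local", "ebgp", "client", "non_client".
    peers maps a peer name to "client", "non_client" or "ebgp".
    """
    if source_kind in ("local", "ebgp", "client"):
        # Rules 1, 2 and 4: advertised to all internal and external peers.
        return sorted(peers)
    if source_kind == "non_client":
        # Rule 3: only EBGP peers and RR-clients receive the route.
        return sorted(name for name, kind in peers.items() if kind in ("client", "ebgp"))
    raise ValueError(f"unknown source kind: {source_kind}")


# RR1's peers in the example topology.
RR1_PEERS = {"Client1": "client", "Client3": "client", "RR2": "non_client"}

if __name__ == "__main__":
    print(recipients("client", RR1_PEERS))      # route learned from Client1
    print(recipients("non_client", RR1_PEERS))  # route learned from RR2
```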

Further reading:
http://wiki.nil.com/BGP_route_reflectors

BPDU

Bridge Protocol Data Unit


Bridge Protocol Data Units (BPDUs) are frames that contain information about the Spanning Tree Protocol (STP). A switch sends BPDUs using the unique MAC address of the originating port as the source and a multicast address as the destination MAC (01:80:C2:00:00:00, or 01:00:0C:CC:CC:CD for Per VLAN Spanning Tree). For STP algorithms to function, the switches need to share information about themselves and their connections, and BPDUs are what they share. BPDUs are sent out as multicast frames to which only other layer 2 switches or bridges are listening. If any loops (multiple possible paths between switches) are found in the network topology, the switches will co-operate to disable a port or ports to ensure that there are no loops; that is, from one device to any other device in the layer 2 network, only one path can be taken.[1]
If any changes occur in the layer 2 network, such as when a link goes down, a new link is added, a new switch is added, or a switch fails, the switches share this information by transmitting BPDUs, causing the STP algorithm to be re-executed, and a new loop-free topology is then created. STP and BPDUs help speed up convergence. Convergence is the term used in networking for the amount of time it takes to deal with changes and get the network back up and running. The default BPDU advertisement interval of 2 seconds allows changes to be quickly shared with all the other switches in the network, reducing the duration of any disruption.
There are three kinds of BPDUs:[2]
  • Configuration BPDU, used by the Spanning Tree Protocol to provide information to all switches.
  • TCN (Topology Change Notification), which announces changes in the topology.
  • TCA (Topology Change Acknowledgment), which confirms the reception of the TCN.
By default the BPDUs are sent every 2 seconds.
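As a small illustration of the addressing described above, the following Python sketch classifies a frame by its destination MAC address. The function name and the returned labels are hypothetical; only the two multicast addresses and the 2-second default come from the text.

```python
# Destination multicast addresses mentioned in the text above.
BPDU_DEST_MACS = {
    "01:80:c2:00:00:00": "STP BPDU",
    "01:00:0c:cc:cc:cd": "Per VLAN Spanning Tree BPDU",
}

DEFAULT_HELLO_SECONDS = 2  # default interval between BPDUs


def classify_destination(dest_mac):
    """Map a destination MAC address to the kind of BPDU it carries, if any."""
    # Normalise case and separator so 01-80-C2-00-00-00 and 01:80:c2:00:00:00 match.
    key = dest_mac.strip().lower().replace("-", ":")
    return BPDU_DEST_MACS.get(key, "not a BPDU destination")


if __name__ == "__main__":
    print(classify_destination("01:80:C2:00:00:00"))
    print(classify_destination("01-00-0C-CC-CC-CD"))
    print(classify_destination("ff:ff:ff:ff:ff:ff"))
```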

CEF and load sharing



Load sharing is one of those clumsy areas that is full of confusing parts. In this post we cover its ABCs, and later on we will cover more parts in detail. We chose “CEF and load sharing” as the post name because of the main role that CEF plays in load sharing.
In the IP routing context, the forwarding/switching mechanism that the router uses is the actual controller of the load-sharing process (a data/forwarding plane operation). Having multiple routes in the routing table does not determine how load sharing is actually done: you might be left with poor load sharing, or none at all, even though the routing table holds multiple routes for a certain destination.

The routing protocols are responsible for placing multiple paths in the routing table in the first place (a control plane operation). By default all the IGPs are capable of inserting 4 equal-cost paths, while BGP defaults to only 1 (BGP behaves completely differently from the IGPs; we should be covering load sharing with BGP in detail in a later post). To control the maximum paths allowed per routing protocol we can use the maximum-paths command. (The maximum was 4 in IOS releases earlier than 11.0, 8 with IOS Release 12.0S based software, 16 with IOS Release 12.3T based software, and 32 with IOS Release 12.2S based software.)
NOTE This post is not meant to explain CEF operation, we’ll only be focusing on CEF load-sharing, however we might consider to have a dedicated CEF inside out post later.
The most popular forwarding/switching mechanisms with Cisco routers are; Process switching (performs per-packet load-sharing), fast switching (performs per-destination load-sharing) and CEF (can do both per-packet and per-destination (completely different than fast switching per-destination load-sharing), plus also a new flavor which is per-port load-sharing).
NOTE According to Cisco, IPv4 fast switching is removed with the implementation of the Cisco Express Forwarding infrastructure enhancements for Cisco IOS 12.2(25)S-based releases and Cisco IOS Release 12.4(20)T. For these and later Cisco IOS releases, switching paths are Cisco Express Forwarding switched or process switched. This makes the switching decision easier for future development of software features. Starting with the implementation of the Cisco Express Forwarding enhancements and the removal of IPv4 fast switching, components that do not support Cisco Express Forwarding will work only in process switched mode.
Load-sharing with CEF
For each destination with multiple equal-cost paths (or unequal-cost paths in the case of EIGRP using variance, BGP using the BGP Link Bandwidth feature, or MPLS-TE) the router creates 16 hash buckets, each pointing to one of the available paths.
Load sharing is controlled by the ratio of the number of buckets pointing to each path (outgoing interface). With equal-cost paths the buckets are distributed evenly: two equal-cost paths get 8 buckets each, three get 5 each (yes, one bucket is left unused), four get 4 each, and so on. In unequal-cost scenarios each path is associated with a different number of buckets, according to the load-sharing ratio.
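The equal-cost bucket arithmetic above can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in the text, not the actual CEF code: the function names are made up for this example, and the round-robin order in which buckets are assigned to paths is an assumption.

```python
BUCKETS = 16  # hash buckets per destination, as described in the text


def equal_cost_buckets(n_paths):
    """Return (buckets per path, unused buckets) for n equal-cost paths."""
    per_path = BUCKETS // n_paths
    unused = BUCKETS - per_path * n_paths
    return per_path, unused


def bucket_table(n_paths):
    """Map bucket index -> path index.

    Filling the table round-robin is an assumption made for illustration.
    """
    per_path, _ = equal_cost_buckets(n_paths)
    return [b % n_paths for b in range(per_path * n_paths)]


for n in (2, 3, 4):
    per_path, unused = equal_cost_buckets(n)
    print(f"{n} paths: {per_path} buckets each, {unused} unused")
```

With three paths the table holds only 15 entries, which is the "one bucket is left unused" case mentioned above.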
CEF has three load-sharing options:
  • per-destination (per-session):
I prefer to call it per-session, as it is named in the show ip cef x.x.x.x internal command output, since it is actually based on both the source and the destination IP addresses in the IP packet rather than solely the destination: both are hashed into a 4-bit value that is used to select the outgoing interface. This is the default CEF load-sharing option.
Per-destination load sharing distributes traffic statistically, so it becomes more effective as the number of source/destination pairs increases. With few pairs, one link might end up overloaded while the others are underutilized, for example when a relatively heavy session between one source/destination pair is pinned to that link.
The hash calculation depends on the algorithm used. The original algorithm uses only the source and destination IP addresses to compute a 4-bit hash value, giving 16 possible values, and thus selects one of the 16 buckets, each pointing to one of the outgoing paths. Because every router in the network runs the same algorithm and gets the same results, this introduced a load-sharing hitch called CEF Load-Sharing Polarization (you can see a good example of this in the Cisco Press book “Cisco Express Forwarding”). To circumvent this behavior, the universal algorithm (the default in current IOS versions) adds a 32-bit router-specific value, called the Fixed ID, to the hash function (it can be set manually; a router uses its highest loopback IP address as this value when booting). Seeding the hash function on each router with a unique ID means that the same source/destination pair can hash into a different 4-bit value on different routers along the path, which provides better network-wide load sharing and avoids the polarization issue.
NOTE There is a third algorithm called the tunnel algorithm. I couldn’t find or understand its anatomy, but Cisco states that it is meant to improve load sharing when tunneling techniques such as MPLS, GRE and L2TP are in operation, since tunneling reduces the traffic pattern to a small number of sessions (between the tunnel head and tail ends), which introduces another form of traffic polarization. This algorithm also uses a unique per-router ID to work around the issue. Again, I can’t find more details about this algorithm, but if I do I’ll let you know.
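The polarization effect described for the original algorithm, and the way a per-router seed avoids it, can be sketched with a toy model. Everything here is an assumption for illustration: the real CEF hash function is not SHA-256, and the helper names, addresses, and the router ID value are hypothetical. The only point is that an unseeded hash gives the same 4-bit value on every router, while a seeded one does not.

```python
import hashlib
import ipaddress


def bucket(src, dst, router_id=0):
    """Toy 4-bit hash of (src, dst), optionally seeded with a per-router ID.

    router_id=0 on every router models the original algorithm; a unique
    router_id models the universal algorithm. SHA-256 is a stand-in only.
    """
    data = (ipaddress.ip_address(src).packed
            + ipaddress.ip_address(dst).packed
            + router_id.to_bytes(4, "big"))
    return hashlib.sha256(data).digest()[0] & 0x0F


def pick_path(src, dst, n_paths, router_id=0):
    """Select an outgoing path from the 4-bit bucket value."""
    return bucket(src, dst, router_id) % n_paths


# 1000 hypothetical flows from different sources to one destination
base = ipaddress.ip_address("10.0.0.0")
flows = [(str(base + i), "192.0.2.10") for i in range(1000)]

# Router A (original algorithm) sends these flows over its path 0 to router B
via_a = [f for f in flows if pick_path(*f, 2) == 0]

# Router B, original algorithm: same hash, so every flow picks path 0 again
original_b = {pick_path(*f, 2) for f in via_a}

# Router B, universal algorithm: seeded with its own (hypothetical) ID
universal_b = {pick_path(*f, 2, router_id=0x0A000002) for f in via_a}

print("original :", sorted(original_b))
print("universal:", sorted(universal_b))
```

In the unseeded case router B's second path stays idle for this traffic, which is exactly the polarization hitch; with the seed, both paths are used.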
  • per-packet
Packets are handled in a round-robin fashion, ensuring that the traffic is balanced over multiple links. However, per-packet load sharing is not generally recommended, because it commonly results in out-of-order packets. This hurts TCP throughput (TCP has to put the segments back in order and may retransmit) and can cause data loss for UDP applications (UDP does nothing to restore the order). To make things scarier, firewalls might interpret out-of-order packets as an attack.
The default CEF load sharing mode is per-destination, and we can change this using the ip load-sharing per-packet interface command on the outgoing interfaces involved.
NOTE Since load-sharing decisions are made on the outbound interfaces, the choice between per-packet and per-destination load sharing should be configured on the outbound interfaces.
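To see why per-packet round-robin tends to reorder packets, here is a small sketch that sends one flow over two parallel links with different delays. The delay values, the send interval, and the function name are hypothetical and chosen only to make the effect visible; real links behave less neatly.

```python
def arrival_order(n_packets, link_delays_ms, per_packet, send_interval_ms=1):
    """Return packet sequence numbers in the order the receiver sees them."""
    arrivals = []
    for seq in range(n_packets):
        # Per-packet: rotate across the links.
        # Per-destination: the whole flow stays on link 0.
        link = seq % len(link_delays_ms) if per_packet else 0
        sent_at = seq * send_interval_ms
        arrivals.append((sent_at + link_delays_ms[link], seq))
    return [seq for _, seq in sorted(arrivals)]


delays = [5, 1]  # hypothetical one-way delays of two parallel links, in ms

print(arrival_order(6, delays, per_packet=True))   # [1, 3, 0, 5, 2, 4]
print(arrival_order(6, delays, per_packet=False))  # [0, 1, 2, 3, 4, 5]
```

The per-packet run delivers the packets out of sequence as soon as the two links differ in delay, while the per-destination run keeps the flow in order at the cost of using a single link.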
  • per-port (per-flow)
This option (introduced with IOS release 12.4(11)T) is the best fit for networks with a low number of sources/destinations where the majority of the traffic flows between hosts that use different port numbers, as commonly seen with Real-time Transport Protocol (RTP) streams. It simply adds the layer 4 source port, destination port, or both to the CEF hashing function. This option is enabled via the ip cef load-sharing algorithm include-ports command in global configuration mode.
The most common scenario where this option is the only effective solution is a subnet of hosts NATed to a single IP address, with a router that has multiple paths sitting on the way to their traffic destination. If all the hosts are communicating with a single destination, the per-destination option is obviously useless, since there is always a single source/destination pair. Including the layer 4 ports in the hashing function restores the load sharing in this case.
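The NAT scenario above can be sketched with the same kind of toy hash. As before, this is an assumption-laden illustration: SHA-256 stands in for the real CEF hash, and the addresses, port numbers, and function name are hypothetical.

```python
import hashlib
import ipaddress


def bucket(src, dst, sport=None, dport=None):
    """Toy 4-bit hash; the layer 4 ports are included only when given."""
    data = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    if sport is not None and dport is not None:
        data += sport.to_bytes(2, "big") + dport.to_bytes(2, "big")
    return hashlib.sha256(data).digest()[0] & 0x0F


# 200 sessions from hosts hidden behind one NAT address to a single server
nat_ip, server = "203.0.113.1", "198.51.100.10"
sessions = [(nat_ip, server, 20000 + i, 443) for i in range(200)]

without_ports = {bucket(s, d) for s, d, _, _ in sessions}
with_ports = {bucket(s, d, sp, dp) for s, d, sp, dp in sessions}

print("buckets used without ports:", len(without_ports))
print("buckets used with ports   :", len(with_ports))
```

Without the ports every session lands in the same bucket, so only one path is ever used; with the ports in the hash the sessions spread over many buckets and therefore over the available paths.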
I hope that I’ve been informative.