External BGP Overview and Configuration

Module Introduction

Before you start the hands-on part of this module, you should load the appropriate configuration and verify that the testbed is up and running by executing the corresponding robot file:

student@tour:~/trainings_resources/robot$ robot bgp_ebgp/bgp_ebgp_setup.robot

BGP Overview

The Border Gateway Protocol (BGP) was originally designed to interconnect networks rather than nodes. These networks are called autonomous systems (AS) and are usually under different administrative domains. An important goal of BGP is to provide loop-free paths to destinations, which are not necessarily the shortest paths. The reason is that BGP connects autonomous systems, so the protocol has only limited ability to influence decisions made in another AS. In BGP, policy matters more than efficiency. The current version, BGPv4, is defined in RFC 4271.

Two nodes exchanging BGP route information are called BGP peers, BGP speakers, or BGP neighbors:

  • BGP peers which are in the same autonomous system use internal BGP

  • BGP peers in different autonomous systems use external BGP

BGP does not provide any sort of reliable transport itself, but relies on TCP to carry information between BGP peers using well-known TCP port number 179.

Because BGP uses TCP as transport protocol, BGP peers do not need to be directly connected.

AS numbers are officially assigned by a Regional Internet Registry (RIR), e.g., ARIN, RIPE NCC, or APNIC. Historically, BGP AS numbers were 16-bit numbers, with ASNs 64512-65534 reserved for private use. Support for 32-bit ASNs was added later (the current specification, RFC 6793, dates from 2012), with ASNs 4200000000-4294967294 reserved for private use.
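
The private-use ranges can be checked programmatically. The following is a minimal Python sketch, not part of RBFS; the function names are made up for illustration, and the ranges are the ones quoted above.

```python
# Private-use ASN ranges as quoted in the text (inclusive bounds).
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private-use range
    (4200000000, 4294967294),  # 32-bit private-use range
)


def is_private_asn(asn: int) -> bool:
    """Return True if asn lies in one of the private-use ranges."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


def is_four_byte_asn(asn: int) -> bool:
    """Return True if asn does not fit into 16 bits."""
    return asn > 0xFFFF


if __name__ == "__main__":
    # 65001 is the ASN used for R1 in this module.
    for asn in (65001, 64511, 4200000001):
        print(asn, is_private_asn(asn), is_four_byte_asn(asn))
```

All AS numbers used in this lab (65001 and the neighboring ASNs) fall into the 16-bit private-use range.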

Configuring Local BGP Settings

Before configuring the BGP peering itself, note that RBFS uses two daemons for handling BGP-related processing:

  • bgp.iod

  • bgp.appd

To improve scalability, multiple instances of these daemons can run in parallel and be bound to specific instances or address-families. To begin with, we will configure one instance of each daemon and bind it to all routing instances and address-families:

cfg> set daemon-options * * * bgp.appd bd-name bgp.appd.1
cfg> set daemon-options * * * bgp.iod bd-name bgp.iod.1
cfg> commit

Next, we will configure a few parameters that define the local BGP speaker:

  • BGP Hostname

  • BGP AS number

  • Router ID

  • Address Families (e.g., IPv4 unicast, IPv6 unicast, etc.)

  • BGP Timers (holdtime, keepalive time)

All of these options are configured using the set instance <instance> protocol bgp hierarchy.

Exercise 1: BGP Node Configuration

Configure BGP in instance default with ASN 65001 and appropriate values for hostname and router ID. BGP should support IPv4 Unicast only. Also configure keepalive time of 10sec and holdtime of 30s.

Click to reveal the answer
cfg> set daemon-options * * * bgp.appd bd-name bgp.appd.1
cfg> set daemon-options * * * bgp.iod bd-name bgp.iod.1
cfg> set instance default protocol bgp hostname R1
cfg> set instance default protocol bgp local-as 65001
cfg> set instance default protocol bgp router-id 192.168.0.1
cfg> set instance default protocol bgp timer hold-time 30
cfg> set instance default protocol bgp timer keepalive 10
cfg> set instance default protocol bgp address-family ipv4 unicast
cfg> commit

The correct setting can be verified using the command

cfg> show bgp summary
Instance: default
  General information
    Hostname: R1, Domain name:
    Local AS: 65001, Version: 4
    Local preference: 100, eBGP Protocol preference: 20, iBGP Protocol preference: 200
    Router ID: 192.168.0.1, Cluster ID: 192.168.0.1
  Capabilities
    Route refresh: True, AS4: True, Graceful restart: False
  Best route selection
    Always compare MED: False, Ignore as path: False
    Ignore local preference: False, Ignore origin: False
    Ignore MED: False, Ignore route source: False
    Ignore router ID: False, Ignore uptime: True
    Ignore cluster length: False, Ignore peer IP: False
    Route select parameter: 0
  Timers
    Connect retry: 30s, Keepalive: 10s, Holdtime: 30s
  Statistics
    Peers configured: 0, Peers auto discovery: 0
      Peers in idle          : 0
      Peers in connect       : 0
      Peers in active        : 0
      Peers in opensent      : 0
      Peers in openconfirm   : 0
      Peers in established   : 0

Configuring BGP Peering

Now that we have configured some local parameters, we want to set up external BGP peerings to neighboring routers. We use the following setup:

Figure 1. External BGP Lab Setup

The BGP peers must manually be defined as there is no auto-detection of neighbors.

As different autonomous systems usually belong to different organizations, there is no IGP running between them. This implies that the BGP peering configuration most often uses the IP addresses of the interface between the BGP peers.
By default, external BGP messages are sent with a TTL of 1, i.e., the BGP peers must be directly connected. If you want to set up a BGP peering between nodes that are not directly connected or between two loopback interfaces, you need to modify the TTL setting with the multihop option.

BGP has many different attributes that can be associated with a BGP peer, e.g.

  • AS number of (remote) peer

  • Authentication

  • Supported Address Families

  • Policies to filter and modify BGP routes

  • Nexthop settings

  • Prefix limits

Depending on your peering rules, many of these settings are shared by multiple BGP peers. Thus, BGP defines the concept of BGP peer groups: a peer group is a template of BGP settings that is defined once and then applied to specific BGP peers. BGP peer groups are defined in the set instance <instance> protocol bgp peer-group <pgname> hierarchy.
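
The template idea can be illustrated with a small Python sketch. This is only a conceptual model and not how RBFS stores its configuration; the class names, the group name EXAMPLE, the ASN 64999, and the 192.0.2.x addresses are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class PeerGroup:
    """Settings defined once and shared by all member peers."""
    remote_as: int
    address_families: Tuple[str, ...]


@dataclass(frozen=True)
class Peer:
    """A peer only carries what is specific to it, plus a group reference."""
    address: str
    peer_group: str


def effective_settings(peer: Peer, groups: Dict[str, PeerGroup]) -> dict:
    """Combine peer-specific data with the settings of its peer group."""
    group = groups[peer.peer_group]
    return {
        "peer": peer.address,
        "remote_as": group.remote_as,
        "address_families": list(group.address_families),
    }


# Hypothetical example data: two peers sharing one template.
GROUPS = {"EXAMPLE": PeerGroup(remote_as=64999, address_families=("ipv4-unicast",))}
PEERS = [Peer("192.0.2.1", "EXAMPLE"), Peer("192.0.2.5", "EXAMPLE")]

if __name__ == "__main__":
    for peer in PEERS:
        print(effective_settings(peer, GROUPS))
```

Changing the remote AS or the address families in the group changes them for every member peer, which is the point of the template.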

Exercise 2: eBGP Peer Configuration

Configure TCP Authentication under set instance default tcp authentication with key-id 1, password rtbrick and type HMAC-SHA-256-128.

Afterwards configure two peer groups and two peers according to the following table:

Peer  Peer Group  Peer Address  Peer AS  Address Family
R2    ISP2        172.16.0.2    65002    IPv4 Unicast
R3    ISP3        172.16.0.6    65003    IPv4 Unicast

Click to reveal the answer
cfg> set instance default tcp authentication BGP type HMAC-SHA-256-128
cfg> set instance default tcp authentication BGP key1-id 1
cfg> set instance default tcp authentication BGP key1-plain-text rtbrick
cfg> set instance default protocol bgp peer ipv4 172.16.0.2 172.16.0.1 authentication-id BGP
cfg> set instance default protocol bgp peer ipv4 172.16.0.2 172.16.0.1 peer-group ISP2
cfg> set instance default protocol bgp peer ipv4 172.16.0.6 172.16.0.5 authentication-id BGP
cfg> set instance default protocol bgp peer ipv4 172.16.0.6 172.16.0.5 peer-group ISP3
cfg> set instance default protocol bgp peer-group ISP2 remote-as 65002
cfg> set instance default protocol bgp peer-group ISP2 address-family ipv4 unicast
cfg> set instance default protocol bgp peer-group ISP3 remote-as 65003
cfg> set instance default protocol bgp peer-group ISP3 address-family ipv4 unicast
cfg> commit

If everything is configured correctly, you should see two BGP peers:

cfg> show bgp peer
Instance: default
  Peer            Remote AS    State         Up/Down Time           PfxRcvd     PfxSent
  R2              65002        Established   0d:20h:00m:09s         24          24
  192.168.0.3     65003        Established   0d:20h:00m:25s         24          24

Exercise 3: BGP Peer Details

You have successfully set up two BGP sessions. Use the show bgp peer command to get some more details on the BGP session to R3.

  • What are the peer's keepalive and holddown timers set to? Which values are used on the session?

  • What BGP address families does R3 support?

Click to reveal the answer
cfg> show bgp peer R3
Peer: R3, Peer IP: 172.16.0.6, Remote AS: 65003, Local: 172.16.0.5, Local AS: 65001, Any AS: False
  Type: ebgp, State: Established, Up/Down Time:
  Discovered on interface: -
  Last transition: Wed Mar 29 11:16:13 GMT +0000 2023, Flap count: 0
  Peer ID        : 192.168.0.3, Local ID  : 192.168.0.1
  Instance       : default, Peer group: ISP3
  6PE enabled    : False
  Timer values:
    Peer keepalive : 30s, Local keepalive: 10s       (1)
    Peer holddown  : 90s, Local holddown : 30s       (2)
    Connect retry  : 30s
  Timers:
    Connect retry timer : 0s
    keepalive timer     : expires in 734739us
    Holddown timer      : expires in 27s 283608us
  NLRIs:
    Sent           : ['ipv4-unicast']
    Received       : ['ipv4-unicast', 'ipv6-unicast']
    Negotiated     : ['ipv4-unicast']
  <...>
1 The local keepalive timer is configured to be 10sec, while the peer keepalive timer is 30sec.
2 The local holddown timer is configured to be 30sec, while R3 has the holddown timer set to 90sec.

In case of a mismatch the lower value is used on the session. You can verify this in the Timers section: the value after expires in will never be more than 30sec.

From the NLRI section, you can see that R3 supports IPv4 Unicast and IPv6 Unicast. While R1 is only configured for IPv4 Unicast, both peers agree on the address-families they both support.
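
The two negotiation rules described in this answer (lower timer value wins, address families are intersected) can be sketched in a few lines of Python. This is an illustration of the rules as stated here, not RBFS code; the function names are assumptions.

```python
from typing import List


def negotiate_timer(local_seconds: int, peer_seconds: int) -> int:
    """On a mismatch, the lower of the two timer values is used."""
    return min(local_seconds, peer_seconds)


def negotiate_families(sent: List[str], received: List[str]) -> List[str]:
    """Keep only the address families that both peers announced."""
    received_set = set(received)
    return [family for family in sent if family in received_set]


if __name__ == "__main__":
    # Values taken from the show bgp peer R3 output above.
    print("holddown :", negotiate_timer(30, 90))
    print("keepalive:", negotiate_timer(10, 30))
    print("families :", negotiate_families(
        ["ipv4-unicast"], ["ipv4-unicast", "ipv6-unicast"]))
```

With the values from the output, the session runs with a 30s holddown timer, a 10s keepalive timer, and only ipv4-unicast negotiated.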

In BGP terminology a route is called a Network Layer Reachability Information (NLRI).

BGP Operation

BGP Route Tables

Now that we have established some BGP peerings, we need to understand how BGP peers exchange route update messages and how they are processed.

Figure 2. BGP Route Processing in RBFS

BGP UPDATE messages received from a BGP neighbor are stored in a table called the RIB-IN. In BDS, the RIB-IN table name has the format <instance_name>.bgp.rib-in.<afi>.<safi>.<peer_ip>.<local_ip>, e.g., the IPv4 route updates received from R2 are stored in default.bgp.rib-in.ipv4.unicast.172.16.0.2.172.16.0.1. These tables are populated by bgp.iod. The command show bgp rib-in can be used to view the content of the table:

cfg> show bgp rib-in ipv4 unicast peer R2
Instance: default, AFI: ipv4, SAFI: unicast
 Hostname: R2, Peer IP: 172.16.0.2, Source IP: 172.16.0.1, Received routes: 24
    Prefix                    Next Hop         MED     Local Preference  AS Path    Status
    172.16.102.0/27           172.16.0.2       0        -                 65002      Valid
    172.16.103.0/27           172.16.0.2       0        -                 65002, 65003 Valid
    172.16.104.0/27           172.16.0.2       0        -                 65002, 65004 Valid
    172.16.102.32/27          172.16.0.2       0        -                 65002      Valid
    172.16.103.32/29          172.16.0.2       0        -                 65002, 65003 Valid
    172.16.104.32/27          172.16.0.2       0        -                 65002, 65004 Valid
    <...>

The routes are then passed from bgp.iod to bgp.appd, which performs the BGP best path selection based on standard BGP rules. The results are stored in the BGP FIB. The corresponding BDS table is named <instance_name>.bgp.1.fib-local.<afi>.<safi>, e.g., the IPv4 BGP FIB is stored in default.bgp.1.fib-local.ipv4.unicast. In this table, all BGP route information from all peers is consolidated. The content of the BGP FIB can be retrieved with the show bgp fib command:

cfg> show bgp fib ipv4 unicast
Instance: default, AFI: ipv4, SAFI: unicast
  Prefix                      Preference      Out Label            Next Hop
  172.16.102.0/27             20              -                    172.16.0.2
  172.16.102.32/27            20              -                    172.16.0.2
  172.16.102.64/29            20              -                    172.16.0.2
  172.16.102.72/29            20              -                    172.16.0.2
  172.16.102.112/28           20              -                    172.16.0.2
  <...>

As BGP is not the only routing protocol running on the system, the routing information base daemon (ribd) is responsible for evaluating all the route sources (e.g., BGP, static routes, direct routes, ...) and choosing the best path, which will then be installed into the RIB and used to forward traffic. The show route command is used to retrieve the information stored in the RIB.

On the other hand, bgp.appd provides the BGP FIB for route update advertisements to BGP peers. These route updates are stored in the BGP RIB-OUT. Apart from the nexthop information, all BGP peers which belong to the same peer-group will get identical BGP updates. Therefore, the BGP RIB-OUT is stored in BDS on a per peer-group basis, namely in <instance_name>.bgp.1.peer-group.<pgname>.<afi>.<safi>, e.g., default.bgp.1.peer-group.ISP2.ipv4.unicast. The show bgp rib-out command is used to view the RIB-OUT table:

cfg> show bgp rib-out ipv4 unicast peer R2
Instance: default, AFI: ipv4, SAFI: unicast
  Peer: R2, Sent routes: 24
    Prefix                MED     Local Preference  Origin        Next Hop       AS Path
    172.16.102.0/27       0       -                 Incomplete    -              65001, 65002
    172.16.103.0/27       0       -                 Incomplete    -              65001, 65003
    172.16.104.0/27       0       -                 Incomplete    -              65001, 65002, 65004
    172.16.102.32/27      0       -                 Incomplete    -              65001, 65002
    172.16.103.32/29      0       -                 Incomplete    -              65001, 65003
    172.16.104.32/27      0       -                 Incomplete    -              65001, 65002, 65004
    <...>
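
The three BDS table naming patterns described in this section can be summarized in a short Python sketch. The helper names are made up for illustration, and the literal "1" in the FIB and RIB-OUT names is simply copied from the examples above rather than derived from any documented rule.

```python
def rib_in_table(instance: str, afi: str, safi: str,
                 peer_ip: str, local_ip: str) -> str:
    """RIB-IN: one table per peer session and address family."""
    return f"{instance}.bgp.rib-in.{afi}.{safi}.{peer_ip}.{local_ip}"


def fib_local_table(instance: str, afi: str, safi: str) -> str:
    """BGP FIB: one table per address family, consolidating all peers."""
    return f"{instance}.bgp.1.fib-local.{afi}.{safi}"


def rib_out_table(instance: str, peer_group: str, afi: str, safi: str) -> str:
    """RIB-OUT: one table per peer group and address family."""
    return f"{instance}.bgp.1.peer-group.{peer_group}.{afi}.{safi}"


if __name__ == "__main__":
    print(rib_in_table("default", "ipv4", "unicast", "172.16.0.2", "172.16.0.1"))
    print(fib_local_table("default", "ipv4", "unicast"))
    print(rib_out_table("default", "ISP2", "ipv4", "unicast"))
```

The output reproduces the three example table names given in the text.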

Exercise 4: BGP RIB-IN

You have received the route 192.168.0.104/32 from both of your peers. Check how the route received from R2 differs from the route received from R3. Which of those routes is selected as best path?

Click to reveal the answer
cfg> show bgp rib-in ipv4 unicast peer R2 192.168.0.104/32
Instance: default, AFI: ipv4, SAFI: unicast
  Peer: R2, Received routes: 1
    192.168.0.104/32, Received path ID: 0, Next hop: 172.16.0.2
      Status: Valid
      Protocol source: bgp, Send path ID: 130982003
      AS path: 65002, 65004                (1)
      MED: 0, Local preference: -
      Community: ['104:1', '102:2']        (2)
      Extended community: -
      Large community: -
      Originator ID: -
      Cluster list: -
      Label: -, Last update: 0d:20h:40m:06s
cfg> show bgp rib-in ipv4 unicast peer R3 192.168.0.104/32
Instance: default, AFI: ipv4, SAFI: unicast
  Peer: R3, Received routes: 1
    192.168.0.104/32, Received path ID: 0, Next hop: 172.16.0.6
      Status: Valid
      Protocol source: bgp, Send path ID: 198092479
      AS path: 65003, 65004                (1)
      MED: 0, Local preference: -
      Community: ['103:3', '104:1']        (2)
      Extended community: -
      Large community: -
      Originator ID: -
      Cluster list: -
      Label: -, Last update: 0d:00h:24m:58s

Comparing the attributes shown in the above output, the main differences are

1 the AS path ("65002, 65004" vs. "65003, 65004") and
2 the communities.

From this output we learn that the route originates in AS65004, which is connected to both AS65002 and AS65003.

The output of show bgp fib shows that the route from R2 is preferred.

cfg> show bgp fib ipv4 unicast 192.168.0.104/32
Instance: default, AFI: ipv4, SAFI: unicast
  Prefix: 192.168.0.104/32
    Next hop key: 784ac3ae40014c2e2ebf604130c8aec71ed35515f5aa7499
    Peer: None, Peer domain: None
    Route source: bgp, Send path ID: 130982003, Received path ID: None, Path hash: None
    As path: 65002, 65004, Originator ID: None, Origin: Incomplete
    Community: ['104:1', '102:2']
    Extended community: None
    Large community: None
    Cluster list: None
    IGP metric: 4294967295, Local preference: None, Multi exit discriminator: 0
    Preference: 20, External route: None, Readvertised route: None
    Route up: None
    Next hop:
      172.16.0.2, Label -

By default, BGP only selects a single best path and installs it into the BGP FIB.

If you want to use load sharing, you must enable the multipath option. The BGP multipath option is configured under the set instance <instance> protocol bgp address-family <afi> <safi> hierarchy, i.e., it can be set on a per address-family basis.

Exercise 5: BGP Multipath

Set the BGP multipath option to 4 for IPv4 Unicast. What has changed for 192.168.0.104/32 in the BGP FIB?

Click to reveal the answer
cfg> set instance default protocol bgp address-family ipv4 unicast multipath 4
cfg> commit
cfg> show bgp fib ipv4 unicast 192.168.0.104/32
Instance: default, AFI: ipv4, SAFI: unicast
  Prefix: 192.168.0.104/32
    Next hop key: 82ec1322c525cdfa8d5877afc1392b7275b94810cfcbd52b
    Peer: None, Peer domain: None
    Route source: bgp, Send path ID: 130982003, Received path ID: None, Path hash: None
    As path: 65002, 65004, Originator ID: None, Origin: Incomplete
    Community: ['104:1', '102:2']
    Extended community: None
    Large community: None
    Cluster list: None
    IGP metric: 4294967295, Local preference: None, Multi exit discriminator: 0
    Preference: 20, External route: None, Readvertised route: None
    Route up: None
    Next hop:
      172.16.0.6, Label -
    Next hop:
      172.16.0.2, Label -

Now you have two next hops because both routes are equally good.

Route Selection Process

We have already mentioned that BGP is designed to interconnect autonomous systems and that its primary focus is on selecting loop-free paths and on implementing policies, rather than on efficiency. In contrast to other routing protocols that rely on a single metric, BGP supports multiple path attributes that are assigned to a prefix and can be used for decision making. The path attributes are encoded in a TLV (type, length, value) format, which allows new path attributes to be added if needed. The attribute flags transmitted together with the type code also tell a peer what to do with a path attribute it does not recognize. For a complete list of BGP attributes, see IANA BGP Parameters.

The most important BGP attributes are the AS_PATH attribute and the nexthop attribute. The AS_PATH attribute identifies the autonomous systems through which routing information has passed. Whenever a route is advertised between external BGP peers, the AS_PATH attribute is updated. The AS_PATH is also used for loop prevention as a BGP peer does not accept updates where its own AS number is already in the AS_PATH attribute.
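
The two roles of the AS_PATH attribute described above (recording the traversed autonomous systems and preventing loops) can be sketched as follows. This is a simplified Python illustration with hypothetical function names; it treats the AS_PATH as a flat list of AS numbers.

```python
from typing import List


def advertise_to_ebgp_peer(as_path: List[int], local_as: int) -> List[int]:
    """When advertising to an external peer, prepend the local AS number."""
    return [local_as, *as_path]


def accept_update(as_path: List[int], local_as: int) -> bool:
    """Loop prevention: reject updates that already contain the local AS."""
    return local_as not in as_path


if __name__ == "__main__":
    received = [65002, 65004]            # path as received by R1 from R2
    print(accept_update(received, 65001))
    print(advertise_to_ebgp_peer(received, 65001))
    # If the route came back to AS65001, it would be rejected:
    print(accept_update([65003, 65001, 65002, 65004], 65001))
```

The prepended path matches what the RIB-OUT output in this module shows for routes that R1 re-advertises, e.g., "65001, 65002, 65004".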

As BGP has several path attributes that can serve as metric, we need to know in which order they are evaluated:

  1. Route Source: Always prefer locally originated routes over received routes. Note, this step is rarely used in the decision process.

  2. Local Preference: The BGP path with the highest local preference is chosen. As the name implies, the local preference determines the preferred exit point from the local AS point of view.

  3. AS_PATH length: The BGP path with shortest AS_PATH length is preferred, i.e., the path which passes the smallest number of autonomous systems.

  4. Origin: Prefer the path with lowest origin code (IGP < EGP < INCOMPLETE). Note, this step is rarely used in the decision process.

  5. Multiexit Discriminator (MED): The path with lowest MED value is preferred. If there is no MED, then it is assumed to be 0.

  6. Route Type: eBGP is always preferred over iBGP.

  7. IGP metric: The path with lowest IGP metric to the BGP nexthop is preferred.

  8. etc.

If multiple paths are equal up to this point and the multipath option is enabled, then multiple nexthops will be installed into the FIB.
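
The first seven steps and the multipath rule can be modeled as a sort key. The following Python sketch is a simplification under several assumptions: the class and function names are hypothetical, the default local preference of 100 is taken from the show bgp summary output earlier in this module, MED is compared unconditionally, and the further tie-breakers of step 8 are not modeled (paths that tie simply keep their input order).

```python
from dataclasses import dataclass
from typing import List, Tuple

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Path:
    next_hop: str
    as_path: Tuple[int, ...]
    local_origin: bool = False   # step 1: locally originated route
    local_pref: int = 100        # step 2: higher is better
    origin: str = "incomplete"   # step 4: igp < egp < incomplete
    med: int = 0                 # step 5: lower is better, missing MED = 0
    ebgp: bool = True            # step 6: eBGP preferred over iBGP
    igp_metric: int = 0          # step 7: lower is better


def preference_key(path: Path) -> tuple:
    """Smaller key = more preferred; one tuple element per step."""
    return (
        not path.local_origin,
        -path.local_pref,
        len(path.as_path),
        ORIGIN_RANK[path.origin],
        path.med,
        not path.ebgp,
        path.igp_metric,
    )


def select_next_hops(paths: List[Path], multipath: int = 1) -> List[str]:
    """Return the next hops installed for one prefix."""
    if not paths:
        return []
    ranked = sorted(paths, key=preference_key)  # stable sort keeps input order on ties
    best_key = preference_key(ranked[0])
    equal = [p for p in ranked if preference_key(p) == best_key]
    return [p.next_hop for p in equal[:max(1, multipath)]]


if __name__ == "__main__":
    via_r2 = Path("172.16.0.2", (65002, 65004))
    via_r3 = Path("172.16.0.6", (65003, 65004))
    print(select_next_hops([via_r2, via_r3]))
    print(select_next_hops([via_r2, via_r3], multipath=4))
```

For 192.168.0.104/32, both paths tie on all seven steps, so a multipath value of 4 yields both next hops, whereas the default yields a single one. Which single path a real implementation picks in that case depends on the tie-breakers not modeled here.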

From these rules it is clear that the local preference value is the best way to influence traffic leaving the autonomous system, while the AS_PATH and MED values are the best options to influence traffic entering the autonomous system. These values can be changed using routing policies, which will be discussed in Module Policies.

BGP Redistribution

Unlike IGPs, BGP is not enabled on any interfaces. If we look at the output of the show bgp rib-out command, we notice that R1 only advertises routes to its peers that it previously learned from these peers. As a consequence, no destination within AS65001 is reachable from the outside so far.

The process of injecting non-BGP routes into the BGP tables is called redistribution, just like in IGPs. In order to make routes reachable from outside our AS, we need to redistribute direct routes, static routes, or routes from other dynamic protocols into BGP using the set instance <instance_name> protocol bgp address-family <afi> <safi> redistribute ... syntax.

Exercise 6: Redistribution

Configure R1 to redistribute direct and static routes into BGP. Verify that the loopback address is advertised to the neighboring peers.

Click to reveal the answer
cfg> set instance default protocol bgp address-family ipv4 unicast redistribute static
cfg> set instance default protocol bgp address-family ipv4 unicast redistribute direct
cfg> commit
cfg> show bgp rib-out ipv4 unicast peer R2 192.168.0.1/32
Instance: default, AFI: ipv4, SAFI: unicast
  Peer: R2, Sent routes: 1
    Prefix: 192.168.0.1/32, RD: None, Send path ID: 0, Next hop: None
      Peer: -, Peer domain: -, Route source: -, Received path ID: None, Path hash: 04f28fc9becac4e972dd395cf88430af0f89b3f511221b56
      AS path: 65001, Originator ID: None, Origin: Incomplete
      Community: None, Extended community: None, Large community: None
      Cluster list: None
      IGP metric: None, Local preference: None, Multi exit discriminator: 50
      Preference: None, External route: None, Readvertised route: None
      Label: -, Last update: 0d:00h:01m:38s

As a rule, you want to hide the details of your autonomous system from the outside world. Best practice is to generate a couple of summary routes by configuring static routes with nexthop pointing to null and only redistribute these summaries into BGP.
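
To decide which summary route to configure, it helps to compute the smallest prefix that covers a set of internal prefixes. The following Python sketch uses the standard ipaddress module; the function name and the 10.1.x.x prefixes are hypothetical and not part of the lab setup.

```python
import ipaddress
from typing import Iterable


def covering_summary(prefixes: Iterable[str]):
    """Return the smallest single network that contains all given prefixes.

    All prefixes must belong to the same address family.
    """
    networks = [ipaddress.ip_network(p) for p in prefixes]
    if not networks:
        raise ValueError("at least one prefix is required")
    summary = networks[0]
    # Widen the candidate by one bit at a time until it covers everything.
    while not all(n.subnet_of(summary) for n in networks):
        summary = summary.supernet()
    return summary


if __name__ == "__main__":
    internal = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/25"]  # hypothetical
    print(covering_summary(internal))
```

Note that such a summary may also cover address space that is not actually in use, which is one reason the static summary routes point to null.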

BGP IPv6 Support

In its original specification, BGP only supported IPv4 route exchange. RFC 2283 (since superseded by RFC 4760) introduced a concept called multiprotocol extensions, which enables BGP to carry routing information for other network protocols as well. For this purpose, new path attributes for multiprotocol NLRI (MP_REACH_NLRI and MP_UNREACH_NLRI) were introduced that also support nexthops other than IPv4 nexthops.

If you already have a BGP session exchanging IPv4 route information, there is no need for a second BGP session. You can just add another address-family.

Although the nexthop information for IPv6 NLRI must be an IPv6 address, the BGP control plane session can still be based on IPv4.

Exercise 7: Configuring IPv6 Support

Configure the peer-group ISP3 to support IPv6 unicast and redistribute direct and static routes into BGP. Verify that the IPv6 loopback address is advertised to R3. Also verify that you can send ICMP ECHO requests to fc00:c0a8::192:168:0:103.

Click to reveal the answer
cfg> set instance default protocol bgp address-family ipv6 unicast
cfg> set instance default protocol bgp address-family ipv6 unicast redistribute direct
cfg> set instance default protocol bgp address-family ipv6 unicast redistribute static
cfg> set instance default protocol bgp peer-group ISP3 address-family ipv6 unicast
cfg> commit
cfg> ping fc00:c0a8::192:168:0:103 source-ip fc00:c0a8::192:168:0:1 count 3
68 bytes from fc00:c0a8::192:168:0:103: icmp_seq=1 ttl=64 time=7.6699 ms
68 bytes from fc00:c0a8::192:168:0:103: icmp_seq=2 ttl=64 time=9.8181 ms
68 bytes from fc00:c0a8::192:168:0:103: icmp_seq=3 ttl=64 time=2.3385 ms
Statistics: 3 sent, 3 received, 0% packet loss

Summary

If you have completed the exercises, you can check the results by executing

student@tour:~/trainings_resources/robot$ robot bgp_ebgp/bgp_ebgp_verify.robot