External BGP Overview and Configuration
Module Introduction
Before you start the hands-on part of this module, you should load the appropriate configuration and verify that the testbed is up and running by executing the corresponding robot file:
student@tour:~/trainings_resources/robot$ robot bgp_ebgp/bgp_ebgp_setup.robot
BGP Overview
The Border Gateway Protocol (BGP) was originally designed to interconnect networks rather than individual nodes. These networks are called autonomous systems (AS) and are usually under different administrative domains. An important goal of BGP is to provide loop-free paths to destinations, which are not necessarily the shortest paths. The reason is that BGP connects autonomous systems, so the protocol has only a limited ability to influence decisions made in another AS. In BGP, policies matter more than efficiency. The current version, BGPv4, is defined in RFC4271.
Two nodes exchanging BGP route information are called BGP peers or BGP speakers or BGP neighbors:
- BGP peers which are in the same autonomous system use internal BGP
- BGP peers in different autonomous systems use external BGP
BGP does not provide any sort of reliable transport itself, but relies on TCP to carry information between BGP peers using well-known TCP port number 179.
Because BGP uses TCP as its transport protocol, BGP peers do not need to be directly connected.
AS numbers are officially assigned by a Regional Internet Registry (RIR), e.g., ARIN, RIPE, or APNIC. Historically, BGP AS numbers were 16-bit numbers, with ASN 64512-65534 reserved for private use. In 2012, support for 32-bit ASNs was defined, with ASN 4200000000-4294967294 reserved for private use.
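The private-use ranges quoted above can be checked with a few lines of Python. This is only an illustrative sketch; the function name `is_private_asn` is made up for this example and is not part of RBFS.

```python
# Private-use ASN ranges as quoted in the text above.
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private-use range
    (4200000000, 4294967294),  # 32-bit private-use range
)


def is_private_asn(asn: int) -> bool:
    """Return True if the AS number lies in one of the private-use ranges."""
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


if __name__ == "__main__":
    # 65001-65003 are the AS numbers used in this module's lab setup.
    for asn in (65001, 65002, 65003, 64511, 4200000000):
        label = "private" if is_private_asn(asn) else "not private"
        print(asn, label)
```

All AS numbers used in this module (65001 to 65004) fall into the 16-bit private-use range, which is why they are safe to use in a lab.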
Configuring Local BGP Setting
Before configuring the BGP peering itself, note that RBFS uses two daemons for handling BGP-related processing:
- bgp.iod
- bgp.appd
To improve scalability, multiple instances of these daemons can run in parallel and be bound to specific instances or address-families. To begin with, we will configure one instance of each daemon and bind it to all routing instances and address-families:
cfg> set daemon-options * * * bgp.appd bd-name bgp.appd.1
cfg> set daemon-options * * * bgp.iod bd-name bgp.iod.1
cfg> commit
Next, we will configure a few parameters that define the local BGP speaker:
- BGP Hostname
- BGP AS number
- Router ID
- Address Families (e.g., IPv4 unicast, IPv6 unicast, etc.)
- BGP Timers (holdtime, keepalive time)
All of these options are configured using the set instance <instance> protocol bgp hierarchy.
Configure BGP in instance default with ASN 65001 and appropriate values for hostname and router ID. BGP should support IPv4 Unicast only. Also configure a keepalive time of 10s and a holdtime of 30s.
Answer:
cfg> set daemon-options * * * bgp.appd bd-name bgp.appd.1
cfg> set daemon-options * * * bgp.iod bd-name bgp.iod.1
cfg> set instance default protocol bgp hostname R1
cfg> set instance default protocol bgp local-as 65001
cfg> set instance default protocol bgp router-id 192.168.0.1
cfg> set instance default protocol bgp timer hold-time 30
cfg> set instance default protocol bgp timer keepalive 10
cfg> set instance default protocol bgp address-family ipv4 unicast
cfg> commit
The settings can be verified using the following command:
cfg> show bgp summary
Instance: default
General information
Hostname: R1, Domain name:
Local AS: 65001, Version: 4
Local preference: 100, eBGP Protocol preference: 20, iBGP Protocol preference: 200
Router ID: 192.168.0.1, Cluster ID: 192.168.0.1
Capabilities
Route refresh: True, AS4: True, Graceful restart: False
Best route selection
Always compare MED: False, Ignore as path: False
Ignore local preference: False, Ignore origin: False
Ignore MED: False, Ignore route source: False
Ignore router ID: False, Ignore uptime: True
Ignore cluster length: False, Ignore peer IP: False
Route select parameter: 0
Timers
Connect retry: 30s, Keepalive: 10s, Holdtime: 30s
Statistics
Peers configured: 0, Peers auto discovery: 0
Peers in idle : 0
Peers in connect : 0
Peers in active : 0
Peers in opensent : 0
Peers in openconfirm : 0
Peers in established : 0
Configuring BGP Peering
Now that we have configured some local parameters, we want to set up external BGP peerings to neighboring routers. We use the following setup:
The BGP peers must be defined manually, as there is no auto-detection of neighbors.
As different autonomous systems usually belong to different organizations, there is no IGP running between them. This implies that the BGP peering configuration most often uses the IP addresses of the interface between the BGP peers.
By default, external BGP messages are sent with a TTL of 1, i.e., the BGP peers must be directly connected. If you want to set up a BGP peering between nodes that are not directly connected or between two loopback interfaces, you need to modify the TTL setting with the multihop option.
BGP has many different attributes that can be associated with a BGP peer, e.g.:
- AS number of (remote) peer
- Authentication
- Supported Address Families
- Policies to filter and modify BGP routes
- Nexthop settings
- Prefix limits
Depending on your peering rules, many of these settings are shared by multiple BGP peers. Thus, BGP defines the concept of BGP Peer Groups: a peer group is a template of BGP settings that is defined once and then applied to specific BGP peers. The BGP peer groups are defined in the set instance <instance> protocol bgp peer-group <pgname> hierarchy.
Configure TCP Authentication under set instance default tcp authentication with key-id 1, password rtbrick and type HMAC-SHA-256-128.
Afterwards configure two peer groups and two peers according to the following table:
Peer | Peer Group | Peer Address | Peer AS | Address Family |
---|---|---|---|---|
R2 | ISP2 | 172.16.0.2 | 65002 | IPv4 Unicast |
R3 | ISP3 | 172.16.0.6 | 65003 | IPv4 Unicast |
Answer:
cfg> set instance default tcp authentication BGP type HMAC-SHA-256-128
cfg> set instance default tcp authentication BGP key1-id 1
cfg> set instance default tcp authentication BGP key1-plain-text rtbrick
cfg> set instance default protocol bgp peer ipv4 172.16.0.2 172.16.0.1 authentication-id BGP
cfg> set instance default protocol bgp peer ipv4 172.16.0.2 172.16.0.1 peer-group ISP2
cfg> set instance default protocol bgp peer ipv4 172.16.0.6 172.16.0.5 authentication-id BGP
cfg> set instance default protocol bgp peer ipv4 172.16.0.6 172.16.0.5 peer-group ISP3
cfg> set instance default protocol bgp peer-group ISP2 remote-as 65002
cfg> set instance default protocol bgp peer-group ISP2 address-family ipv4 unicast
cfg> set instance default protocol bgp peer-group ISP3 remote-as 65003
cfg> set instance default protocol bgp peer-group ISP3 address-family ipv4 unicast
cfg> commit
If everything is configured correctly, you should see two BGP peers:
cfg> show bgp peer
Instance: default
Peer Remote AS State Up/Down Time PfxRcvd PfxSent
R2 65002 Established 0d:20h:00m:09s 24 24
192.168.0.3 65003 Established 0d:20h:00m:25s 24 24
You have successfully set up two BGP sessions. Use the show bgp peer command to get some more details on the BGP session to R3.
- What are the peer's keepalive and holddown timers set to? Which values are used on the session?
- What BGP address families does R3 support?
Answer:
cfg> show bgp peer R3
Peer: R3, Peer IP: 172.16.0.6, Remote AS: 65003, Local: 172.16.0.5, Local AS: 65001, Any AS: False
Type: ebgp, State: Established, Up/Down Time:
Discovered on interface: -
Last transition: Wed Mar 29 11:16:13 GMT +0000 2023, Flap count: 0
Peer ID : 192.168.0.3, Local ID : 192.168.0.1
Instance : default, Peer group: ISP3
6PE enabled : False
Timer values:
Peer keepalive : 30s, Local keepalive: 10s (1)
Peer holddown : 90s, Local holddown : 30s (2)
Connect retry : 30s
Timers:
Connect retry timer : 0s
keepalive timer : expires in 734739us
Holddown timer : expires in 27s 283608us
NLRIs:
Sent : ['ipv4-unicast']
Received : ['ipv4-unicast', 'ipv6-unicast']
Negotiated : ['ipv4-unicast']
<...>
(1) The local keepalive timer is configured to be 10s, while the peer keepalive timer is 30s.
(2) The local holddown timer is configured to be 30s, while R3 has its holddown timer set to 90s.
In case of a mismatch, the lower value is used on the session. You can verify this in the Timers section: the value after "expires in" will never be more than 30s.
From the NLRIs section, you can see that R3 supports IPv4 Unicast and IPv6 Unicast. As R1 is only configured for IPv4 Unicast, the peers negotiate only the address families that both of them support, which here is IPv4 Unicast.
In BGP terminology, a route is called Network Layer Reachability Information (NLRI).
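The two negotiation rules from this answer (lower timer value wins, only commonly supported address families are used) can be sketched in Python as follows. The function names are hypothetical and only mirror the description above; they are not an RBFS API.

```python
from __future__ import annotations


def effective_timer(local_value: int, peer_value: int) -> int:
    """In case of a mismatch, the lower value is used on the session."""
    return min(local_value, peer_value)


def negotiated_address_families(sent: list[str], received: list[str]) -> list[str]:
    """Only address families announced by both peers are negotiated."""
    return [family for family in sent if family in received]


if __name__ == "__main__":
    # Values taken from the 'show bgp peer R3' output above.
    print("holddown:", effective_timer(30, 90))
    print("keepalive:", effective_timer(10, 30))
    print(
        "negotiated:",
        negotiated_address_families(
            ["ipv4-unicast"], ["ipv4-unicast", "ipv6-unicast"]
        ),
    )
```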
BGP Operation
BGP Route Tables
Now that we have established some BGP peerings, we need to understand how BGP peers exchange route update messages and how they are processed.
BGP UPDATE messages received from a BGP neighbor are stored in a table called the RIB-IN. In BDS, the RIB-IN table name has the format <instance_name>.bgp.rib-in.<afi>.<safi>.<peer_ip>.<local_ip>, e.g., the IPv4 route updates received from R2 are stored in default.bgp.rib-in.ipv4.unicast.172.16.0.2.172.16.0.1. These tables are populated by bgp.iod. The command show bgp rib-in can be used to view the content of the table:
cfg> show bgp rib-in ipv4 unicast peer R2
Instance: default, AFI: ipv4, SAFI: unicast
Hostname: R2, Peer IP: 172.16.0.2, Source IP: 172.16.0.1, Received routes: 24
Prefix Next Hop MED Local Preference AS Path Status
172.16.102.0/27 172.16.0.2 0 - 65002 Valid
172.16.103.0/27 172.16.0.2 0 - 65002, 65003 Valid
172.16.104.0/27 172.16.0.2 0 - 65002, 65004 Valid
172.16.102.32/27 172.16.0.2 0 - 65002 Valid
172.16.103.32/29 172.16.0.2 0 - 65002, 65003 Valid
172.16.104.32/27 172.16.0.2 0 - 65002, 65004 Valid
<...>
The routes are then passed from bgp.iod to bgp.appd, which performs the BGP best path selection based on standard BGP rules. The results are stored in the BGP FIB. The corresponding BDS table is named <instance_name>.bgp.1.fib-local.<afi>.<safi>, e.g., the IPv4 BGP FIB is stored in default.bgp.1.fib-local.ipv4.unicast. In this table, all BGP route information from all peers is consolidated. The content of the BGP FIB can be retrieved with the show bgp fib command:
cfg> show bgp fib ipv4 unicast
Instance: default, AFI: ipv4, SAFI: unicast
Prefix Preference Out Label Next Hop
172.16.102.0/27 20 - 172.16.0.2
172.16.102.32/27 20 - 172.16.0.2
172.16.102.64/29 20 - 172.16.0.2
172.16.102.72/29 20 - 172.16.0.2
172.16.102.112/28 20 - 172.16.0.2
<...>
As BGP is not the only routing protocol running on the system, the routing information base daemon (ribd) is responsible for evaluating all the route sources (e.g., BGP, static routes, direct routes, …) and choosing the best path, which will then be installed into the RIB and used to forward traffic. The show route command is used to retrieve the information stored in the RIB.
In the other direction, bgp.appd uses the BGP FIB as the source for route updates advertised to BGP peers. These route updates are stored in the BGP RIB-OUT. Apart from the nexthop information, all BGP peers which belong to the same peer-group will get identical BGP updates. Therefore, the BGP RIB-OUT is stored in BDS on a per peer-group basis, namely in <instance_name>.bgp.1.peer-group.<pgname>.<afi>.<safi>, e.g., default.bgp.1.peer-group.ISP2.ipv4.unicast. The show bgp rib-out command is used to view the RIB-OUT table:
cfg> show bgp rib-out ipv4 unicast peer R2
Instance: default, AFI: ipv4, SAFI: unicast
Peer: R2, Sent routes: 24
Prefix MED Local Preference Origin Next Hop AS Path
172.16.102.0/27 0 - Incomplete - 65001, 65002
172.16.103.0/27 0 - Incomplete - 65001, 65003
172.16.104.0/27 0 - Incomplete - 65001, 65002, 65004
172.16.102.32/27 0 - Incomplete - 65001, 65002
172.16.103.32/29 0 - Incomplete - 65001, 65003
172.16.104.32/27 0 - Incomplete - 65001, 65002, 65004
<...>
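The three BDS table naming schemes described in this section can be summarized in a small Python sketch. The helper function names are invented for illustration; only the name formats themselves are taken from the text above.

```python
def rib_in_table(instance: str, afi: str, safi: str, peer_ip: str, local_ip: str) -> str:
    """RIB-IN: one table per peer, populated by bgp.iod."""
    return f"{instance}.bgp.rib-in.{afi}.{safi}.{peer_ip}.{local_ip}"


def fib_local_table(instance: str, afi: str, safi: str) -> str:
    """BGP FIB: best paths consolidated from all peers by bgp.appd."""
    return f"{instance}.bgp.1.fib-local.{afi}.{safi}"


def rib_out_table(instance: str, peer_group: str, afi: str, safi: str) -> str:
    """RIB-OUT: one table per peer-group."""
    return f"{instance}.bgp.1.peer-group.{peer_group}.{afi}.{safi}"


if __name__ == "__main__":
    # Reproduces the example table names given in the text.
    print(rib_in_table("default", "ipv4", "unicast", "172.16.0.2", "172.16.0.1"))
    print(fib_local_table("default", "ipv4", "unicast"))
    print(rib_out_table("default", "ISP2", "ipv4", "unicast"))
```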
You have received the route 192.168.0.104/32 from both of your peers. Check how the route received from R2 differs from the route received from R3. Which of those routes is selected as best path?
Answer:
cfg> show bgp rib-in ipv4 unicast peer R2 192.168.0.104/32
Instance: default, AFI: ipv4, SAFI: unicast
Peer: R2, Received routes: 1
192.168.0.104/32, Received path ID: 0, Next hop: 172.16.0.2
Status: Valid
Protocol source: bgp, Send path ID: 130982003
AS path: 65002, 65004 (1)
MED: 0, Local preference: -
Community: ['104:1', '102:2'] (2)
Extended community: -
Large community: -
Originator ID: -
Cluster list: -
Label: -, Last update: 0d:20h:40m:06s
cfg> show bgp rib-in ipv4 unicast peer R3 192.168.0.104/32
Instance: default, AFI: ipv4, SAFI: unicast
Peer: R3, Received routes: 1
192.168.0.104/32, Received path ID: 0, Next hop: 172.16.0.6
Status: Valid
Protocol source: bgp, Send path ID: 198092479
AS path: 65003, 65004 (1)
MED: 0, Local preference: -
Community: ['103:3', '104:1'] (2)
Extended community: -
Large community: -
Originator ID: -
Cluster list: -
Label: -, Last update: 0d:00h:24m:58s
Comparing the attributes shown in the output above, the main differences are:
(1) the AS path ("65002, 65004" vs. "65003, 65004") and
(2) the community.
From this output we have learned that this route originates in AS65004, which is connected to both AS65002 and AS65003.
The output of show bgp fib shows that the route from R2 is preferred:
cfg> show bgp fib ipv4 unicast 192.168.0.104/32
Instance: default, AFI: ipv4, SAFI: unicast
Prefix: 192.168.0.104/32
Next hop key: 784ac3ae40014c2e2ebf604130c8aec71ed35515f5aa7499
Peer: None, Peer domain: None
Route source: bgp, Send path ID: 130982003, Received path ID: None, Path hash: None
As path: 65002, 65004, Originator ID: None, Origin: Incomplete
Community: ['104:1', '102:2']
Extended community: None
Large community: None
Cluster list: None
IGP metric: 4294967295, Local preference: None, Multi exit discriminator: 0
Preference: 20, External route: None, Readvertised route: None
Route up: None
Next hop:
172.16.0.2, Label -
By default, BGP only selects a single best path and installs it into the BGP FIB.
If you want to use load sharing, you must enable the multipath option. The BGP multipath option is configured under the set instance <instance> protocol bgp address-family <afi> <safi> hierarchy, i.e., it can be set on a per address-family basis.
Set the BGP multipath to 4 for IPv4 Unicast. What has changed for 192.168.0.104/32 in the BGP FIB?
Answer:
cfg> set instance default protocol bgp address-family ipv4 unicast multipath 4
cfg> commit
cfg> show bgp fib ipv4 unicast 192.168.0.104/32
Instance: default, AFI: ipv4, SAFI: unicast
Prefix: 192.168.0.104/32
Next hop key: 82ec1322c525cdfa8d5877afc1392b7275b94810cfcbd52b
Peer: None, Peer domain: None
Route source: bgp, Send path ID: 130982003, Received path ID: None, Path hash: None
As path: 65002, 65004, Originator ID: None, Origin: Incomplete
Community: ['104:1', '102:2']
Extended community: None
Large community: None
Cluster list: None
IGP metric: 4294967295, Local preference: None, Multi exit discriminator: 0
Preference: 20, External route: None, Readvertised route: None
Route up: None
Next hop:
172.16.0.6, Label -
Next hop:
172.16.0.2, Label -
The route now has two next hops (172.16.0.6 and 172.16.0.2) because both paths are considered equally good.
Route Selection Process
We have already mentioned that BGP is designed to interconnect autonomous systems and that its primary focus is on selecting loop-free paths and on implementing policies, rather than on efficiency. In contrast to other routing protocols that rely on a single metric, BGP supports multiple path attributes that are assigned to a prefix and can be used for decision making. The path attributes are encoded in a TLV (type, length, value) format, which allows new path attributes to be added if needed. The corresponding type code also carries the information on what to do with a path attribute if the peer does not recognize it. For a complete list of BGP attributes, see IANA BGP Parameters.
The most important BGP attributes are the AS_PATH attribute and the nexthop attribute. The AS_PATH attribute identifies the autonomous systems through which routing information has passed. Whenever a route is advertised between external BGP peers, the AS_PATH attribute is updated. The AS_PATH is also used for loop prevention as a BGP peer does not accept updates where its own AS number is already in the AS_PATH attribute.
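The two AS_PATH behaviors described above, loop prevention on receipt and updating the path when advertising to an external peer, can be sketched as follows. This is a simplified illustration with hypothetical function names; it models the AS_PATH as a plain list of AS numbers.

```python
from __future__ import annotations


def accept_update(local_as: int, as_path: list[int]) -> bool:
    """Loop prevention: reject updates that already contain our own AS."""
    return local_as not in as_path


def as_path_for_ebgp_peer(local_as: int, as_path: list[int]) -> list[int]:
    """When advertising to an external peer, the local AS is prepended."""
    return [local_as] + as_path


if __name__ == "__main__":
    local_as = 65001
    received = [65002, 65004]  # AS path of 192.168.0.104/32 as received from R2

    print(accept_update(local_as, received))          # path does not contain 65001
    print(accept_update(local_as, [65002, 65001]))    # path loops back through 65001
    print(as_path_for_ebgp_peer(local_as, received))  # path as sent to an eBGP peer
```

The prepended path corresponds to what the show bgp rib-out output displays, where every AS path starts with 65001.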
As BGP has several path attributes that can serve as a metric, we need to know in which order they are evaluated:
- Route Source: Always prefer locally originated routes over received routes. Note, this step is rarely used in the decision process.
- Local Preference: The BGP path with the highest local preference is chosen. As the name implies, the local preference determines the preferred exit point from the local AS point of view.
- AS_PATH length: The BGP path with the shortest AS_PATH length is preferred, i.e., the path which passes through the smallest number of autonomous systems.
- Origin: Prefer the path with the lowest origin code (IGP < EGP < INCOMPLETE). Note, this step is rarely used in the decision process.
- Multiexit Discriminator (MED): The path with the lowest MED value is preferred. If there is no MED, then it is assumed to be 0.
- Route Type: eBGP is always preferred over iBGP.
- IGP metric: The path with the lowest IGP metric to the BGP nexthop is preferred.
- etc.
If multiple paths are equal up to this point and the multipath option is enabled, then multiple nexthops will be installed into the FIB.
From these rules it is clear that the local preference value is the best way to influence traffic leaving the autonomous system, while the AS_PATH and MED values are the best options to influence traffic entering the autonomous system. These values can be changed using routing policies, which will be discussed in Module Policies.
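The evaluation order of the selection steps listed above can be sketched as a sort key in Python. This is a simplified model and not the RBFS implementation: it only covers the steps named in the list, it compares MED across all paths, and it ignores the further tie-breakers hidden behind "etc.". The `Path` class and all function names are assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass

ORIGIN_RANK = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}


@dataclass
class Path:
    next_hop: str
    as_path: list[int]
    local_pref: int = 100          # default local preference shown in 'show bgp summary'
    origin: str = "INCOMPLETE"
    med: int = 0                   # a missing MED is assumed to be 0
    is_ebgp: bool = True
    igp_metric: int = 0
    locally_originated: bool = False


def preference_key(path: Path) -> tuple:
    """Smaller tuples are better; the field order follows the list above."""
    return (
        not path.locally_originated,  # locally originated routes first
        -path.local_pref,             # highest local preference
        len(path.as_path),            # shortest AS_PATH
        ORIGIN_RANK[path.origin],     # lowest origin code
        path.med,                     # lowest MED
        not path.is_ebgp,             # eBGP over iBGP
        path.igp_metric,              # lowest IGP metric to the nexthop
    )


def select_paths(paths: list[Path], multipath: int = 1) -> list[Path]:
    """Return the best path, or up to 'multipath' equally good paths."""
    ranked = sorted(paths, key=preference_key)
    best = preference_key(ranked[0])
    equal = [p for p in ranked if preference_key(p) == best]
    return equal[: max(1, multipath)]


if __name__ == "__main__":
    via_r2 = Path("172.16.0.2", [65002, 65004])
    via_r3 = Path("172.16.0.6", [65003, 65004])
    print([p.next_hop for p in select_paths([via_r2, via_r3])])
    print([p.next_hop for p in select_paths([via_r2, via_r3], multipath=4)])
```

In this model the two paths to 192.168.0.104/32 are equal in every modeled step, so with multipath set to 4 both next hops are returned. Which of them wins with multipath disabled depends on tie-breakers that the sketch does not model; here it is simply the first one in the input list.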
BGP Redistribution
Unlike IGP protocols, BGP is not enabled on any interfaces. If we look at the output of the show bgp rib-out command, we notice that R1 is only advertising routes to its peers that it has previously learned via BGP. As a consequence, no destination within AS65001 is reachable from the outside so far.
The process of injecting non-BGP routes into the BGP tables is called redistribution, just like in IGPs. In order to make routes reachable from outside our AS, we need to redistribute direct routes, static routes or routes from other dynamic protocols into BGP using the set instance <instance_name> protocol bgp address-family <afi> <safi> redistribute … syntax.
Configure R1 to redistribute direct and static routes into BGP. Verify that the loopback address is advertised to the neighboring peers.
Answer:
cfg> set instance default protocol bgp address-family ipv4 unicast redistribute static
cfg> set instance default protocol bgp address-family ipv4 unicast redistribute direct
cfg> commit
cfg> show bgp rib-out ipv4 unicast peer R2 192.168.0.1/32
Instance: default, AFI: ipv4, SAFI: unicast
Peer: R2, Sent routes: 1
Prefix: 192.168.0.1/32, RD: None, Send path ID: 0, Next hop: None
Peer: -, Peer domain: -, Route source: -, Received path ID: None, Path hash: 04f28fc9becac4e972dd395cf88430af0f89b3f511221b56
AS path: 65001, Originator ID: None, Origin: Incomplete
Community: None, Extended community: None, Large community: None
Cluster list: None
IGP metric: None, Local preference: None, Multi exit discriminator: 50
Preference: None, External route: None, Readvertised route: None
Label: -, Last update: 0d:00h:01m:38s
As a rule, you want to hide the details of your autonomous system from the outside world. Best practice is to generate a couple of summary routes by configuring static routes with the nexthop pointing to null, and to redistribute only these summaries into BGP.
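To get a feeling for which summary routes would cover a set of internal prefixes, Python's standard `ipaddress` module can be used. This is only a planning aid, not an RBFS feature; the internal prefixes below are hypothetical and not part of the lab setup.

```python
from __future__ import annotations

import ipaddress


def summarize(prefixes: list[str]) -> list[str]:
    """Collapse adjacent or overlapping prefixes into the fewest covering networks."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


def covered_by(prefix: str, summary: str) -> bool:
    """Check whether a specific prefix is contained in a summary route."""
    return ipaddress.ip_network(prefix).subnet_of(ipaddress.ip_network(summary))


if __name__ == "__main__":
    # Hypothetical internal prefixes of the local AS.
    internal = ["10.1.0.0/26", "10.1.0.64/26", "10.1.0.128/26", "10.1.0.192/26"]
    print(summarize(internal))
    # The R1 loopback from this module would be covered by a 192.168.0.0/24 summary.
    print(covered_by("192.168.0.1/32", "192.168.0.0/24"))
```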
BGP IPv6 Support
In its original specification, BGP only supported IPv4 route exchange. RFC2283 introduced a concept called multiprotocol extensions (the current specification is RFC4760), which enables BGP to carry routing information for other network protocols as well. For this purpose, new path attributes for multiprotocol NLRI were introduced that also support nexthops other than IPv4 nexthops.
If you already have a BGP session exchanging IPv4 route information, there is no need for a second BGP session. You can just add another address-family.
Although the nexthop information for IPv6 NLRI must be an IPv6 address, the BGP control plane session can still be based on IPv4.
Configure the peer-group ISP3 to support IPv6 unicast and redistribute direct and static routes into BGP. Verify that the IPv6 loopback address is advertised to R3. Also verify that you can send ICMP ECHO requests to fc00:c0a8::192:168:0:103.
Answer:
cfg> set instance default protocol bgp address-family ipv6 unicast
cfg> set instance default protocol bgp address-family ipv6 unicast redistribute direct
cfg> set instance default protocol bgp address-family ipv6 unicast redistribute static
cfg> set instance default protocol bgp peer-group ISP3 address-family ipv6 unicast
cfg> commit
cfg> ping fc00:c0a8::192:168:0:103 source-ip fc00:c0a8::192:168:0:1 count 3
68 bytes from fc00:c0a8::192:168:0:103: icmp_seq=1 ttl=64 time=7.6699 ms
68 bytes from fc00:c0a8::192:168:0:103: icmp_seq=2 ttl=64 time=9.8181 ms
68 bytes from fc00:c0a8::192:168:0:103: icmp_seq=3 ttl=64 time=2.3385 ms
Statistics: 3 sent, 3 received, 0% packet loss