BGP (Border Gateway Protocol) is the routing protocol that connects the internet, and it is versatile enough to be used in environments from small to large. This is a short reference covering the basics of BGP.
– Basics of BGP
– Confederations
– Route Reflectors
– Neighbour Formations
– Path Selection
– BGP Synchronisation
– BGP Summarisation
– eBGP Multihop
– iBGP & eBGP Next Hop Handling
– Peer Groups
Basics of BGP
BGP is an exterior gateway protocol (EGP). BGP uses autonomous system (AS) numbers to distinguish one organisation from another. There are both public and private AS numbers.
There are two flavours of BGP, eBGP and iBGP. eBGP is external between different AS numbers, and iBGP is BGP inside the same AS.
BGP Characteristics
– Forms neighbour relationships to a statically configured IP address
– TCP Session established between neighbours on TCP 179
– Advertises address prefix and length as NLRI (Network Layer Reachability Information). BGP can attach more extra parameters to a route than IGPs can
– Advertises a collection of path attributes for path selection (the best route)
– Path vector routing protocol. BGP gives the entire path of ASes to the destination (AS Path)
iBGP Full Mesh
iBGP requires every router in the AS to peer with every other router, a full mesh of sessions, so all routers are neighbours with each other. This is fine for small deployments, but it can quickly become a problem. The number of neighbour relationships needed is given by the formula:
n * (n - 1) / 2
where n is the number of routers. The result is the number of neighbour relationships needed for the full mesh.
With 5 routers there are 10 neighbour relationships needed, with 10 routers, there are 45 neighbour relationships.
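The formula is easy to check with a few lines of Python. This is only a minimal sketch; the function name full_mesh_sessions is my own and not part of any BGP tooling.

```python
def full_mesh_sessions(n: int) -> int:
    """Number of iBGP sessions needed for a full mesh of n routers."""
    if n < 0:
        raise ValueError("router count cannot be negative")
    # Each of the n routers peers with the other n - 1 routers,
    # and every session is shared by two routers, hence the division by 2.
    return n * (n - 1) // 2


for routers in (5, 10, 50):
    print(routers, "routers ->", full_mesh_sessions(routers), "sessions")
```

At 50 routers the full mesh already needs 1225 sessions, which is why confederations and route reflectors exist.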
Confederations
BGP confederations break up the large full mesh required by iBGP by splitting the single large AS into sub-autonomous systems (sub-AS). Each sub-AS is assigned a unique sub-AS number, usually taken from the private AS range of 64512 to 65534.
Inside each sub-AS, the routers still need to adhere to the full mesh principle. Connections between different sub-ASes behave like eBGP connections. To avoid routing loops, a sub-AS uses confederation sequences, which work like the AS path. The confederation sequences are usually made up of private AS numbers.
Route Reflectors
A BGP route reflector is designed to remove the limitation of iBGP that requires a full mesh and all BGP routers in the single AS to be neighbours. The route reflector is a single router in the AS that has neighbour relationships to all other routers in the AS.
This router will receive updates (NLRI) from each router and reflect them, that is advertise them, to the other routers in the AS. This is much more efficient.
The route reflector isn’t a perfect solution. As the number of iBGP routers increases, a route reflector also becomes difficult to scale. A solution to this is to cluster groups of routers together for hierarchical route reflection.
Neighbour Formations
BGP runs over TCP, so the TCP three-way handshake has to complete before the routers can start forming a neighbour relationship.
1. To begin with, the routers are in the Idle state. Nothing has happened yet.
2. Next, the router starts the TCP three-way handshake with its neighbour, which moves it to the Connect state.
3. If the TCP connection attempt fails, the router moves to the Active state. Active does not mean finished. It means the router is still trying to set up the TCP session with its neighbour.
4. Once the TCP session is established, the router sends an Open message and moves to the OpenSent state. The Open message contains things like the BGP version and AS number. If everything in the Open message received from the neighbour is OK, the router replies with a keepalive message.
5. OpenConfirm state is when the router is waiting for a keepalive message from its neighbour. Once the keepalive message is received, it can progress to the Established state.
6. In the Established state, the BGP neighbours can start to send update messages. BGP uses a hold timer for any missed keepalive messages. It is reset when keepalive messages are received.
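The neighbour states can be sketched as a small lookup table. This is a simplified illustration using the state names from RFC 4271, not a full implementation; the event names (start, tcp_established and so on) are labels I made up, and anything not in the table is treated as an error that drops the session back to Idle.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state. Event names are hypothetical labels.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_established"): State.OPEN_SENT,   # Open message is sent
    (State.CONNECT, "tcp_failed"): State.ACTIVE,           # keep trying
    (State.ACTIVE, "tcp_established"): State.OPEN_SENT,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
}


def next_state(state: State, event: str) -> State:
    """Return the next state; unknown or error events fall back to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events, state: State = State.IDLE) -> State:
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


happy_path = ["start", "tcp_established", "open_received", "keepalive_received"]
print(run(happy_path).value)
```

Adding an event such as hold_timer_expired to the end of the list sends the session back to Idle, which is what a missed run of keepalives does on a real router.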
BGP Messages and Timers
The Open Message is sent as the router moves into the OpenSent state and includes the BGP version, local AS number, hold timer, BGP router ID and optional parameters. On Cisco IOS the default keepalive interval is 60 seconds and the default hold timer is 180 seconds.
The Update Message contains NLRI and path attributes, and can include withdrawn routes.
The Keepalive Message is what keeps the hold timer from expiring.
The Notification Message contains an error code, an error subcode and information about the error.
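To make the Open message fields concrete, here is a minimal sketch that packs and unpacks one using the layout from RFC 4271: a 19-byte header (16-byte marker, length, type) followed by version, AS number, hold time, router ID and the optional parameters length. The helper names build_open and parse_open are my own, and the sketch assumes a 2-byte AS number and no optional parameters.

```python
import socket
import struct

BGP_MARKER = b"\xff" * 16  # the header marker is 16 bytes of all ones
HEADER_LEN = 19            # marker (16) + length (2) + type (1)
TYPE_OPEN = 1


def build_open(local_as: int, hold_time: int, router_id: str) -> bytes:
    """Pack a BGP version 4 Open message without optional parameters."""
    body = struct.pack(
        "!BHH4sB",
        4,                            # BGP version
        local_as,                     # 2-byte AS number (assumption)
        hold_time,                    # hold timer in seconds
        socket.inet_aton(router_id),  # BGP router ID
        0,                            # optional parameters length
    )
    header = BGP_MARKER + struct.pack("!HB", HEADER_LEN + len(body), TYPE_OPEN)
    return header + body


def parse_open(message: bytes) -> dict:
    """Unpack the fields written by build_open."""
    length, msg_type = struct.unpack("!HB", message[16:19])
    version, local_as, hold_time, raw_id, _opt_len = struct.unpack(
        "!BHH4sB", message[19:29]
    )
    return {
        "length": length,
        "type": msg_type,
        "version": version,
        "as": local_as,
        "hold_time": hold_time,
        "router_id": socket.inet_ntoa(raw_id),
    }


message = build_open(65001, 180, "192.200.100.1")
print(len(message), parse_open(message))
```

The values 65001, 180 and 192.200.100.1 are taken from the lab in this post; comparing the packed bytes against a Wireshark capture of a real Open message is a good way to check the layout.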
Basic BGP Configuration
I have created a small two router topology to explore the messages that are sent.
interface Ethernet0/0
 ip address 192.200.100.1 255.255.255.252
router bgp 65001
 neighbor 192.200.100.2 remote-as 65002
interface Ethernet0/0
 ip address 192.200.100.2 255.255.255.252
router bgp 65002
 neighbor 192.200.100.1 remote-as 65001
Once configured, the routers will form neighbours. The formation can be seen in the Wireshark capture below. I haven’t added any routes yet, so nothing special is happening.
Now that the routers have formed their neighbour relationship, they will continue to pass back the keepalive messages to maintain the TCP session and neighbour relationship.
Next is to add some routes to have them send updates to make this useful. I’m simply going to redistribute the connected networks on the routers to see the update messages.
IOU9(config-if)#router bgp 65001
IOU9(config-router)#redistribute connected

IOU12(config-if)#router bgp 65002
IOU12(config-router)#redistribute connected
Removing the redistribute connected command removes the routes: the router sends an update message listing them as withdrawn routes.
IOU9(config-if)#router bgp 65001
IOU9(config-router)#no redistribute connected
Path Selection
There are numerous path attributes, and not all of them are always used. I will focus on the eight most important. An easy way to remember them is with a mnemonic: We Love Oranges AS Oranges Mean Pure Refreshment.
Weight
Local Preference
Originate
AS Path Length
Origin Type
Multi-Exit Discriminator (MED)
Paths
Router ID
This list is sequential. If the routes are tied on an attribute, the comparison moves on to the next attribute.
Weight: The path with the highest weight is preferred. This is only locally significant and does not affect any other router within the AS. Cisco only, used for outbound path selection.
Local Preference: The path with the highest local preference is preferred. This is not locally significant; it is shared with, and affects, the other routers in the AS. Used for outbound path selection.
Originate: A path that originated on the local router is preferred to any path learnt from a BGP peer. Cisco sets a weight of 32768 on any prefix originated by the local router.
AS Path: The shortest AS path is preferred. Used for inbound path selection.
Origin: Where the path was originally learnt from: IGP, EGP or INCOMPLETE. IGP comes from a network statement and INCOMPLETE from a redistribution command. IGP is preferred to EGP, which is preferred to INCOMPLETE.
Multi-Exit Discriminator (MED): Lowest MED is preferred. Only compared if the first hop AS is identical. If the first hop AS is different, then this step is skipped.
Paths: By default, only a single path is installed into the routing table. This can be changed using the maximum-paths command on Cisco and Arista.
Router ID: Compare the router ID of the peer that the path is learnt from. Lower is preferred.
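The tie-breaking order can be sketched as a pairwise comparison in Python. This is a simplified model of the list above and not how any vendor implements it: the Path class and its field names are my own, the Paths step is skipped because it concerns multipath rather than picking a winner, and the peer router IDs in the example are hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
from ipaddress import IPv4Address

ORIGIN_RANK = {"i": 0, "e": 1, "?": 2}  # IGP < EGP < INCOMPLETE


@dataclass(frozen=True)
class Path:
    next_hop: str
    as_path: tuple = ()
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    origin: str = "i"
    med: int = 0
    peer_router_id: str = "0.0.0.0"  # hypothetical default


def prefer(a: Path, b: Path) -> Path:
    """Return the preferred of two paths, walking the attributes in order."""
    if a.weight != b.weight:
        return a if a.weight > b.weight else b                    # highest weight
    if a.local_pref != b.local_pref:
        return a if a.local_pref > b.local_pref else b            # highest local pref
    if a.locally_originated != b.locally_originated:
        return a if a.locally_originated else b                   # locally originated
    if len(a.as_path) != len(b.as_path):
        return a if len(a.as_path) < len(b.as_path) else b        # shortest AS path
    if ORIGIN_RANK[a.origin] != ORIGIN_RANK[b.origin]:
        return a if ORIGIN_RANK[a.origin] < ORIGIN_RANK[b.origin] else b
    # MED is only compared when both paths come from the same first hop AS.
    if a.as_path[:1] == b.as_path[:1] and a.med != b.med:
        return a if a.med < b.med else b                          # lowest MED
    # Final tie-breaker: lowest router ID of the advertising peer.
    if IPv4Address(a.peer_router_id) <= IPv4Address(b.peer_router_id):
        return a
    return b


def best_path(paths):
    """Pick a single best path from a non-empty list of candidates."""
    return reduce(prefer, paths)


# Two candidates for 10.1.12.0/31 taken from the leaf1 table in this post.
candidates = [
    Path(next_hop="10.3.12.1", as_path=(65000, 65112), peer_router_id="10.10.10.3"),
    Path(next_hop="10.1.11.0", as_path=(65000,), peer_router_id="10.10.10.1"),
]
print(best_path(candidates).next_hop)
```

With weight and local preference tied, the path via 10.1.11.0 wins on AS path length, which matches the entry marked active for that prefix in the leaf1 output.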
To get to grips with all these attributes, I am going to share the output from show ip bgp on my VXLAN topology. This has three spine and six leaf switches. This output is taken from leaf1, which has ECMP and multipath configured. There are a lot of networks in the BGP routing table, so I have split this up to get the parts I need, and then the entire table is below.
leaf1#sh ip bgp
BGP routing table information for VRF default
Router identifier 10.10.10.11, local AS number 65111
Route status codes: s - suppressed, * - valid, > - active, E - ECMP head, e - ECMP
                    S - Stale, c - Contributing to ECMP, b - backup, L - labeled-unicast
                    % - Pending BGP convergence
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI Origin Validation codes: V - valid, I - invalid, U - unknown
AS Path Attributes: Or-ID - Originator ID, C-LST - Cluster List, LL Nexthop - Link Local Nexthop

          Network            Next Hop       Metric AIGP  LocPref Weight  Path
 * >      10.1.11.0/31       -              -      -     -       0       i
 *        10.1.11.0/31       10.1.11.0      0      -     100     0       65000 i
 *        10.1.11.0/31       10.10.10.1     0      -     100     0       65000 i
 * >      10.1.12.0/31       10.1.11.0      0      -     100     0       65000 i
 *        10.1.12.0/31       10.10.10.1     0      -     100     0       65000 i
 *        10.1.12.0/31       10.3.12.1      0      -     100     0       65000 65112 i
 *        10.1.12.0/31       10.3.11.0      0      -     100     0       65000 65112 i
 *        10.1.12.0/31       10.2.11.0      0      -     100     0       65000 65112 i
 *        10.1.12.0/31       10.2.12.1      0      -     100     0       65000 65112 i
 * >      10.1.13.0/31       10.1.11.0      0      -     100     0       65000 i
 *        10.1.13.0/31       10.10.10.1     0      -     100     0       65000 i
 *        10.1.13.0/31       10.3.13.1      0      -     100     0       65000 65113 i
 *        10.1.13.0/31       10.3.11.0      0      -     100     0       65000 65113 i
 *        10.1.13.0/31       10.2.11.0      0      -     100     0       65000 65113 i
 *        10.1.13.0/31       10.2.13.1      0      -     100     0       65000 65113 i
 * >      10.1.14.0/31       10.1.11.0      0      -     100     0       65000 i
 *        10.1.14.0/31       10.10.10.1     0      -     100     0       65000 i
 *        10.1.14.0/31       10.3.14.1      0      -     100     0       65000 65114 i
 *        10.1.14.0/31       10.3.11.0      0      -     100     0       65000 65114 i
 *        10.1.14.0/31       10.2.14.1      0      -     100     0       65000 65114 i
 *        10.1.14.0/31       10.2.11.0      0      -     100     0       65000 65114 i
 * >      10.1.15.0/31       10.1.11.0      0      -     100     0       65000 i
 *        10.1.15.0/31       10.10.10.1     0      -     100     0       65000 i
 *        10.1.15.0/31       10.3.15.1      0      -     100     0       65000 65115 i
 *        10.1.15.0/31       10.3.11.0      0      -     100     0       65000 65115 i
 *        10.1.15.0/31       10.2.11.0      0      -     100     0       65000 65115 i
 *        10.1.15.0/31       10.2.15.1      0      -     100     0       65000 65115 i
 * >      10.2.11.0/31       -              -      -     -       0       i
 *        10.2.11.0/31       10.2.11.0      0      -     100     0       65000 i
 *        10.2.11.0/31       10.10.10.2     0      -     100     0       65000 i
 * >      10.2.12.0/31       10.2.11.0      0      -     100     0       65000 i
 *        10.2.12.0/31       10.10.10.2     0      -     100     0       65000 i
 *        10.2.12.0/31       10.3.12.1      0      -     100     0       65000 65112 i
 *        10.2.12.0/31       10.3.11.0      0      -     100     0       65000 65112 i
 *        10.2.12.0/31       10.1.12.1      0      -     100     0       65000 65112 i
 *        10.2.12.0/31       10.1.11.0      0      -     100     0       65000 65112 i
 * >      10.2.13.0/31       10.2.11.0      0      -     100     0       65000 i
 *        10.2.13.0/31       10.10.10.2     0      -     100     0       65000 i
 *        10.2.13.0/31       10.3.13.1      0      -     100     0       65000 65113 i
 *        10.2.13.0/31       10.3.11.0      0      -     100     0       65000 65113 i
 *        10.2.13.0/31       10.1.13.1      0      -     100     0       65000 65113 i
 *        10.2.13.0/31       10.1.11.0      0      -     100     0       65000 65113 i
 * >      10.2.14.0/31       10.2.11.0      0      -     100     0       65000 i
 *        10.2.14.0/31       10.10.10.2     0      -     100     0       65000 i
 *        10.2.14.0/31       10.3.14.1      0      -     100     0       65000 65114 i
 *        10.2.14.0/31       10.3.11.0      0      -     100     0       65000 65114 i
 *        10.2.14.0/31       10.1.14.1      0      -     100     0       65000 65114 i
 *        10.2.14.0/31       10.1.11.0      0      -     100     0       65000 65114 i
 * >      10.2.15.0/31       10.2.11.0      0      -     100     0       65000 i
 *        10.2.15.0/31       10.10.10.2     0      -     100     0       65000 i
 *        10.2.15.0/31       10.3.15.1      0      -     100     0       65000 65115 i
 *        10.2.15.0/31       10.3.11.0      0      -     100     0       65000 65115 i
 *        10.2.15.0/31       10.1.15.1      0      -     100     0       65000 65115 i
 *        10.2.15.0/31       10.1.11.0      0      -     100     0       65000 65115 i
 * >      10.3.11.0/31       -              -      -     -       0       i
 *        10.3.11.0/31       10.3.11.0      0      -     100     0       65000 i
 *        10.3.11.0/31       10.10.10.3     0      -     100     0       65000 i
 * >      10.3.12.0/31       10.3.11.0      0      -     100     0       65000 i
 *        10.3.12.0/31       10.10.10.3     0      -     100     0       65000 i
 *        10.3.12.0/31       10.1.12.1      0      -     100     0       65000 65112 i
 *        10.3.12.0/31       10.1.11.0      0      -     100     0       65000 65112 i
 *        10.3.12.0/31       10.2.11.0      0      -     100     0       65000 65112 i
 *        10.3.12.0/31       10.2.12.1      0      -     100     0       65000 65112 i
 * >      10.3.13.0/31       10.3.11.0      0      -     100     0       65000 i
 *        10.3.13.0/31       10.10.10.3     0      -     100     0       65000 i
 *        10.3.13.0/31       10.1.11.0      0      -     100     0       65000 65113 i
 *        10.3.13.0/31       10.1.13.1      0      -     100     0       65000 65113 i
 *        10.3.13.0/31       10.2.13.1      0      -     100     0       65000 65113 i
 *        10.3.13.0/31       10.2.11.0      0      -     100     0       65000 65113 i
 * >      10.3.14.0/31       10.3.11.0      0      -     100     0       65000 i
 *        10.3.14.0/31       10.10.10.3     0      -     100     0       65000 i
 *        10.3.14.0/31       10.2.11.0      0      -     100     0       65000 65114 i
 *        10.3.14.0/31       10.1.11.0      0      -     100     0       65000 65114 i
 *        10.3.14.0/31       10.2.14.1      0      -     100     0       65000 65114 i
 *        10.3.14.0/31       10.1.14.1      0      -     100     0       65000 65114 i
 * >      10.3.15.0/31       10.3.11.0      0      -     100     0       65000 i
 *        10.3.15.0/31       10.10.10.3     0      -     100     0       65000 i
 *        10.3.15.0/31       10.2.11.0      0      -     100     0       65000 65115 i
 *        10.3.15.0/31       10.1.15.1      0      -     100     0       65000 65115 i
 *        10.3.15.0/31       10.1.11.0      0      -     100     0       65000 65115 i
 *        10.3.15.0/31       10.2.15.1      0      -     100     0       65000 65115 i
 * >      10.10.10.1/32      10.1.11.0      0      -     100     0       65000 i
 *        10.10.10.1/32      10.10.10.1     0      -     100     0       65000 i
 * >      10.10.10.2/32      10.2.11.0      0      -     100     0       65000 i
 *        10.10.10.2/32      10.10.10.2     0      -     100     0       65000 i
 * >      10.10.10.3/32      10.3.11.0      0      -     100     0       65000 i
 *        10.10.10.3/32      10.10.10.3     0      -     100     0       65000 i
 * >      10.10.10.11/32     -              -      -     -       0       i
 * >Ec    10.10.10.12/32     10.1.11.0      0      -     100     0       65000 65112 i
 *  ec    10.10.10.12/32     10.2.11.0      0      -     100     0       65000 65112 i
 *  ec    10.10.10.12/32     10.3.11.0      0      -     100     0       65000 65112 i
 *  E     10.10.10.12/32     10.2.12.1      0      -     100     0       65000 65112 i
 *  e     10.10.10.12/32     10.1.12.1      0      -     100     0       65000 65112 i
 *  e     10.10.10.12/32     10.3.12.1      0      -     100     0       65000 65112 i
 * >Ec    10.10.10.13/32     10.2.11.0      0      -     100     0       65000 65113 i
 *  ec    10.10.10.13/32     10.1.11.0      0      -     100     0       65000 65113 i
 *  ec    10.10.10.13/32     10.3.11.0      0      -     100     0       65000 65113 i
 *  E     10.10.10.13/32     10.2.13.1      0      -     100     0       65000 65113 i
 *  e     10.10.10.13/32     10.1.13.1      0      -     100     0       65000 65113 i
 *  e     10.10.10.13/32     10.3.13.1      0      -     100     0       65000 65113 i
 * >Ec    10.10.10.14/32     10.2.11.0      0      -     100     0       65000 65114 i
 *  ec    10.10.10.14/32     10.1.11.0      0      -     100     0       65000 65114 i
 *  ec    10.10.10.14/32     10.3.11.0      0      -     100     0       65000 65114 i
 *  E     10.10.10.14/32     10.2.14.1      0      -     100     0       65000 65114 i
 *  e     10.10.10.14/32     10.1.14.1      0      -     100     0       65000 65114 i
 *  e     10.10.10.14/32     10.3.14.1      0      -     100     0       65000 65114 i
 * >Ec    10.10.10.15/32     10.1.11.0      0      -     100     0       65000 65115 i
 *  ec    10.10.10.15/32     10.2.11.0      0      -     100     0       65000 65115 i
 *  ec    10.10.10.15/32     10.3.11.0      0      -     100     0       65000 65115 i
 *  E     10.10.10.15/32     10.1.15.1      0      -     100     0       65000 65115 i
 *  e     10.10.10.15/32     10.2.15.1      0      -     100     0       65000 65115 i
 *  e     10.10.10.15/32     10.3.15.1      0      -     100     0       65000 65115 i
 * >      172.16.1.0/24      -              -      -     -       0       i
 *  Ec    172.16.1.0/24      10.2.11.0      0      -     100     0       65000 i
 *  e     172.16.1.0/24      10.1.11.0      0      -     100     0       65000 i
 *  e     172.16.1.0/24      10.3.11.0      0      -     100     0       65000 i
 *  E     172.16.1.0/24      10.10.10.2     0      -     100     0       65000 i
 *  e     172.16.1.0/24      10.10.10.1     0      -     100     0       65000 i
 *  e     172.16.1.0/24      10.10.10.3     0      -     100     0       65000 i
 * >E     172.20.1.1/32      -              -      -     -       0       i
 *  e     172.20.1.1/32      -              -      -     -       0       i
 * >Ec    172.20.2.2/32      10.1.11.0      0      -     100     0       65000 65112 i
 *  ec    172.20.2.2/32      10.2.11.0      0      -     100     0       65000 65112 i
 *  ec    172.20.2.2/32      10.3.11.0      0      -     100     0       65000 65112 i
 *  E     172.20.2.2/32      10.2.12.1      0      -     100     0       65000 65112 i
 *  e     172.20.2.2/32      10.1.12.1      0      -     100     0       65000 65112 i
 *  e     172.20.2.2/32      10.3.12.1      0      -     100     0       65000 65112 i
 * >Ec    172.20.3.3/32      10.2.11.0      0      -     100     0       65000 65113 i
 *  ec    172.20.3.3/32      10.1.11.0      0      -     100     0       65000 65113 i
 *  ec    172.20.3.3/32      10.3.11.0      0      -     100     0       65000 65113 i
 *  E     172.20.3.3/32      10.2.13.1      0      -     100     0       65000 65113 i
 *  e     172.20.3.3/32      10.1.13.1      0      -     100     0       65000 65113 i
 *  e     172.20.3.3/32      10.3.13.1      0      -     100     0       65000 65113 i
 * >Ec    172.20.4.4/32      10.2.11.0      0      -     100     0       65000 65114 i
 *  ec    172.20.4.4/32      10.1.11.0      0      -     100     0       65000 65114 i
 *  ec    172.20.4.4/32      10.3.11.0      0      -     100     0       65000 65114 i
 *  E     172.20.4.4/32      10.2.14.1      0      -     100     0       65000 65114 i
 *  e     172.20.4.4/32      10.1.14.1      0      -     100     0       65000 65114 i
 *  e     172.20.4.4/32      10.3.14.1      0      -     100     0       65000 65114 i
 * >Ec    172.20.5.5/32      10.1.11.0      0      -     100     0       65000 65115 i
 *  ec    172.20.5.5/32      10.2.11.0      0      -     100     0       65000 65115 i
 *  ec    172.20.5.5/32      10.3.11.0      0      -     100     0       65000 65115 i
 *  E     172.20.5.5/32      10.1.15.1      0      -     100     0       65000 65115 i
 *  e     172.20.5.5/32      10.2.15.1      0      -     100     0       65000 65115 i
 *  e     172.20.5.5/32      10.3.15.1      0      -     100     0       65000 65115 i
BGP Synchronisation
BGP synchronisation is disabled by default, and it is unlikely that it will ever need to be enabled. With synchronisation enabled, a router will not advertise a route learnt over iBGP to an eBGP neighbour unless that network is also learnt through an IGP such as OSPF.
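The rule itself can be sketched in a few lines of Python. This is only an illustration of the check, assuming routes are plain prefix strings; the function name advertisable and its parameters are my own.

```python
import ipaddress


def advertisable(ibgp_routes, igp_routes, synchronization=True):
    """Return the iBGP-learnt prefixes that may be advertised to eBGP peers.

    With synchronisation on, a prefix must also be present in the IGP table.
    """
    if not synchronization:
        return list(ibgp_routes)
    igp = {ipaddress.ip_network(prefix) for prefix in igp_routes}
    return [p for p in ibgp_routes if ipaddress.ip_network(p) in igp]


# 3.3.3.3/32 is learnt over iBGP but not yet known through an IGP.
print(advertisable(["3.3.3.3/32"], igp_routes=[]))
# Once the IGP also carries the prefix, it can be advertised.
print(advertisable(["3.3.3.3/32"], igp_routes=["3.3.3.3/32"]))
```

The first call returns an empty list and the second returns the prefix, which mirrors the two ways out of the problem: disable synchronisation, or get the network into the IGP.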
I will demonstrate this with the topology below. IOU7 has a loopback interface of 3.3.3.3. This is advertised into BGP via the redistribute connected command. IOU9 learns this and also advertises it to IOU12 via eBGP.
So now all the routers know about 3.3.3.3/32.
IOU12# debug ip bgp updates
*Jan 29 18:59:51.464: %BGP-5-ADJCHANGE: neighbor 192.200.100.1 Up
*Jan 29 18:59:53.493: BGP(0): 192.200.100.1 rcvd UPDATE w/ attr: nexthop 192.200.100.1, origin ?, merged path 65001, AS_PATH
*Jan 29 18:59:53.494: BGP(0): 192.200.100.1 rcvd 3.3.3.3/32
*Jan 29 18:59:53.495: BGP(0): 192.200.100.1 rcvd 172.16.1.0/30
*Jan 29 18:59:53.496: BGP(0): Revise route installing 1 of 1 routes for 3.3.3.3/32 -> 192.200.100.1(global) to main IP table
*Jan 29 18:59:53.496: BGP(0): Revise route installing 1 of 1 routes for 172.16.1.0/30 -> 192.200.100.1(global) to main IP table
IOU12#sh ip route
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override, p - overrides from PfR

Gateway of last resort is not set

      3.0.0.0/32 is subnetted, 1 subnets
B        3.3.3.3 [20/0] via 192.200.100.1, 00:00:43
      172.16.0.0/30 is subnetted, 1 subnets
B        172.16.1.0 [20/0] via 192.200.100.1, 00:00:43
      192.200.100.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.200.100.0/30 is directly connected, GigabitEthernet0/1
L        192.200.100.2/32 is directly connected, GigabitEthernet0/1
When BGP synchronisation is enabled on IOU9 and the BGP session is cleared using clear ip bgp * and then re-established, IOU12 will no longer know about 3.3.3.3.
IOU9(config-router)#do sh run | sec router bgp
router bgp 65001
 bgp log-neighbor-changes
 neighbor 172.16.1.2 remote-as 65001
 neighbor 192.200.100.2 remote-as 65002
 !
 address-family ipv4
  synchronization
  neighbor 172.16.1.2 activate
  neighbor 192.200.100.2 activate
 exit-address-family
IOU9(config-router)#synchronization
IOU9(config-router)#do clear ip bgp *
IOU9(config-router)#
*Jan 29 19:02:01.447: %BGP-3-NOTIFICATION_MANY: sent to 2 sessions 6/4 (Administrative Reset) for all peers
IOU9(config-router)#
*Jan 29 19:02:01.463: %BGP-5-ADJCHANGE: neighbor 172.16.1.2 Down User reset
*Jan 29 19:02:01.463: %BGP_SESSION-5-ADJCHANGE: neighbor 172.16.1.2 IPv4 Unicast topology base removed from session User reset
*Jan 29 19:02:01.466: %BGP-5-ADJCHANGE: neighbor 192.200.100.2 Down User reset
*Jan 29 19:02:01.466: %BGP_SESSION-5-ADJCHANGE: neighbor 192.200.100.2 IPv4 Unicast topology base removed from session User reset
IOU9(config-router)#
*Jan 29 19:02:07.187: %BGP-5-ADJCHANGE: neighbor 192.200.100.2 Up
*Jan 29 19:02:07.300: %BGP-5-ADJCHANGE: neighbor 172.16.1.2 Up
*Jan 29 19:02:06.476: %BGP-5-ADJCHANGE: neighbor 192.200.100.1 Up
IOU12#
*Jan 29 19:02:07.597: BGP(0): 192.200.100.1 rcvd UPDATE w/ attr: nexthop 192.200.100.1, origin ?, merged path 65001, AS_PATH
*Jan 29 19:02:07.598: BGP(0): 192.200.100.1 rcvd 172.16.1.0/30
*Jan 29 19:02:07.599: BGP(0): Revise route installing 1 of 1 routes for 172.16.1.0/30 -> 192.200.100.1(global) to main IP table
IOU12#sh ip route
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override, p - overrides from PfR

Gateway of last resort is not set

      172.16.0.0/30 is subnetted, 1 subnets
B        172.16.1.0 [20/0] via 192.200.100.1, 00:01:07
      192.200.100.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.200.100.0/30 is directly connected, GigabitEthernet0/1
L        192.200.100.2/32 is directly connected, GigabitEthernet0/1
To get this to work again, there are two options.
1. Disable synchronisation
2. Configure an IGP to permit synchronisation to perform the function it is designed for.
I’ll configure OSPF to fulfil the requirement to get BGP synchronisation working.
IOU7(config-router)#do sh run | sec ospf
router ospf 1
 network 0.0.0.0 255.255.255.255 area 0

IOU9(config-router)#do sh run | sec ospf
router ospf 1
 network 0.0.0.0 255.255.255.255 area 0
Now that IOU9 has learnt about the 3.3.3.3 network over OSPF, it will pass it along via eBGP to IOU12.
IOU9(config-router)#do sh ip route
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override, p - overrides from PfR

Gateway of last resort is not set

      3.0.0.0/32 is subnetted, 1 subnets
O        3.3.3.3 [110/2] via 172.16.1.2, 00:02:31, GigabitEthernet0/0
      172.16.0.0/16 is variably subnetted, 2 subnets, 2 masks
C        172.16.1.0/30 is directly connected, GigabitEthernet0/0
L        172.16.1.1/32 is directly connected, GigabitEthernet0/0
      192.200.100.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.200.100.0/30 is directly connected, GigabitEthernet0/1
L        192.200.100.1/32 is directly connected, GigabitEthernet0/1
*Jan 29 19:04:53.399: %BGP-5-ADJCHANGE: neighbor 192.200.100.1 Up
IOU12#
IOU12#
*Jan 29 19:04:53.576: BGP(0): 192.200.100.1 rcvd UPDATE w/ attr: nexthop 192.200.100.1, origin ?, merged path 65001, AS_PATH
*Jan 29 19:04:53.577: BGP(0): 192.200.100.1 rcvd 3.3.3.3/32
*Jan 29 19:04:53.577: BGP(0): 192.200.100.1 rcvd 172.16.1.0/30
*Jan 29 19:04:53.578: BGP(0): Revise route installing 1 of 1 routes for 3.3.3.3/32 -> 192.200.100.1(global) to main IP table
*Jan 29 19:04:53.579: BGP(0): Revise route installing 1 of 1 routes for 172.16.1.0/30 -> 192.200.100.1(global) to main IP table
IOU12#sh ip route
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override, p - overrides from PfR

Gateway of last resort is not set

      3.0.0.0/32 is subnetted, 1 subnets
B        3.3.3.3 [20/0] via 192.200.100.1, 00:00:29
      172.16.0.0/30 is subnetted, 1 subnets
B        172.16.1.0 [20/0] via 192.200.100.1, 00:00:29
      192.200.100.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.200.100.0/30 is directly connected, GigabitEthernet0/1
L        192.200.100.2/32 is directly connected, GigabitEthernet0/1
BGP Summarisation
BGP summarisation is route summarisation. As with IGP route summarisation, it is the ability to shorten the subnet mask of a route so that it covers a larger set of networks. BGP calls this route summarisation an aggregate.
In the example lab, I have four /24 networks and will advertise them using a single /22 network.
IOU7 has four /24 networks and when redistributed into BGP they are sent over as individual networks to IOU9 that appear in the IOU9 routing table as four /24 networks. These four networks can be summarised into one single /22 network to be advertised.
IOU9(config-router)#do sh ip route bgp
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override, p - overrides from PfR

Gateway of last resort is not set

      1.0.0.0/24 is subnetted, 4 subnets
B        1.1.0.0 [20/0] via 172.16.1.2, 00:00:22
B        1.1.1.0 [20/0] via 172.16.1.2, 00:00:22
B        1.1.2.0 [20/0] via 172.16.1.2, 00:00:22
B        1.1.3.0 [20/0] via 172.16.1.2, 00:00:22

IOU7(config-router)#aggregate-address 1.1.0.0 255.255.252.0 summary-only

IOU9(config-router)#do sh ip route bgp
Codes: L - local, C - connected, S - static, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       N1 - OSPF NSSA external type 1, N2 - OSPF NSSA external type 2
       E1 - OSPF external type 1, E2 - OSPF external type 2
       i - IS-IS, su - IS-IS summary, L1 - IS-IS level-1, L2 - IS-IS level-2
       ia - IS-IS inter area, * - candidate default, U - per-user static route
       o - ODR, P - periodic downloaded static route, H - NHRP, l - LISP
       a - application route
       + - replicated route, % - next hop override, p - overrides from PfR

Gateway of last resort is not set

      1.0.0.0/22 is subnetted, 1 subnets
B        1.1.0.0 [20/0] via 172.16.1.2, 00:01:38
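The arithmetic behind the aggregate can be checked with the ipaddress module from the Python standard library. This is just a quick sanity check of the prefix maths, not something the routers run.

```python
import ipaddress

# The four /24 networks advertised by IOU7 in the lab.
subnets = [ipaddress.ip_network(f"1.1.{i}.0/24") for i in range(4)]

# collapse_addresses merges adjacent networks into the smallest covering set.
aggregate = list(ipaddress.collapse_addresses(subnets))

print(aggregate)
print(aggregate[0].with_netmask)
```

The four networks collapse into the single network 1.1.0.0/22, and with_netmask gives the 255.255.252.0 mask used in the aggregate-address command.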
eBGP Multihop
BGP does not have to form neighbour relationships only with adjacent routers. It can also form neighbour relationships with routers that are several hops away. Remember, BGP forms neighbours to manually configured IP addresses using TCP 179, so it relies on the existing routing to reach the neighbour. For eBGP this is known as eBGP multihop. By default Cisco sends eBGP packets with a TTL of 1, and the ebgp-multihop command raises that limit.
I’m going to use the exact same topology and allow IOU7 to form a neighbour relationship with IOU12.
IOU7(config)#router bgp 65000
IOU7(config-router)#neighbor 192.200.100.2 remote-as 65002
IOU7(config-router)#neighbor 192.200.100.2 ebgp-multihop 2
IOU7(config-router)#redistribute connected

IOU12(config)#router bgp 65002
IOU12(config-router)#neighbor 172.16.1.2 remote-as 65000
IOU12(config-router)#neighbor 172.16.1.2 ebgp-multihop 2
IOU12(config-router)#redistribute connected
Router IOU7 now has two neighbours, IOU9 and IOU12.
IOU7(config-router)#do sh ip bgp summ
BGP router identifier 3.3.3.3, local AS number 65000
BGP table version is 14, main routing table version 14
7 network entries using 1008 bytes of memory
8 path entries using 640 bytes of memory
4/3 BGP path/bestpath attribute entries using 608 bytes of memory
2 BGP AS-PATH entries using 48 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 2304 total bytes of memory
BGP activity 7/0 prefixes, 8/0 paths, scan interval 60 secs

Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
172.16.1.1      4        65001      18      18       12    0    0 00:10:37        1
192.200.100.2   4        65002       8       8        1    0    0 00:00:23        1
IOU12 is learning about the 1.1.0.0/22 network via two different paths. IOU12 has picked the best route, which is the one learnt directly from IOU7, to install into its routing table. This is due to there being fewer ASes in the path. As we can see from the diagram, the traffic physically still goes the exact same way.
IOU12(config-router)#do sh ip bgp
BGP table version is 54, local router ID is 192.200.100.2
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *   1.1.0.0/22       172.16.1.2               0             0 65000 i
 *>                   192.200.100.1                          0 65001 65000 i
 *>  172.16.1.0/30    172.16.1.2               0             0 65000 ?
 *                    192.200.100.1                          0 65001 65000 ?
 *>  192.200.100.0/30 0.0.0.0                  0         32768 ?
As a side note, due to the nature of this topology, IOU12 is receiving updates about its own network 192.200.100.0/30 from IOU7. This is because IOU7 learns about the 192.200.100.0/30 network from IOU9, which sits between the two neighbours. IOU12 sees this update and denies it because its own AS number is in the path. This is the loop prevention method of BGP. The update can be seen using the debug command debug ip bgp updates.
*Jan 29 19:51:00.532: %BGP-5-ADJCHANGE: neighbor 172.16.1.2 Up
*Jan 29 19:51:00.535: BGP(0): (base) 172.16.1.2 send UPDATE (format) 172.16.1.0/30, next 192.200.100.2, metric 0, path 65001 65000
*Jan 29 19:51:00.536: BGP(0): (base) 172.16.1.2 send UPDATE (format) 1.1.0.0/22, next 192.200.100.2, metric 0, path 65001 65000
*Jan 29 19:51:00.536: BGP(0): (base) 172.16.1.2 send UPDATE (format) 192.200.100.0/30, next 192.200.100.2, metric 0, path Local
*Jan 29 19:51:00.540: BGP: nbr_topo global 172.16.1.2 IPv4 Unicast:base (0xD901070:1) rcvd Refresh Start-of-RIB
*Jan 29 19:51:00.541: BGP: nbr_topo global 172.16.1.2 IPv4 Unicast:base (0xD901070:1) refresh_epoch is 2
*Jan 29 19:51:00.543: BGP(0): 172.16.1.2 rcvd UPDATE w/ attr: nexthop 172.16.1.2, origin i, metric 0, atomic-aggregate, aggregated by 65000 3.3.3.3, merged path 65000, AS_PATH
*Jan 29 19:51:00.544: BGP(0): 172.16.1.2 rcvd 1.1.0.0/22
*Jan 29 19:51:00.545: BGP(0): 172.16.1.2 rcv UPDATE w/ attr: nexthop 172.16.1.2, origin ?, originator 0.0.0.0, merged path 65000 65001 65002, AS_PATH , community , extended community , SSA attribute
*Jan 29 19:51:00.546: BGPSSA ssacount is 0
*Jan 29 19:51:00.546: BGP(0): 172.16.1.2 rcv UPDATE about 192.200.100.0/30 -- DENIED due to: AS-PATH contains our ow
IOU12(config-router)#n AS;
*Jan 29 19:51:00.547: BGP(0): 172.16.1.2 rcvd UPDATE w/ attr: nexthop 172.16.1.2, origin ?, metric 0, merged path 65000, AS_PATH
*Jan 29 19:51:00.548: BGP(0): 172.16.1.2 rcvd 172.16.1.0/30
*Jan 29 19:51:00.549: BGP: nbr_topo global 172.16.1.2 IPv4 Unicast:base (0xD901070:1) rcvd Refresh End-of-RIB
IOU12(config-router)#
*Jan 29 19:51:04.749: BGP(0): Revise route installing 1 of 1 routes for 1.1.0.0/22 -> 172.16.1.2(global) to main IP table
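The loop prevention check behind that DENIED line is simple enough to sketch in Python. The function name accept_update is my own; the AS numbers come from the debug output above.

```python
def accept_update(local_as: int, as_path: list) -> bool:
    """eBGP loop prevention: reject any update whose AS path contains our own AS."""
    return local_as not in as_path


# IOU12 (AS 65002) receives 192.200.100.0/30 with path 65000 65001 65002.
print(accept_update(65002, [65000, 65001, 65002]))
# The aggregate 1.1.0.0/22 arrives with path 65000 and is accepted.
print(accept_update(65002, [65000]))
```

The first call returns False and the second returns True, matching what the debug shows for the two prefixes.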
iBGP & eBGP Next Hop Handling
iBGP and eBGP handle next hops differently. When a router advertises a route that it learnt over eBGP to an iBGP neighbour, it does not set itself as the next hop; it passes the original next hop along unchanged. This means that the downstream iBGP router that receives the route could be sending traffic to a next hop that it does not know how to reach.
I have created an example to show this using the three routers. IOU12 is advertising 192.168.20.0/24 over eBGP to IOU9. IOU9 is then sending that network as an update to IOU7. IOU7 receives the same next hop that IOU9 has, which is 192.200.100.2. However, IOU7 has no route to this address in its routing table, so traffic does not go anywhere.
Look at 192.168.20.0 on IOU9. The next hop is 192.200.100.2, and IOU7 is given that same next hop.
IOU9(config-router)#do sh ip bgp
BGP table version is 27, local router ID is 192.200.100.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>i 1.1.0.0/22       172.16.1.2               0    100      0 i
 r>i 172.16.1.0/30    172.16.1.2               0    100      0 ?
 *>  192.168.20.0     192.200.100.2            0             0 65002 ?
 r>  192.200.100.0/30 192.200.100.2            0             0 65002 ?
IOU9(config-router)#do ping 192.168.20.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.20.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 3/3/3 ms
IOU9(config-router)#
IOU7 receives the same BGP update with the next hop unchanged, which is what causes the problem. Neither route is marked as best (>) on IOU7, and the ping fails.
IOU7(config-router)#do sh ip bgp
BGP table version is 1, local router ID is 1.1.3.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 * i 192.168.20.0     192.200.100.2            0    100      0 65002 ?
 * i 192.200.100.0/30 192.200.100.2            0    100      0 65002 ?
IOU7(config-router)#do ping 192.200.100.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.200.100.1, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
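The failure above comes down to a reachability check: a BGP route can only be used if the router has a route covering its next hop. The sketch below illustrates that check with Python's standard ipaddress module. The function name is hypothetical, and the routing table for IOU7 is an assumption (only the directly connected iBGP link), since the post does not show IOU7's full routing table.

```python
import ipaddress


def next_hop_reachable(next_hop: str, routing_table: list[str]) -> bool:
    """Return True if any prefix in the routing table covers the next-hop address."""
    address = ipaddress.ip_address(next_hop)
    return any(address in ipaddress.ip_network(prefix) for prefix in routing_table)


# Assumed routing table for IOU7: only the directly connected iBGP link.
iou7_routes = ["172.16.1.0/30"]

print(next_hop_reachable("192.200.100.2", iou7_routes))  # False, original next hop
print(next_hop_reachable("172.16.1.1", iou7_routes))     # True, IOU9's own address
```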
The fix is applied on IOU9, the router that receives the route over eBGP. With next-hop-self configured towards its iBGP neighbour, IOU9 rewrites the advertisement to use its own IP address as the next hop instead of the original.
router bgp 65001
 neighbor 172.16.1.2 remote-as 65001
 neighbor 172.16.1.2 next-hop-self
IOU7(config-router)#do sh ip bgp
BGP table version is 13, local router ID is 1.1.3.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>i 192.168.20.0     172.16.1.1               0    100      0 65002 ?
 *>i 192.200.100.0/30 172.16.1.1               0    100      0 65002 ?
IOU7(config-router)#do ping 192.200.100.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.200.100.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 2/2/4 ms
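The behaviour shown in this section can be modelled in a few lines. This is a simplified sketch, not how IOS implements it: the function and parameter names are hypothetical, and it assumes the common default where a router sets itself as next hop on eBGP sessions (ignoring special cases such as third-party next hops on shared segments) and leaves the next hop untouched on iBGP sessions unless next-hop-self is configured.

```python
def advertised_next_hop(received_next_hop: str, own_address: str,
                        session: str, next_hop_self: bool = False) -> str:
    """Return the next hop a router places in an update it sends to a neighbour."""
    if session == "ebgp" or next_hop_self:
        # eBGP default, or iBGP with next-hop-self: advertise our own address.
        return own_address
    # iBGP default: pass the received next hop along unchanged.
    return received_next_hop


# IOU9 forwarding 192.168.20.0 to IOU7 over iBGP, addresses from the outputs above.
print(advertised_next_hop("192.200.100.2", "172.16.1.1", "ibgp"))
# 192.200.100.2 (IOU7 cannot reach this)
print(advertised_next_hop("192.200.100.2", "172.16.1.1", "ibgp", next_hop_self=True))
# 172.16.1.1 (IOU7 is directly connected to this)
```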
Peer Groups
A BGP peer group has two functions. First, it simplifies configuration: shared attributes are defined once on the group and neighbours are then added to it, so there is less configuration per neighbour. Second, the router checks the routing table only once and sends the resulting updates to all peer group members, instead of building the updates separately for each individual peer.
I have reconfigured BGP so that IOU7 and IOU9 act as ISP routers. Both are now connected to IOU12 via eBGP. A peer group has been configured on IOU12 so that the two routers in AS 65001 can be grouped together.
IOU12(config-router)#do sh run | sec bgp
router bgp 65002
 bgp log-neighbor-changes
 neighbor ISP-65001 peer-group
 neighbor ISP-65001 remote-as 65001
 neighbor 40.200.100.1 remote-as 65001
 neighbor 40.200.100.1 peer-group ISP-65001
 neighbor 192.200.100.1 remote-as 65001
 neighbor 192.200.100.1 peer-group ISP-65001
 !
 address-family ipv4
  redistribute connected
  neighbor 40.200.100.1 activate
  neighbor 192.200.100.1 activate
 exit-address-family
This does not look like much of a saving in configuration lines; however, with more peers and more BGP options per peer the saving can be considerable. In the output below from my Arista VXLAN BGP configuration on spine1, the EVPN peer group has five BGP options configured. Without the peer group, these five lines would have to be repeated for every EVPN leaf switch.
spine1#sh run sec bgp
router bgp 65000
   router-id 10.10.10.1
   distance bgp 20 200 200
   maximum-paths 4 ecmp 64
   neighbor EVPN peer group
   neighbor EVPN next-hop-unchanged
   neighbor EVPN update-source Loopback0
   neighbor EVPN ebgp-multihop 3
   neighbor EVPN send-community extended
   neighbor 10.10.10.11 peer group EVPN
   neighbor 10.10.10.11 remote-as 65111
   neighbor 10.10.10.11 description LEAF1
   neighbor 10.10.10.12 peer group EVPN
   neighbor 10.10.10.12 remote-as 65112
   neighbor 10.10.10.12 description LEAF2
   neighbor 10.10.10.13 peer group EVPN
   neighbor 10.10.10.13 remote-as 65113
   neighbor 10.10.10.13 description LEAF3
   neighbor 10.10.10.14 peer group EVPN
   neighbor 10.10.10.14 remote-as 65114
   neighbor 10.10.10.14 description LEAF4
   neighbor 10.10.10.15 peer group EVPN
   neighbor 10.10.10.15 remote-as 65115
   neighbor 10.10.10.15 description LEAF5
   redistribute connected
   !
   address-family evpn
      neighbor EVPN activate
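To put rough numbers on the saving, the sketch below compares how many configuration lines the shared options cost with and without a peer group. The function names and the counting rule are assumptions made for illustration: one line to define the group, one line per shared option, and one line per peer to join the group. Per-neighbour lines such as remote-as and description are the same in both cases and are left out.

```python
def lines_without_peer_group(peers: int, shared_options: int) -> int:
    """Every shared option has to be repeated for every peer."""
    return peers * shared_options


def lines_with_peer_group(peers: int, shared_options: int) -> int:
    """One group definition, each option once, plus one membership line per peer."""
    return 1 + shared_options + peers


# Five leaf switches and five shared options, as in the spine1 example.
print(lines_without_peer_group(5, 5), lines_with_peer_group(5, 5))    # 25 11
# The gap widens quickly as leaf switches are added.
print(lines_without_peer_group(50, 5), lines_with_peer_group(50, 5))  # 250 56
```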