Introduction

We've all been there, it's supposed to be a relatively simple change and then BOOM! Spanning tree topology change blows up the network :( There is a movement in the data centre space to push the layer 2 boundary down into the host to avoid the bandwidth waste of spanning tree link blocking and the nightmare failure scenarios of technologies to "prevent" spanning tree issues like MLAG and VSS.

Current data centre design utilizing MLAG.

linux-mlag-topology
Future data centre design utilizing routing on the host.

linux-routed-host-topology

Pete Lumbis of Cumulus Networks has written an excellent blog post describing the evolution of data centre design from legacy models to the routed host model. It's a quick read that covers the topic very well.

This post will cover enabling routing on the host by installing FRR on an Ubuntu 1604 host and configuring BGP peering with Cumulus Linux switches.

Free Range Routing (FRR) is an open source IP routing suite for linux. FRR is a fork of the Quagga project with over 130 contributors and support from vendors such as Big Switch and Cumulus Networks. FRR aims to implement seemless integration with the native linux network stacks and currently supports the BGP, OSPF, ISIS, RIP routing protocols with support for EIGRP on the way.

Topology

linux-roh-lab-topology

For reference the following software will be used in this post.

  • Ubuntu Minimal - 16.04
  • Free Range Routing - 3.0.2
  • Cumulus Linux - 3.4.3

Switch Configuration

First up I will configure the leaf switches, for this task I will use the network command line utiliy (NCLU).

leaf01

  
# Configure interfaces
sudo net add interface swp1 ipv6 nd ra-interval 5
sudo net del interface swp1 ipv6 nd suppress-ra 
sudo net add loopback lo ip address 10.2.0.1/32

# Configure BGP
sudo net add bgp autonomous-system 65201
sudo net add bgp router-id 10.2.0.1
sudo net add bgp bestpath as-path multipath-relax
sudo net add bgp bestpath compare-routerid
sudo net add bgp neighbor fabric peer-group
sudo net add bgp neighbor fabric remote-as external
sudo net add bgp neighbor fabric description Internal Fabric Network
sudo net add bgp neighbor fabric capability extended-nexthop
sudo net add bgp neighbor swp1 interface peer-group fabric
sudo net add bgp ipv4 unicast network 10.2.0.1/32
sudo net add bgp ipv6 unicast neighbor fabric activate

# Save configuration
sudo net commit
            

leaf02

  
# Configure interfaces
sudo net add interface swp1 ipv6 nd ra-interval 5
sudo net del interface swp1 ipv6 nd suppress-ra 
sudo net add loopback lo ip address 10.2.0.2/32

# Configure BGP
sudo net add bgp autonomous-system 65202
sudo net add bgp router-id 10.2.0.2
sudo net add bgp bestpath as-path multipath-relax
sudo net add bgp bestpath compare-routerid
sudo net add bgp neighbor fabric peer-group
sudo net add bgp neighbor fabric remote-as external
sudo net add bgp neighbor fabric description Internal Fabric Network
sudo net add bgp neighbor fabric capability extended-nexthop
sudo net add bgp neighbor swp1 interface peer-group fabric
sudo net add bgp ipv4 unicast network 10.2.0.2/32
sudo net add bgp ipv6 unicast neighbor fabric activate

# Save configuration
sudo net commit
            

FRR Installation

Alright, lets move onto the host machine. I had to do an apt update otherwise I would get an error when installing the required packages.

  
sudo apt update -y
            

Install the dependencies.

  
sudo apt install -y iproute libc-ares2
            

Download and install the FRR package.


# download
wget https://github.com/FRRouting/frr/releases/download/frr-3.0.2/frr_3.0.2-1-ubuntu16.04.1_amd64.deb

# install
sudo dpkg -i frr_3.0.2-1-ubuntu16.04.1_amd64.deb
            

FRR Configuration

Add the uplink interfaces to the /etc/network/interfaces config file.

  
auto eth1
iface eth1 inet manual

auto eth2
iface eth2 inet manual
            

Enable the routing daemons in the /etc/frr/daemons config file.

  
# Change no to yes
zebra=yes
bgpd=yes
            

Create a file called /etc/frr/frr.conf for the BGP configurations with the the following contents.

  
!
service integrated-vtysh-config
!
int lo
 ip address 10.3.0.1/32
 ip address 30.30.30.30/32
!
interface eth1
 ipv6 nd ra-interval 5
 no ipv6 nd suppress-ra
!
interface eth2
 ipv6 nd ra-interval 5
 no ipv6 nd suppress-ra
!
router bgp 65301
 bgp router-id 10.3.0.1
 bgp bestpath as-path multipath-relax
 bgp bestpath compare-routerid
 neighbor fabric peer-group
 neighbor fabric remote-as external
 neighbor fabric description Internal Fabric Network
 neighbor fabric capability extended-nexthop
 neighbor eth1 interface peer-group fabric
 neighbor eth2 interface peer-group fabric
 !
 address-family ipv4 unicast
  network 10.3.0.1/32
  network 30.30.30.30/32
  neighbor fabric prefix-list host-routes-out out
 exit-address-family
 !
 address-family ipv6 unicast
  neighbor fabric activate
 exit-address-family
!
ip prefix-list host-routes-out seq 100 permit 10.3.0.1/32
ip prefix-list host-routes-out seq 200 permit 30.30.30.30/32
ip prefix-list host-routes-out seq 300 deny 0.0.0.0/0 le 32
!
line vty
!
end
            

Restart the networking and frr services.


sudo systemctl restart networking.service
sudo systemctl restart frr.service
            

Did you notice that no IP addresses where configured for the BGP peering ? This is possible with the use of BGP unnumbered. BGP unnumbered; specified in rfc5549 allows for the advertising of an IPv4 route with an IPv6 next-hop. The above configuration uses IPv6 dynamic link local neighbor discovery for the BGP peering address. More info can be found here and here.

Alright, that's all the config out of the way lets move onto verifing the operation.

Verification

Verify BGP peering is up and prefixes are learned

  
host01# show bgp ipv4 unicast summary

# output
BGP router identifier 10.3.0.1, local AS number 65301 vrf-id 0
BGP table version 4
RIB entries 7, using 952 bytes of memory
Peers 2, using 42 KiB of memory
Peer groups 1, using 72 bytes of memory

Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
eth1            4      65201      62      62        0    0    0 00:02:51            1
eth2            4      65202      61      61        0    0    0 00:02:50            1

Total number of neighbors 2
            

Confirm the prefixes being learned via BGP.

  
host01# show ip route

# output
Codes: K - kernel route, C - connected, S - static, R - RIP,
        O - OSPF, I - IS-IS, B - BGP, P - PIM, N - NHRP, T - Table,
        v - VNC, V - VNC-Direct,
        > - selected route, * - FIB route

K>* 0.0.0.0/0 via 192.168.121.1, eth0
B>* 10.2.0.1/32 [20/0] via fe80::2ab7:adff:fe23:355e, eth1, 00:06:24
B>* 10.2.0.2/32 [20/0] via fe80::2ab7:adff:fe72:e8e1, eth2, 00:06:23
C>* 10.3.0.1/32 is directly connected, lo
C>* 30.30.30.30/32 is directly connected, lo
C>* 192.168.121.0/24 is directly connected, eth0
            

Good, we are learning the loopback IP addresses of leaf01/2 which is what we expect.

As a final test, lets check that we are learning the loopback addresses from host01 on leaf01.

  
vagrant@leaf01:~$ sudo net show route
show ip route
=============
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel,
       > - selected route, * - FIB route

K>* 0.0.0.0/0 via 192.168.121.1, eth0
C>* 10.2.0.1/32 is directly connected, lo
B>* 10.3.0.1/32 [20/0] via fe80::2ab7:adff:fe30:c4, swp1, 03:13:24
B>* 30.30.30.30/32 [20/0] via fe80::2ab7:adff:fe30:c4, swp1, 03:13:24
C>* 192.168.121.0/24 is directly connected, eth0
            

leaf01 is learning the loopback addresses advertised by host01. That looks like success to me !

Summary

Routing on the host is a nice solution to those pesky layer 2 problems we get in the networking world and with the rise of micro-service architectures it makes alot of sense as a data centre design choice.

Links

https://frrouting.org/
https://docs.cumulusnetworks.com/display/ROH/Routing+on+the+Host
https://github.com/FRRouting/frr/releases
https://tools.ietf.org/html/rfc5549






















Published: 2018-01-05