One of the earliest forms of virtualization in computing was in the network. VLANs allowed for logical segmentation of the network. This worked well for a long time, but with the demands of cloud computing, this virtualization needed to scale further. To scale, the network needed to grow beyond the 4096 VLAN IDs that the 12-bit VLAN tag allows. There are three technologies that have allowed the network to scale into millions of virtual networks.
Spine & Leaf
Spine and Leaf networks are designed to minimize the complexity of the network, minimize the features on board each box, and maximize bandwidth capacity. The goal is to create a non-contending network with a well-defined oversubscription model.
What this means in practice: take a 1RU switch with 48 x 10G ports and 6 x 100G ports. You can connect all 48 10G ports to servers and run all six 100G ports upstream to the spine switches. That gives 480G toward the servers and 600G toward the spines, an uplink-to-downlink ratio of 1.25 (an oversubscription ratio of 0.8:1), so 25% more bandwidth is available upstream than down to the servers. This should allow all traffic to be non-contending. This topology provides the method and bandwidth delivery needed to achieve cloud scaling.
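To make the arithmetic concrete, here is a minimal Python sketch of the ratio for the switch described above. The function name oversubscription_ratio is my own, chosen for illustration. It divides server-facing bandwidth by spine-facing bandwidth, so a result at or below 1 means the uplinks are not oversubscribed.

```python
from fractions import Fraction


def oversubscription_ratio(down_ports, down_gbps, up_ports, up_gbps):
    """Return server-facing bandwidth divided by spine-facing bandwidth."""
    downstream = down_ports * down_gbps
    upstream = up_ports * up_gbps
    return Fraction(downstream, upstream)


# 48 x 10G to the servers, 6 x 100G to the spines
ratio = oversubscription_ratio(48, 10, 6, 100)

print(f"downstream: {48 * 10}G, upstream: {6 * 100}G")
print(f"oversubscription (down:up) = {float(ratio):.2f}:1")
print(f"uplink headroom = {float(1 / ratio - 1):.0%}")
```

Plugging in a more typical leaf, say 48 x 25G down and 6 x 100G up, gives a ratio of 2:1, which is where contention can start to appear.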
VXLAN
VXLAN is the overlay technology that allows Layer 2 domains to be bridged over a routed IP network. It is a tunneling technology: VXLAN wraps the original Ethernet frame in four headers (RFC 7348), an outer MAC header, an outer IP header, an outer UDP header, and the VXLAN header itself, to forward traffic over a routed network. The outer headers direct the packet from the local switch to the peering tunnel endpoint (VTEP). The routing to these endpoints can be done with IS-IS, OSPF, or BGP, but what is important to understand is that the peers need to discover each other. This is how switches indicate they are interested in participating in the Layer 2 domain.
router# show vxlan
Vlan    VN-Segment
====    ==========
411     411000
500     50000
511     511000
711     711000
1001    2001001
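As a rough illustration of the encapsulation, this Python sketch packs the 8-byte VXLAN header laid out in RFC 7348 (an 8-bit flags field with the I flag set, 24 reserved bits, the 24-bit VNI, and 8 reserved bits), using VNI 511000 from the listing above. The function names are hypothetical, and the overhead total assumes an IPv4 underlay with no 802.1Q tag on the outer frame.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag from RFC 7348


def pack_vxlan_header(vni):
    """Build the 8-byte header: flags(8) reserved(24) VNI(24) reserved(8)."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header):
    """Extract the VNI from an 8-byte VXLAN header."""
    _flags_word, vni_word = struct.unpack("!II", header)
    return vni_word >> 8


header = pack_vxlan_header(511000)
print(header.hex())
print(unpack_vni(header))

# Bytes added in front of the original frame (assumed: IPv4 underlay, untagged)
OVERHEAD = {"outer Ethernet": 14, "outer IPv4": 20, "outer UDP": 8, "VXLAN": 8}
print(sum(OVERHEAD.values()), "bytes of encapsulation overhead")
```

That overhead is the reason the underlay MTU has to be larger than the MTU offered to the servers.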
This was initially done with multicast, which makes sense. Multicast is based on signaling up the tree that you are interested in joining a traffic group. VXLAN virtual interfaces (NVE) map each VNI (VXLAN Network Identifier) to a multicast group. This can also now be accomplished with the BGP EVPN address family, which has the added feature of BGP traffic controls.
This segmentation technology eliminates costly spanning tree across switches and simplifies the configuration local to each switch. VLANs become locally significant; VNIs are mapped across the fabric as the unique namespace, which has ~16 million values (24 bits). This scales Layer 2 virtually over an IP fabric.
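The numbers behind that claim are easy to check. The short Python sketch below compares the two ID spaces and shows what "locally significant" means in practice. The two leaf mappings are made up for illustration; only VNIs 511000 and 711000 come from the listing earlier.

```python
VLAN_ID_BITS = 12  # 802.1Q VLAN ID field
VNI_BITS = 24      # VXLAN Network Identifier field

vlan_ids = 2**VLAN_ID_BITS
vnis = 2**VNI_BITS
print(vlan_ids, vnis, vnis // vlan_ids)

# Hypothetical VLAN-to-VNI maps on two different leaf switches
leaf_a = {511: 511000, 711: 711000}
leaf_b = {20: 511000, 30: 711000}


def same_segment(map_a, vlan_a, map_b, vlan_b):
    """Two local VLANs share a Layer 2 segment if they map to the same VNI."""
    return map_a[vlan_a] == map_b[vlan_b]


print(same_segment(leaf_a, 511, leaf_b, 20))
print(same_segment(leaf_a, 511, leaf_b, 30))
```

The VLAN number only matters on the switch where it is configured; the VNI is what ties the segment together across the fabric.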
EVPN
EVPN (Ethernet VPN) is an address family of BGP. VXLAN allowed a Layer 2 domain to be sent over a routed network but lacked a control plane to control routes. The control and data planes were shared by the underlying IP fabric and multicast groups. With BGP EVPN, VXLAN performs the data plane transactions while BGP advertises the various route types. This allows multi-tenant environments to be scaled while ensuring segmentation. EVPN allows tenants to be mapped into VRFs (IP-VRF or MAC-VRF). VRFs are local, logical segmentations.
To follow the logical segmentation, these are the important points to remember:
- VLANs are configured with a VNI
- VNIs are associated with multicast groups or BGP
- VNIs carry both a Route Distinguisher and a Route Target
- The Route Distinguisher identifies the VRF and keeps its routes separate from other routing tables on the switch
- The Route Target is the import/export policy, tied to the VRF's own or another VRF's routes
- BGP is the control plane for the routes:
  - MAC routes
  - IP address routes
  - Prefix routes
- There are separate routing tables for all of these.
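The split between Route Distinguisher and Route Target in the list above can be sketched in a few lines of Python. This is a toy model, not how a switch implements it: the Vrf class, both function names, and the second and third VRFs are invented for illustration. Only the RD 10.200.0.1:511 and the RT 65201:511000 appear in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Vrf:
    name: str
    rd: str
    import_rts: set
    export_rts: set
    table: dict = field(default_factory=dict)


def export_route(vrf, prefix, next_hop):
    """Tag a route with the VRF's RD (uniqueness) and export RTs (policy)."""
    return {"rd": vrf.rd, "prefix": prefix, "next_hop": next_hop,
            "rts": set(vrf.export_rts)}


def import_route(vrf, route):
    """Install the route only if one of its RTs is in the import list."""
    if vrf.import_rts & route["rts"]:
        vrf.table[(route["rd"], route["prefix"])] = route["next_hop"]
        return True
    return False


leaf1 = Vrf("tenant-a", "10.200.0.1:511", {"65201:511000"}, {"65201:511000"})
leaf2 = Vrf("tenant-a", "10.200.0.2:511", {"65201:511000"}, {"65201:511000"})
other = Vrf("tenant-b", "10.200.0.2:711", {"65201:711000"}, {"65201:711000"})

route = export_route(leaf1, "172.31.1.0/29", "10.200.200.1")
print(import_route(leaf2, route))  # same tenant, matching RT
print(import_route(other, route))  # different tenant, no matching RT
```

The RD keeps the same prefix from two tenants distinct in BGP, while the RT decides which VRFs actually install it.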
# VLAN to VNI mapping (MAC-VRF)
vlan 511
  vn-segment 511000
# Layer 3 interface to IP-VRF mapping
interface Vlan511
  no shutdown
  mtu 9192
  vrf member dia-vdc
  ip address 172.31.1.2/29 tag 50000
# MAC-VRF config
evpn
  vni 511000 l2
    rd 10.200.0.1:511
    route-target import auto
    route-target export auto
# IP-VRF config
vrf context dia-vdc
  vni 50000
  rd 65101:50000
  address-family ipv4 unicast
    route-target both auto evpn
# VXLAN encapsulation config
interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback1
  global ingress-replication protocol bgp
  member vni 50000 associate-vrf   # This is the IP-VRF
  member vni 511000   # This is the MAC-VRF
    ingress-replication protocol bgp   # BGP to pass MAC address routes
# Show BGP info for MAC-VRF
router# show bgp evi 511000
-----------------------------------------------
  L2VNI ID                     : 511000 (L2-511000)
  RD                           : 10.200.0.1:511
  Prefixes (local/total)       : 1/2
  Created                      : Mar 15 20:05:06.485798
  Last Oper Up/Down            : Mar 15 20:05:06.551859 / never
  Enabled                      : Yes
  Associated IP-VRF            : dia-vdc
  Active Export RT list        :
        65201:511000
  Active Import RT list        :
        65201:511000
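The auto keyword in the config leaves the route target to the switch. The output above is consistent with a value built as ASN:VNI, and this small Python sketch works on that assumption. The AS number 65201 is read off the output rather than from the config, the helper name is hypothetical, and 4-byte AS numbers are not covered here.

```python
def auto_route_target(asn, vni):
    """Assumed auto-derivation: 2-byte ASN, a colon, then the VNI."""
    if not 0 < asn < 2**16:
        raise ValueError("this sketch only covers 2-byte AS numbers")
    return f"{asn}:{vni}"


LOCAL_ASN = 65201  # assumption: inferred from the show output

# The L2 VNI and the L3 VNI used in the config
for vni in (511000, 50000):
    print(vni, auto_route_target(LOCAL_ASN, vni))
```

Because every switch in the same AS derives the same value for a given VNI, the import and export lists line up without any per-switch route-target configuration.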
# BGP info for IP-VRF
router# show bgp evi 50000
-----------------------------------------------
  L3VNI ID                     : 50000 (L3-50000)
  RD                           : 65101:50000
  Prefixes (local/total)       : 2/2
  Created                      : Mar 16 15:58:31.359963
  Last Oper Up/Down            : Mar 16 15:58:31.360000 / never
  Enabled                      : Yes
  Associated IP-VRF            : dia-vdc
  Address-family IPv4 Unicast
    Active Export RT list        :
        65201:50000
    Active Import RT list        :
        65201:50000
    Active EVPN Export RT list   :
        65201:50000
    Active EVPN Import RT list   :
        65201:50000
    Active MVPN Export RT list   :
        65201:50000
    Active MVPN Import RT list   :
        65201:50000
  Address-family IPv6 Unicast
    Active Export RT list        :
        65201:50000
    Active Import RT list        :
        65201:50000
    Active EVPN Export RT list   :
        65201:50000
    Active EVPN Import RT list   :
        65201:50000
    Active MVPN Export RT list   :
        65201:50000
    Active MVPN Import RT list   :
# Route table for MAC-VRF
router# show bgp l2vpn evpn vni-id 511000
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 123, Local Router ID is 10.200.0.1
   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 10.200.0.1:511    (L2VNI 511000)
*>l[3]:[0]:[32]:[10.200.200.1]/88
                      10.200.200.1                      100      32768 i
*>e[3]:[0]:[32]:[10.200.200.2]/88
                      10.200.200.2                                   0 65101 65202 i
# Route table for IP-VRF
router# show bgp l2vpn evpn vrf dia-vdc
Route Distinguisher: 65101:50000    (L3VNI 50000)
*>l[5]:[0]:[0]:[28]:[192.168.1.0]/224
                      10.200.200.1             0        100      32768 ?
*>l[5]:[0]:[0]:[29]:[172.31.1.0]/224
                      10.200.200.1             0        100      32768 ?
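The bracketed route keys in these tables are easier to read once you know the first field is the EVPN route type. Here is a minimal Python sketch that pulls the fields apart and names the type. The type names come from RFC 7432 (types 1 to 4) and RFC 9136 (type 5); the parser itself is hypothetical and makes no attempt to interpret the remaining fields or the number after the slash.

```python
import re

EVPN_ROUTE_TYPES = {
    1: "Ethernet Auto-Discovery",
    2: "MAC/IP Advertisement",
    3: "Inclusive Multicast Ethernet Tag",
    4: "Ethernet Segment",
    5: "IP Prefix",
}


def parse_route_key(key):
    """Split a key such as '*>l[3]:[0]:[32]:[10.200.200.1]/88' into parts."""
    fields = re.findall(r"\[([^\]]*)\]", key)
    if not fields:
        raise ValueError(f"no bracketed fields in {key!r}")
    route_type = int(fields[0])
    return {
        "type": route_type,
        "name": EVPN_ROUTE_TYPES.get(route_type, "unknown"),
        "fields": fields[1:],               # left uninterpreted on purpose
        "length": int(key.rsplit("/", 1)[1]),
    }


for key in ("*>l[3]:[0]:[32]:[10.200.200.1]/88",
            "*>l[5]:[0]:[0]:[29]:[172.31.1.0]/224"):
    parsed = parse_route_key(key)
    print(parsed["type"], parsed["name"], parsed["fields"], parsed["length"])
```

Read this way, the MAC-VRF table above holds type 3 routes and the IP-VRF table holds type 5 prefix routes.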
There are several RFCs that explain this better than I can here, and an interesting Facebook whitepaper about BGP as the single protocol in the datacenter.
I hope this brief overview helps in your journey to understand more about the tech available to us today.
RFCs:
