Introduction to NSX-T Routing Logic

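An earlier article, "Introduction to the NSX Distributed Logical Router", briefly covered the distributed logical router in NSX-V (NSX for vSphere). NSX-V supports only the vSphere platform and depends heavily on VMware vCenter. NSX-T is designed for heterogeneous virtualization platforms and multi-hypervisor environments: besides vSphere, it also supports KVM, Docker, Kubernetes, and other platforms. For now, NSX-T looks like VMware's main direction for future investment. Another article explains the differences between NSX-V and NSX-T very thoroughly, but it is based on NSX-T 2.3; the current version is already 3.0, and some of its content no longer applies.

NSX-T implements routing quite differently from NSX-V. This post briefly introduces the logical router concepts on the NSX-T platform.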

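In NSX-T, logical routers come in two kinds: Tier-0 gateways and Tier-1 gateways. A Tier-0 gateway connects the NSX-T virtual network to external networks and mainly handles north-south routing. A Tier-1 gateway handles east-west routing between segments (a Segment was called a Logical Switch in earlier versions; it is a virtual Layer 2 network). Conceptually, a Tier-1 gateway corresponds to the DLR (Distributed Logical Router) in NSX-V, and a Tier-0 gateway corresponds to the ESG (Edge Services Gateway) in NSX-V.

A typical deployment is shown in the figure below, taken from the official VMware blog: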

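As described previously, when a DLR in NSX-V needs to run a dynamic routing protocol, a Control VM must be created to act as the unified control plane for that DLR. In NSX-T, both Tier-0 and Tier-1 gateways are logical routers, and each consists of an SR (Service Router) and a DR (Distributed Router). The DR is distributed across the transport nodes of the relevant transport zone, while the SR is deployed on Edge nodes. An SR instance is created only when a service that cannot be distributed is enabled on the logical router, for example connectivity to external networks, NAT, DHCP, or load balancing.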

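For a Tier-1 gateway, the DR component is created on all transport nodes as soon as a segment is connected to the logical router. The SR component is created on the Edge nodes when the gateway is configured with DNAT, edge firewall, load balancing, or similar functions.
The DR and SR are connected through a transit segment that NSX-T creates automatically, using addresses from 169.254.0.0/24. The transit subnet between Tier-1 and Tier-0 is also created automatically by NSX-T, using addresses from 100.64.0.0/16.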
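
To make the two address pools easier to recognise in the forwarding tables later in this post, here is a minimal Python sketch using the standard `ipaddress` module. The names `classify_transit` and `link_endpoints` are hypothetical helpers invented for illustration; only the two prefixes come from the text, and the /31 example link is the one that appears in the lab output further down.

```python
import ipaddress
from typing import List

# Pools quoted in the text above; the constant names are illustrative only.
INTRA_TIER_POOL = ipaddress.ip_network("169.254.0.0/24")  # DR <-> SR transit
INTER_TIER_POOL = ipaddress.ip_network("100.64.0.0/16")   # Tier-1 <-> Tier-0 transit


def classify_transit(address: str) -> str:
    """Return which auto-created transit pool an IPv4 address falls into."""
    ip = ipaddress.ip_address(address)
    if ip in INTRA_TIER_POOL:
        return "intra-tier (DR-SR)"
    if ip in INTER_TIER_POOL:
        return "inter-tier (T1-T0)"
    return "not a transit address"


def link_endpoints(prefix: str) -> List[str]:
    """List every address in a small link prefix (two addresses for a /31)."""
    # Iterating a network yields all of its addresses, including both ends of a /31.
    return [str(ip) for ip in ipaddress.ip_network(prefix)]


if __name__ == "__main__":
    print(classify_transit("169.254.0.2"))
    print(classify_transit("100.64.80.1"))
    print(link_endpoints("100.64.80.0/31"))
```

This only classifies addresses; it does not say how NSX-T actually allocates links from the pools.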

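For a Tier-0 gateway, the DR component is created on the required transport nodes when a Tier-1 gateway connects to it. The SR component is created on the Edge nodes when the Tier-0 gateway is connected to an external network or configured with services such as DNAT or edge firewall.

No dynamic routing protocol runs between Tier-1 and Tier-0 gateways; NSX-T automatically builds static routes to forward traffic between them.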
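
The effect of these automatically built static routes can be modelled in a few lines of Python. This is a sketch under stated assumptions, not NSX-T's actual algorithm: the function name `auto_static_routes` is made up, and the choice that Tier-0 takes the first address of the /31 link and Tier-1 the second simply mirrors the lab output shown later in this post. It also assumes the Tier-1 segments are advertised to Tier-0.

```python
import ipaddress
from typing import Dict, List, Tuple

Route = Tuple[str, str]  # (destination prefix, next hop)


def auto_static_routes(link: str, t1_segments: List[str]) -> Dict[str, List[Route]]:
    """Model the static routes installed on each side of a Tier-1/Tier-0 link.

    Assumption for illustration: `link` is a /31, the Tier-0 side holds the
    first address and the Tier-1 side holds the second.
    """
    t0_ip, t1_ip = [str(ip) for ip in ipaddress.ip_network(link)]
    return {
        # Tier-1 only needs a default route pointing at Tier-0.
        "tier1": [("0.0.0.0/0", t0_ip)],
        # Tier-0 needs one route per Tier-1 segment, via the Tier-1 side.
        "tier0": [(segment, t1_ip) for segment in t1_segments],
    }


if __name__ == "__main__":
    routes = auto_static_routes(
        "100.64.80.0/31", ["192.168.0.0/24", "192.168.1.0/24"]
    )
    for side, entries in routes.items():
        for prefix, next_hop in entries:
            print(f"{side}: {prefix} via {next_hop}")
```

With the lab values, the result matches the `192.168.0.0/24 via 100.64.80.1` and `0.0.0.0/0 via 100.64.80.0` entries visible in the tables below.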

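The SR components described above do not exist as virtual machines; they are independent services inside the Edge nodes. These Edge nodes form an Edge cluster that provides high availability in Active-Active or Active-Standby mode. An Edge node acts as a transport node in both the Overlay transport zone and the VLAN transport zone, and forwards traffic between the Overlay network and the VLAN network.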

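In my lab I found that as soon as a Tier-1 gateway is associated with an Edge cluster, the corresponding SR component is created even when no NAT or other service is configured, as in the operation shown below:

This is because the gateway firewall is enabled by default, so the SR component is created automatically once the Edge cluster is associated. See this article for reference.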
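
The creation rules described so far can be summarised as a toy model. The class `Tier1Gateway`, its fields, and the cluster name `edge-cluster-01` are all hypothetical; the sketch only encodes the behaviour described in this post (DR appears when a segment is attached, SR appears once an Edge cluster is associated because the gateway firewall is on by default) and is not a description of NSX-T internals.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Tier1Gateway:
    """Toy model of a Tier-1 gateway; all names here are illustrative."""
    name: str
    segments: List[str] = field(default_factory=list)
    edge_cluster: Optional[str] = None
    services: Set[str] = field(default_factory=set)

    def effective_services(self) -> Set[str]:
        # Centralized services can only run once an Edge cluster is attached.
        if self.edge_cluster is None:
            return set()
        # Per the observation above, the gateway firewall is enabled by default.
        return set(self.services) | {"GATEWAY_FIREWALL"}

    def components(self) -> List[str]:
        parts = []
        if self.segments:
            parts.append("DR")  # created on transport nodes
        if self.effective_services():
            parts.append("SR")  # created on Edge nodes
        return parts


if __name__ == "__main__":
    gw = Tier1Gateway("T1-GW-01", segments=["192.168.0.0/24"])
    print(gw.components())
    gw.edge_cluster = "edge-cluster-01"  # hypothetical cluster name
    print(gw.components())
```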

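The network topology of my lab environment is shown in the figure:

Without an Edge cluster associated with the Tier-1 gateways, the simplified structure is shown in the figure:

When VM t1 accesses an external network, the packet first reaches the DR of T1-GW-01 on the ESXi host where t1 runs. At this point, the routing table of DR-T1-GW-01 is: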

localhost> get logical-router 9ddc145a-f076-4d2e-9280-bebdbeb250de forwarding
Logical Routers Forwarding Table
--------------------------------------------------------------------------------------------------------------
Flags Legend: [U: Up], [G: Gateway], [C: Connected], [I: Interface]
[H: Host], [R: Reject], [B: Blackhole], [F: Soft Flush], [E: ECMP]

Network Gateway Type Interface UUID
==============================================================================================================
0.0.0.0/0 100.64.80.0 UGE 85e139ef-0f86-434f-8862-63e421c3e738
100.64.80.0/31 0.0.0.0 UCI 85e139ef-0f86-434f-8862-63e421c3e738
169.254.0.0/28 0.0.0.0 UCI 45fbe9fd-85aa-481f-adf0-372ce979198c
192.168.0.0/24 0.0.0.0 UCI 53e3d461-7fb2-4fba-a495-4e51dbcdccf1
192.168.1.0/24 0.0.0.0 UCI aac47f48-4233-424a-8503-6e41801e86f8
::/0 fc15:7264:6244:2800::1 UGE 85e139ef-0f86-434f-8862-63e421c3e738
fc15:7264:6244:2800::/64 :: UCI 85e139ef-0f86-434f-8862-63e421c3e738
fe80::50:56ff:fe56:5300/128 :: UCI 45fbe9fd-85aa-481f-adf0-372ce979198c
fe80:820:100:0:50:56ff:fe56:4455/128 :: UCI 85e139ef-0f86-434f-8862-63e421c3e738
fe80:c20:100:0:50:56ff:fe56:4452/128 :: UCI 45fbe9fd-85aa-481f-adf0-372ce979198c
ff02:820:100::1:ff00:2/128 :: UCI 85e139ef-0f86-434f-8862-63e421c3e738
ff02:820:100::1:ff56:4455/128 :: UCI 85e139ef-0f86-434f-8862-63e421c3e738
ff02:c20:100::1:ff56:4452/128 :: UCI 45fbe9fd-85aa-481f-adf0-372ce979198c
ff02:c20:100::1:ff56:5300/128 :: UCI 45fbe9fd-85aa-481f-adf0-372ce979198c

The packet therefore follows the default route and reaches interface 100.64.80.0/31 of DR-T0-GW-01.
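
The route selection in this step is ordinary longest-prefix matching. As a minimal sketch, the IPv4 rows of the DR-T1-GW-01 table above can be fed into a small lookup function. The function name `lookup` is hypothetical and `8.8.8.8` is just an arbitrary external address chosen for the example.

```python
import ipaddress
from typing import List, Optional, Tuple

# IPv4 rows copied from the DR-T1-GW-01 table above: (network, gateway).
DR_T1_ROUTES: List[Tuple[str, str]] = [
    ("0.0.0.0/0", "100.64.80.0"),
    ("100.64.80.0/31", "0.0.0.0"),
    ("169.254.0.0/28", "0.0.0.0"),
    ("192.168.0.0/24", "0.0.0.0"),
    ("192.168.1.0/24", "0.0.0.0"),
]


def lookup(routes: List[Tuple[str, str]], destination: str) -> Optional[Tuple[str, str]]:
    """Return the (network, gateway) entry with the longest matching prefix."""
    dst = ipaddress.ip_address(destination)
    best = None
    for network, gateway in routes:
        net = ipaddress.ip_network(network)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway)
    return None if best is None else (str(best[0]), best[1])


if __name__ == "__main__":
    # An external destination only matches the default route.
    print(lookup(DR_T1_ROUTES, "8.8.8.8"))
    # A destination on a connected segment matches the more specific /24.
    print(lookup(DR_T1_ROUTES, "192.168.1.20"))
```

A gateway of `0.0.0.0` marks a directly connected network, so traffic between the two segments never leaves the DR, while anything else is sent to 100.64.80.0.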

The routing table of DR-T0-GW-01 at this point:

localhost> get logical-router d70e64df-c577-473b-ac7a-89665d623a57 forwarding
Logical Routers Forwarding Table
--------------------------------------------------------------------------------------------------------------
Flags Legend: [U: Up], [G: Gateway], [C: Connected], [I: Interface]
[H: Host], [R: Reject], [B: Blackhole], [F: Soft Flush], [E: ECMP]

Network Gateway Type Interface UUID
==============================================================================================================
0.0.0.0/0 169.254.0.2 UG 33698a52-a0d4-4fd3-9b32-3f903f36d88c
10.10.10.0/24 169.254.0.2 UG 33698a52-a0d4-4fd3-9b32-3f903f36d88c
10.10.10.100/32 169.254.0.2 UGH 33698a52-a0d4-4fd3-9b32-3f903f36d88c
100.64.80.0/31 0.0.0.0 UCI 28d42bfc-81ad-4424-8dd4-f2b63d339bae
100.64.80.2/31 0.0.0.0 UCI 837fe706-17b9-429f-8049-963609181d9b
169.254.0.0/24 0.0.0.0 UCI 33698a52-a0d4-4fd3-9b32-3f903f36d88c
192.168.0.0/24 100.64.80.1 UG 28d42bfc-81ad-4424-8dd4-f2b63d339bae
192.168.1.0/24 100.64.80.1 UG 28d42bfc-81ad-4424-8dd4-f2b63d339bae
192.168.2.0/24 100.64.80.3 UG 837fe706-17b9-429f-8049-963609181d9b
::/0 fe80::50:56ff:fe56:5300 UG 33698a52-a0d4-4fd3-9b32-3f903f36d88c
fc15:7264:6244:2800::/64 :: UCI 28d42bfc-81ad-4424-8dd4-f2b63d339bae
fc15:7264:6244:2801::/64 :: UCI 837fe706-17b9-429f-8049-963609181d9b
fe80::50:56ff:fe56:5300/128 :: UCI 33698a52-a0d4-4fd3-9b32-3f903f36d88c
fe80:820:100:0:50:56ff:fe56:4452/128 :: UCI 28d42bfc-81ad-4424-8dd4-f2b63d339bae
fe80:920:100:0:50:56ff:fe56:4452/128 :: UCI 33698a52-a0d4-4fd3-9b32-3f903f36d88c
fe80:a20:100:0:50:56ff:fe56:4452/128 :: UCI 837fe706-17b9-429f-8049-963609181d9b
ff02:820:100::1:ff00:1/128 :: UCI 28d42bfc-81ad-4424-8dd4-f2b63d339bae
ff02:820:100::1:ff56:4452/128 :: UCI 28d42bfc-81ad-4424-8dd4-f2b63d339bae
ff02:920:100::1:ff56:4452/128 :: UCI 33698a52-a0d4-4fd3-9b32-3f903f36d88c
ff02:920:100::1:ff56:5300/128 :: UCI 33698a52-a0d4-4fd3-9b32-3f903f36d88c
ff02:a20:100::1:ff00:1/128 :: UCI 837fe706-17b9-429f-8049-963609181d9b
ff02:a20:100::1:ff56:4452/128 :: UCI 837fe706-17b9-429f-8049-963609181d9b

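The default route points to 169.254.0.2. That interface does not exist on the local ESXi host, so the packet travels through the transport node tunnel to interface 169.254.0.2/24 of SR-T0-GW-01 on the Edge node.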
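
When reading these host-side tables, it helps to expand the compact Type column. The sketch below splits one row of the output into its four columns and decodes the flags using the legend printed by the CLI. The function name `parse_row` is hypothetical, and the parser assumes rows are whitespace-separated exactly as shown above.

```python
from typing import Dict, List, Union

# Flag meanings taken from the "Flags Legend" printed by the CLI above.
FLAG_LEGEND = {
    "U": "Up", "G": "Gateway", "C": "Connected", "I": "Interface",
    "H": "Host", "R": "Reject", "B": "Blackhole", "F": "Soft Flush", "E": "ECMP",
}


def parse_row(line: str) -> Dict[str, Union[str, List[str]]]:
    """Split one forwarding-table row into named columns and decode its flags."""
    network, gateway, flags, interface = line.split()
    return {
        "network": network,
        "gateway": gateway,
        "flags": [FLAG_LEGEND[flag] for flag in flags],
        "interface": interface,
    }


if __name__ == "__main__":
    row = parse_row("0.0.0.0/0 169.254.0.2 UG 33698a52-a0d4-4fd3-9b32-3f903f36d88c")
    print(row["network"], "->", row["gateway"], row["flags"])
```

A row flagged Gateway without Connected, such as the default route here, has a next hop that is not local to the router, which is why the packet has to leave the host.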

The routing table of SR-T0-GW-01 is:

edge-01(tier0_sr)> get route

Flags: t0c - Tier0-Connected, t0s - Tier0-Static, b - BGP,
t0n - Tier0-NAT, t1s - Tier1-Static, t1c - Tier1-Connected,
t1n: Tier1-NAT, t1l: Tier1-LB VIP, t1ls: Tier1-LB SNAT,
t1d: Tier1-DNS FORWARDER, t1ipsec: Tier1-IPSec, isr: Inter-SR,
> - selected route, * - FIB route

Total number of routes: 11

t0s> * 0.0.0.0/0 [1/0] via 10.10.100.1, uplink-363, 00:00:30
t0c> * 10.10.10.0/24 is directly connected, uplink-363, 00:12:13
t0c> * 100.64.80.0/31 is directly connected, linked-361, 00:12:13
t0c> * 100.64.80.2/31 is directly connected, linked-371, 00:12:13
t0c> * 169.254.0.0/24 is directly connected, downlink-353, 00:12:13
t1c> * 192.168.0.0/24 [3/0] via 100.64.80.1, linked-361, 00:12:11
t1c> * 192.168.1.0/24 [3/0] via 100.64.80.1, linked-361, 00:12:11
t1c> * 192.168.2.0/24 [3/0] via 100.64.80.3, linked-371, 00:12:11
t0c> * fc15:7264:6244:2800::/64 is directly connected, linked-361, 00:12:13
t0c> * fc15:7264:6244:2801::/64 is directly connected, linked-371, 00:12:13
t0c> * fe80::/64 is directly connected, linked-361, 00:12:13

The packet therefore leaves through the uplink VLAN interface of SR-T0-GW-01 toward the external network.
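
The `get route` output on the Edge node has a regular shape, so it can be turned into structured data with a regular expression. The pattern below is a sketch written against the sample lines shown above only. It is not an official grammar for the NSX-T CLI, it handles only routes marked `> *` (selected and in the FIB), and the names `ROUTE_RE` and `parse_route` are hypothetical.

```python
import re
from typing import Dict, Optional

# Matches lines such as:
#   t0s> * 0.0.0.0/0 [1/0] via 10.10.100.1, uplink-363, 00:00:30
#   t0c> * 10.10.10.0/24 is directly connected, uplink-363, 00:12:13
ROUTE_RE = re.compile(
    r"^(?P<code>\w+)> \* (?P<prefix>\S+) "
    r"(?:\[(?P<distance>\d+)/(?P<metric>\d+)\] via (?P<next_hop>[^,\s]+)"
    r"|is directly connected), "
    r"(?P<interface>[\w-]+), (?P<age>[\d:]+)$"
)


def parse_route(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Parse one selected route line; return None for headers and other text."""
    match = ROUTE_RE.match(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    route = parse_route(
        "t0s> * 0.0.0.0/0 [1/0] via 10.10.100.1, uplink-363, 00:00:30"
    )
    print(route["code"], route["prefix"], route["next_hop"], route["interface"])
```

For directly connected routes the `next_hop`, `distance`, and `metric` fields come back as `None`.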

The overall process is shown in the figure:

The logical architecture diagram is as follows:

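After the two Tier-1 gateways T1-GW-01 and T1-GW-02 are associated with the Edge cluster, the overall structure becomes:

A packet from VM t1 to the external network still arrives at DR-T1-GW-01 first. The routing table of DR-T1-GW-01 is now: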

localhost> get logical-router 9ddc145a-f076-4d2e-9280-bebdbeb250de forwarding
Logical Routers Forwarding Table
--------------------------------------------------------------------------------------------------------------
Flags Legend: [U: Up], [G: Gateway], [C: Connected], [I: Interface]
[H: Host], [R: Reject], [B: Blackhole], [F: Soft Flush], [E: ECMP]

Network Gateway Type Interface UUID
==============================================================================================================
0.0.0.0/0 169.254.0.2 UG 45fbe9fd-85aa-481f-adf0-372ce979198c
169.254.0.0/28 0.0.0.0 UCI 45fbe9fd-85aa-481f-adf0-372ce979198c
192.168.0.0/24 0.0.0.0 UCI 53e3d461-7fb2-4fba-a495-4e51dbcdccf1
192.168.1.0/24 0.0.0.0 UCI aac47f48-4233-424a-8503-6e41801e86f8
::/0 fe80::50:56ff:fe56:5300 UG 45fbe9fd-85aa-481f-adf0-372ce979198c
fe80::50:56ff:fe56:5300/128 :: UCI 45fbe9fd-85aa-481f-adf0-372ce979198c
fe80:c20:100:0:50:56ff:fe56:4452/128 :: UCI 45fbe9fd-85aa-481f-adf0-372ce979198c
ff02:c20:100::1:ff56:4452/128 :: UCI 45fbe9fd-85aa-481f-adf0-372ce979198c
ff02:c20:100::1:ff56:5300/128 :: UCI 45fbe9fd-85aa-481f-adf0-372ce979198c

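The default route is 169.254.0.2, so the packet travels through the transport node tunnel to the 169.254.0.2/24 interface of SR-T1-GW-01 on the Edge node. The routing table of SR-T1-GW-01 is: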

edge-02(tier1_sr)> get forwarding
Logical Router
UUID VRF LR-ID Name Type
970176ea-8d00-4ceb-92f5-9cd05c8860c0 27 20 SR-T1-GW-01 SERVICE_ROUTER_TIER1
IPv4 Forwarding Table
IP Prefix Gateway IP Type UUID Gateway MAC
0.0.0.0/0 100.64.80.0 route 85e139ef-0f86-434f-8862-63e421c3e738 02:50:56:56:44:52
100.64.80.0/31 route 85e139ef-0f86-434f-8862-63e421c3e738
100.64.80.1/32 route c09945ce-2748-5413-9fc7-e4a7280126b5
127.0.0.1/32 route 80bd72cf-40d8-4a30-8472-fed4a7f61695
169.254.0.0/28 route befd48ee-0bc1-4cf6-8416-39f5a3ed8911
169.254.0.1/32 route 3f66a5a0-5f8e-5d63-b574-f5a3fbe6d6b3
169.254.0.2/32 route c09945ce-2748-5413-9fc7-e4a7280126b5
192.168.0.0/24 route 53e3d461-7fb2-4fba-a495-4e51dbcdccf1
192.168.0.1/32 route 3f66a5a0-5f8e-5d63-b574-f5a3fbe6d6b3
192.168.1.0/24 route aac47f48-4233-424a-8503-6e41801e86f8
192.168.1.1/32 route 3f66a5a0-5f8e-5d63-b574-f5a3fbe6d6b3
IPv6 Forwarding Table
IP Prefix Gateway IP Type UUID Gateway MAC
::/0 fc15:7264:6244:2800::1 route 85e139ef-0f86-434f-8862-63e421c3e738
::1/128 route 80bd72cf-40d8-4a30-8472-fed4a7f61695
fc15:7264:6244:2800::/64 route 85e139ef-0f86-434f-8862-63e421c3e738
fc15:7264:6244:2800::2/128 route c09945ce-2748-5413-9fc7-e4a7280126b5

The packet follows the default route via 100.64.80.0 and reaches interface 100.64.80.0/31 of DR-T0-GW-01. The routing table of DR-T0-GW-01 is:

edge-02(vrf)> get forwarding
Logical Router
UUID VRF LR-ID Name Type
d70e64df-c577-473b-ac7a-89665d623a57 21 10 DR-T0-GW-01 DISTRIBUTED_ROUTER_TIER0
IPv4 Forwarding Table
IP Prefix Gateway IP Type UUID Gateway MAC
0.0.0.0/0 169.254.0.2 route 33698a52-a0d4-4fd3-9b32-3f903f36d88c 02:50:56:56:53:00
10.44.204.0/22 169.254.0.2 route 33698a52-a0d4-4fd3-9b32-3f903f36d88c 02:50:56:56:53:00
10.44.205.85/32 169.254.0.2 route 33698a52-a0d4-4fd3-9b32-3f903f36d88c 02:50:56:56:53:00
10.44.205.86/32 169.254.0.2 route 33698a52-a0d4-4fd3-9b32-3f903f36d88c 02:50:56:56:53:00
100.64.80.0/32 route 76e6ae82-1527-5af6-a14c-6ee80dd52125
100.64.80.0/31 route 28d42bfc-81ad-4424-8dd4-f2b63d339bae
100.64.80.2/32 route 76e6ae82-1527-5af6-a14c-6ee80dd52125
100.64.80.2/31 route 837fe706-17b9-429f-8049-963609181d9b
169.254.0.0/24 route 33698a52-a0d4-4fd3-9b32-3f903f36d88c
169.254.0.1/32 route 76e6ae82-1527-5af6-a14c-6ee80dd52125
192.168.0.0/24 100.64.80.1 route 28d42bfc-81ad-4424-8dd4-f2b63d339bae
192.168.1.0/24 100.64.80.1 route 28d42bfc-81ad-4424-8dd4-f2b63d339bae
192.168.2.0/24 100.64.80.3 route 837fe706-17b9-429f-8049-963609181d9b
IPv6 Forwarding Table
IP Prefix Gateway IP Type UUID Gateway MAC
::/0 fe80::50:56ff:fe56:5300 route 33698a52-a0d4-4fd3-9b32-3f903f36d88c
fc15:7264:6244:2800::/64 route 28d42bfc-81ad-4424-8dd4-f2b63d339bae
fc15:7264:6244:2800::1/128 route 76e6ae82-1527-5af6-a14c-6ee80dd52125
fc15:7264:6244:2801::/64 route 837fe706-17b9-429f-8049-963609181d9b
fc15:7264:6244:2801::1/128 route 76e6ae82-1527-5af6-a14c-6ee80dd52125

It then follows the default route to SR-T0-GW-01. Let us look at the routing table of SR-T0-GW-01 again:

edge-01(tier0_sr)> get forwarding
Logical Router
UUID VRF LR-ID Name Type
6ff30c16-57f7-4a8b-a43d-0b207e38e798 11 16 SR-T0-GW-01 SERVICE_ROUTER_TIER0
IPv4 Forwarding Table
IP Prefix Gateway IP Type UUID Gateway MAC
0.0.0.0/0 10.44.204.1 route baa91ad9-2afb-47dd-805c-f0c5e9f32886
10.10.10.0/22 route baa91ad9-2afb-47dd-805c-f0c5e9f32886
10.10.10.100/32 route a1dd8b2c-75e9-59a6-8d9c-c618d6e8df48
100.64.80.0/32 route 76e6ae82-1527-5af6-a14c-6ee80dd52125
100.64.80.0/31 route 28d42bfc-81ad-4424-8dd4-f2b63d339bae
100.64.80.2/32 route 76e6ae82-1527-5af6-a14c-6ee80dd52125
100.64.80.2/31 route 837fe706-17b9-429f-8049-963609181d9b
127.0.0.1/32 route bfb3edcb-69fd-4aa4-b78e-d3f79a5fb098
169.254.0.0/24 route e7c718e2-f2f4-4686-bb70-bc183e22ac17
e7c718e2-f2f4-4686-bb70-bc183e22ac17
169.254.0.1/32 route 76e6ae82-1527-5af6-a14c-6ee80dd52125
169.254.0.2/32 route a1dd8b2c-75e9-59a6-8d9c-c618d6e8df48
192.168.0.0/24 100.64.80.1 route 28d42bfc-81ad-4424-8dd4-f2b63d339bae 02:50:56:56:44:55
192.168.1.0/24 100.64.80.1 route 28d42bfc-81ad-4424-8dd4-f2b63d339bae 02:50:56:56:44:55
192.168.2.0/24 100.64.80.3 route 837fe706-17b9-429f-8049-963609181d9b
IPv6 Forwarding Table
IP Prefix Gateway IP Type UUID Gateway MAC
::1/128 route bfb3edcb-69fd-4aa4-b78e-d3f79a5fb098
fc15:7264:6244:2800::/64 route 28d42bfc-81ad-4424-8dd4-f2b63d339bae
fc15:7264:6244:2800::1/128 route 76e6ae82-1527-5af6-a14c-6ee80dd52125
fc15:7264:6244:2801::/64 route 837fe706-17b9-429f-8049-963609181d9b
fc15:7264:6244:2801::1/128 route 76e6ae82-1527-5af6-a14c-6ee80dd52125

The packet follows the default route and leaves through the uplink interface into the external network.
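
The four lookups above can be chained into a small hop-by-hop trace. This is a simplified model for following the walkthrough, not a reimplementation of the NSX-T data path: only the rows relevant to outbound traffic are copied from the tables, `None` stands for a directly connected network, and the `NEXT_ROUTER` mapping (which router owns a given next hop as seen from the current router) is my reading of the text rather than something the CLI prints. `8.8.8.8` is again an arbitrary external address.

```python
import ipaddress
from typing import Dict, List, Optional, Tuple

# (prefix, next hop) rows copied from the tables above; None = connected.
TABLES: Dict[str, List[Tuple[str, Optional[str]]]] = {
    "DR-T1-GW-01": [("0.0.0.0/0", "169.254.0.2"),
                    ("192.168.0.0/24", None), ("192.168.1.0/24", None)],
    "SR-T1-GW-01": [("0.0.0.0/0", "100.64.80.0"),
                    ("192.168.0.0/24", None), ("192.168.1.0/24", None)],
    "DR-T0-GW-01": [("0.0.0.0/0", "169.254.0.2")],
    "SR-T0-GW-01": [("0.0.0.0/0", "10.44.204.1")],
}

# Which router owns a next hop, as seen from the current router (assumed).
NEXT_ROUTER: Dict[Tuple[str, str], str] = {
    ("DR-T1-GW-01", "169.254.0.2"): "SR-T1-GW-01",
    ("SR-T1-GW-01", "100.64.80.0"): "DR-T0-GW-01",
    ("DR-T0-GW-01", "169.254.0.2"): "SR-T0-GW-01",
}


def trace(start: str, destination: str, max_hops: int = 8) -> List[str]:
    """Follow longest-prefix matches from router to router."""
    dst = ipaddress.ip_address(destination)
    path, router = [start], start
    for _ in range(max_hops):
        matches = [(ipaddress.ip_network(prefix), gateway)
                   for prefix, gateway in TABLES[router]
                   if dst in ipaddress.ip_network(prefix)]
        if not matches:
            return path + ["drop"]
        _, gateway = max(matches, key=lambda m: m[0].prefixlen)
        if gateway is None:
            return path + ["delivered locally"]
        following = NEXT_ROUTER.get((router, gateway))
        if following is None:
            return path + [f"external via {gateway}"]
        router = following
        path.append(router)
    return path + ["hop limit reached"]


if __name__ == "__main__":
    print(" -> ".join(trace("DR-T1-GW-01", "8.8.8.8")))
```

Note that 169.254.0.2 appears twice with different owners, which is why the mapping is keyed on the current router as well as the next hop.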

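The overall forwarding path is shown in the figure:

The logical path diagram is as follows:

Overall, NSX-T routing logic is not easy to understand, and it may also involve communication between multiple instances inside the Edge cluster, so it takes a good deal of hands-on practice and analysis. The NSX-T platform has a flow tracing feature, which is very valuable for analyzing the path a packet takes.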

NSX-T installation reference:

Routing principles reference:

NSX-T Client documentation: