Kubernetes Networking (Part 3): Using Route Reflectors with BIRD
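1. Summary
In the previous article we used the bird program to build a BGP full mesh among three servers. In this article we change the full mesh to RR (route reflector) mode, so that the BIRD instance on each host learns the container subnets of the other hosts and the container subnets can reach each other.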
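2. BIRD experiment
Introduction to BIRD
- BIRD stands for BIRD Internet Routing Daemon. It is routing software that runs on Linux and other Unix-like systems and implements several routing protocols, such as BGP, OSPF, and RIP.
Learning about route reflectors
We learn directly from the RFCs, the most authoritative documents in computer networking. The passages below are quoted from the description of route reflectors in RFC 4456.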
The basic idea of route reflection is very simple. Let us consider the simple example depicted in Figure 1 below.
     +-------+        +-------+
     |       |  IBGP  |       |
     | RTR-A |--------| RTR-B |
     |       |        |       |
     +-------+        +-------+
           \            /
       IBGP \   ASX    / IBGP
             \        /
              +-------+
              |       |
              | RTR-C |
              |       |
              +-------+

          Figure 1: Full-Mesh IBGP
In ASX, there are three IBGP speakers (routers RTR-A, RTR-B, and RTR-C). With the existing BGP model, if RTR-A receives an external route and it is selected as the best path it must advertise the external route to both RTR-B and RTR-C. RTR-B and RTR-C (as IBGP speakers) will not re-advertise these IBGP learned routes to other IBGP speakers.
To spell it out, this passage states two important rules of the BGP protocol:
- A route learned from EBGP (and selected as the best path) must be advertised to the other IBGP speakers. In the figure above, RTR-A must advertise the external route to both RTR-B and RTR-C.
- A route learned from IBGP must not be re-advertised to other IBGP speakers. In the figure above, RTR-B does not advertise the route it learned from RTR-A to RTR-C.
If this rule is relaxed and RTR-C is allowed to advertise IBGP learned routes to IBGP peers, then it could re-advertise (or reflect) the IBGP routes learned from RTR-A to RTR-B and vice versa. This would eliminate the need for the IBGP session between RTR-A and RTR-B as shown in Figure 2 below.
     +-------+        +-------+
     |       |        |       |
     | RTR-A |        | RTR-B |
     |       |        |       |
     +-------+        +-------+
           \            /
       IBGP \   ASX    / IBGP
             \        /
              +-------+
              |       |
              | RTR-C |
              |       |
              +-------+

       Figure 2: Route Reflection IBGP
The route reflection scheme is based upon this basic principle.
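The two figures can be checked with a small simulation. This is a minimal sketch, not BGP itself: it assumes a single external route injected at one router and models only the advertisement rule discussed above. The function name `propagate` and its parameters are made up for this illustration.

```python
from collections import deque


def propagate(sessions, origin, reflectors=frozenset()):
    """Return the routers that learn an external route injected at `origin`.

    sessions   : iterable of (a, b) pairs, each one an IBGP session
    reflectors : routers allowed to re-advertise IBGP-learned routes
    """
    peers = {}
    for a, b in sessions:
        peers.setdefault(a, set()).add(b)
        peers.setdefault(b, set()).add(a)

    learned = {origin}
    queue = deque([origin])
    while queue:
        router = queue.popleft()
        # The origin learned the route via EBGP, so it advertises to all IBGP peers.
        # Every other router learned it via IBGP and stays silent unless it reflects.
        if router != origin and router not in reflectors:
            continue
        for peer in peers.get(router, ()):
            if peer not in learned:
                learned.add(peer)
                queue.append(peer)
    return learned


full_mesh = [("RTR-A", "RTR-B"), ("RTR-A", "RTR-C"), ("RTR-B", "RTR-C")]  # Figure 1
hub_only = [("RTR-A", "RTR-C"), ("RTR-B", "RTR-C")]                       # Figure 2

print(sorted(propagate(full_mesh, "RTR-A")))                        # every router
print(sorted(propagate(hub_only, "RTR-A")))                         # RTR-B is left out
print(sorted(propagate(hub_only, "RTR-A", reflectors={"RTR-C"})))   # every router again
```

Without the RTR-A to RTR-B session, RTR-B only learns the route once RTR-C is allowed to reflect it.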
Route reflector concepts and terminology
We use the term "route reflection" to describe the operation of a BGP speaker advertising an IBGP learned route to another IBGP peer. Such a BGP speaker is said to be a "route reflector" (RR), and such a route is said to be a reflected route.
The internal peers of an RR are divided into two groups:
1). Client peers
2). Non-Client peers
An RR reflects routes between these groups, and may reflect routes among client peers. An RR along with its client peers form a cluster. The Non-Client peer must be fully meshed but the Client peers need not be fully meshed. Figure 3 depicts a simple example outlining the basic RR components using the terminology noted above.
                / - - - - - - - - - - - - - -
                |           Cluster           |
                     +-------+    +-------+
                |    |       |    |       |   |
                     | RTR-A |    | RTR-B |
                |    |Client |    |Client |   |
                     +-------+    +-------+
                |         \          /        |
                      IBGP \        / IBGP
                |           \      /          |
                            +-------+
                |           |       |         |
                            | RTR-C |
                |           |  RR   |         |
                            +-------+
                |            /     \          |
                 - - - - - /- - - -\- - - - - /
                     IBGP /         \ IBGP
                 +-------+           +-------+
                 | RTR-D |   IBGP    | RTR-E |
                 |  Non- |-----------|  Non- |
                 |Client |           |Client |
                 +-------+           +-------+

                    Figure 3: RR Components
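Summary:
- BGP's default rule is that a route learned from IBGP must not be re-advertised to other IBGP speakers. A route reflector (RR) relaxes this rule, which is where its name comes from.
- With an RR, the BGP speakers no longer need a full mesh; each client only needs a BGP peering with the RR. (This is the main purpose of an RR.)
- The RR and the non-clients, and the non-clients among themselves, must still be fully meshed. (In the figure above, the RR does not reflect a route that RTR-D learned externally to RTR-E, so RTR-D and RTR-E must peer with each other directly.)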
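The reflection rules summarized above can be written down as a small decision function. This is a sketch under stated assumptions: it covers only routes received from IBGP peers, it skips the peer the route came from (a simplification), and the name `reflect_targets` is invented for this example.

```python
def reflect_targets(source, peers):
    """Return the IBGP peers an RR reflects a route to.

    source : name of the IBGP peer the route was received from
    peers  : dict mapping peer name -> "client" or "non-client"
    """
    others = {name: kind for name, kind in peers.items() if name != source}
    if peers[source] == "client":
        # Route from a client: reflect to non-clients and to the other clients.
        return sorted(others)
    # Route from a non-client: reflect to clients only.
    return sorted(name for name, kind in others.items() if kind == "client")


# The peers of RTR-C (the RR) in Figure 3.
rr_peers = {
    "RTR-A": "client",
    "RTR-B": "client",
    "RTR-D": "non-client",
    "RTR-E": "non-client",
}

print(reflect_targets("RTR-A", rr_peers))  # ['RTR-B', 'RTR-D', 'RTR-E']
print(reflect_targets("RTR-D", rr_peers))  # ['RTR-A', 'RTR-B']
```

The second call shows why RTR-D and RTR-E still need their own session: a route from RTR-D is never reflected to RTR-E.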
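Experiment goals
- Learn to implement a BGP route reflector with bird
- The container subnets on the three hosts can reach each other
Experiment environment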
OS version | BIRD version | Host IP | Container subnet | Host alias |
---|---|---|---|---|
Ubuntu 16.04 | BIRD 1.5.0 | 10.226.11.27 | 192.168.227.0/24 | Host A |
Ubuntu 16.04 | BIRD 1.5.0 | 10.226.11.22 | 192.168.222.0/24 | Host B |
Ubuntu 16.04 | BIRD 1.5.0 | 10.226.11.21 | 192.168.221.0/24 | Host C |
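Experiment topology
The three hosts are on the same network segment, and each host has one container subnet attached. Below, 10.226.11.27 acts as the RR, and 10.226.11.22 and 10.226.11.21 act as RR clients, so that the bird instances on the three hosts learn each other's routes. The end goal is that the container subnets can reach each other. Note: 10.226.11.21 and 10.226.11.22 do not need a BGP peering with each other.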
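To make the saving concrete, the sessions needed by each design can be enumerated. This is only an illustrative count for the three hosts in the table above; the helper `session_counts` is a hypothetical name, and the 50-router case is an extrapolation, not part of the experiment.

```python
from itertools import combinations

hosts = ["10.226.11.27", "10.226.11.22", "10.226.11.21"]
rr = "10.226.11.27"  # Host A is the route reflector

# Full mesh: one IBGP session per pair of hosts.
full_mesh_sessions = list(combinations(hosts, 2))
# RR mode: one IBGP session between the RR and each client.
rr_sessions = [(rr, host) for host in hosts if host != rr]

print(len(full_mesh_sessions), len(rr_sessions))  # 3 2


def session_counts(n):
    """Return (full-mesh sessions, single-RR sessions) for n IBGP speakers."""
    return n * (n - 1) // 2, n - 1


print(session_counts(50))  # (1225, 49)
```

With three hosts the difference is a single session, but the full-mesh count grows quadratically while the RR count grows linearly.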
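Creating the simulated container subnets
Containers are isolated spaces implemented with the Linux namespace mechanism. Here we use the same mechanism to create an isolated network namespace and configure networking for it, simulating a host that has a container subnet "attached".
- Create the simulated container subnet on host 10.226.11.22
# Create the namespace netns1
ip netns add netns1
# Create the veth pair
ip link add veth1 type veth peer name veth2
# Move veth1 into netns1
ip link set veth1 netns netns1
# List the veth devices
ip link show | grep veth
# Configure the IP address of veth1 inside netns1
ip netns exec netns1 ifconfig veth1 192.168.222.102/24 up
# Set the default gateway of netns1; the gateway is the host's network stack
ip netns exec netns1 route add -net 0.0.0.0/0 gw 192.168.222.101
# Configure the IP address of veth2 on the host, which connects the host and netns1 through veth2 and veth1
ifconfig veth2 192.168.222.101/24 up
- Create the simulated container subnet on host 10.226.11.21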
ip netns add netns1
ip link add veth1 type veth peer name veth2
ip link set veth1 netns netns1
ip link show | grep veth
ip netns exec netns1 ifconfig veth1 192.168.221.102/24 up
ip netns exec netns1 route add -net 0.0.0.0/0 gw 192.168.221.101
ifconfig veth2 192.168.221.101/24 up
- Create the simulated container subnet on host 10.226.11.27
ip netns add netns1
ip link add veth1 type veth peer name veth2
ip link set veth1 netns netns1
ip link show | grep veth
ip netns exec netns1 ifconfig veth1 192.168.227.102/24 up
ip netns exec netns1 route add -net 0.0.0.0/0 gw 192.168.227.101
ifconfig veth2 192.168.227.101/24 up
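The three command sets above differ only in the subnet. As a sketch, they can be rendered from the subnet alone. The helper `netns_commands` is hypothetical and only builds the command strings shown above (it does not run them); the `.101` host address and `.102` namespace address follow the convention used in this experiment.

```python
import ipaddress


def netns_commands(subnet, ns="netns1"):
    """Render the commands that simulate a container subnet on one host."""
    net = ipaddress.ip_network(subnet)
    gateway = net.network_address + 101    # host side, on veth2
    container = net.network_address + 102  # namespace side, on veth1
    plen = net.prefixlen
    return [
        f"ip netns add {ns}",
        "ip link add veth1 type veth peer name veth2",
        f"ip link set veth1 netns {ns}",
        f"ip netns exec {ns} ifconfig veth1 {container}/{plen} up",
        f"ip netns exec {ns} route add -net 0.0.0.0/0 gw {gateway}",
        f"ifconfig veth2 {gateway}/{plen} up",
    ]


for cmd in netns_commands("192.168.222.0/24"):
    print(cmd)
```

Printing the commands first and pasting them into a root shell keeps the sketch free of side effects.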
Installing BIRD
- Install the bird program
apt-get update
apt-get install bird
/etc/init.d/bird start
- Check that the installation succeeded
/etc/init.d/bird status
birdc show status
Output like the following indicates a working installation
root@10_226_11_21:/etc/bird# birdc show status
BIRD 1.5.0 ready.
BIRD 1.5.0
Router ID is 10.226.11.21
Current server time is 2024-09-23 15:03:28
Last reboot on 2024-09-22 20:22:36
Last reconfiguration on 2024-09-23 13:01:00
Daemon is up and running
Implementing BGP RR mode
Host A as the RR
The RR must establish a BGP peering with every client. Its /etc/bird/bird.conf is as follows
# This is a minimal configuration file, which allows the bird daemon to start
# but will not cause anything else to happen.
#
# Please refer to the documentation in the bird-doc package or BIRD User's
# Guide on http://bird.network.cz/ for more information on configuring BIRD and
# adding routing protocols.

#log syslog all;
log "/var/log/bird.log" all;

# Change this into your BIRD router ID. It's a world-wide unique identification
# of your router, usually one of router's IPv4 addresses.
router id 10.226.11.27;

# The Kernel protocol is not a real routing protocol. Instead of communicating
# with other routers in the network, it performs synchronization of BIRD's
# routing tables with the OS kernel.
protocol kernel {
    debug { states };
    scan time 10;
    learn;
    persist;
    import none;   # kernel to bird map
    export all;    # Actually insert routes into the kernel routing table
}

# The Direct protocol generates routes for the networks directly connected to
# the listed interfaces; here that is the simulated container subnet on veth2.
protocol direct {
    interface "veth2";
}

# The Device protocol is not a real routing protocol. It doesn't generate any
# routes and it only serves as a module for getting information about network
# interfaces from the kernel.
protocol device {
}

# BGP peering with 10.226.11.21
protocol bgp peer_10_226_11_21 {
    debug { states };
    local as 64512;
    neighbor 10.226.11.21 as 64512;
    source address 10.226.11.27;
    #multihop;
    password "passwd";
    direct;
    export all;    # controls which routes may be advertised to the BGP peer
    import all;    # accept every route received from this BGP peer into BIRD's routing table
    # Make 10.226.11.21 an RR client
    rr client;
    # The cluster id identifies which cluster this RR belongs to
    rr cluster id 224.0.0.1;
}

# BGP peering with 10.226.11.22
protocol bgp peer_10_226_11_22 {
    debug { states };
    # BGP graceful restart.
    # If the peer goes offline briefly because of network jitter or a temporary
    # crash, all routes received from it disappear at once. Graceful restart
    # exists to avoid interrupting forwarding in that case. Enabling it is recommended.
    graceful restart on;
    # Our own ASN is 64512
    local as 64512;
    # The peer's ASN is 64512 and its IP is 10.226.11.22.
    # If the peer ASN equals local as, BIRD treats the session as iBGP, otherwise as eBGP
    # (i stands for internal, e for external).
    neighbor 10.226.11.22 as 64512;
    # source: the local address used as the source address of the BGP session.
    # Default: the local address of the interface the neighbor is connected to.
    source address 10.226.11.27;
    #multihop;
    # password: if a password was agreed with the peer, configure it here; otherwise omit it
    password "passwd";
    # direct: the neighbor is directly connected. Its IP address must be within a
    # directly reachable range (that is, associated with one of the router's
    # interfaces); otherwise the BGP session does not start and waits for such an
    # interface to appear. The alternative is the multihop option.
    # Default: enabled for eBGP, so it may be omitted there; for a directly
    # connected iBGP peer, writing it avoids multihop being assumed.
    direct;
    # export: controls which routes may be advertised to the BGP peer
    export all;
    import all;
    # Make 10.226.11.22 an RR client
    rr client;
    rr cluster id 224.0.0.1;
}
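Only the following two lines of the configuration matter here; the rest is almost identical to the previous article.
These two lines make this node an RR, mark the given BGP neighbor as an RR client, and set the cluster identifier.
rr client;
rr cluster id 224.0.0.1;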
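Because the RR needs one such `protocol bgp` block per client, the blocks can be rendered from a list of client addresses. This is a hedged sketch: `rr_peer_stanza` is a hypothetical helper, it only emits options that already appear in the configuration above, and the output is not validated against BIRD.

```python
def rr_peer_stanza(local_ip, client_ip, asn=64512,
                   cluster_id="224.0.0.1", password="passwd"):
    """Render one 'protocol bgp' block that marks client_ip as an RR client."""
    name = "peer_" + client_ip.replace(".", "_")
    lines = [
        f"protocol bgp {name} {{",
        "    debug { states };",
        f"    local as {asn};",
        f"    neighbor {client_ip} as {asn};",
        f"    source address {local_ip};",
        f'    password "{password}";',
        "    direct;",
        "    export all;",
        "    import all;",
        "    rr client;",
        f"    rr cluster id {cluster_id};",
        "}",
    ]
    return "\n".join(lines)


clients = ["10.226.11.21", "10.226.11.22"]
config = "\n\n".join(rr_peer_stanza("10.226.11.27", ip) for ip in clients)
print(config)
```

The clients' own configuration needs no counterpart to these two lines, so nothing is rendered for them here.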
Host B as an RR client
Its /etc/bird/bird.conf is as follows:
# This is a minimal configuration file, which allows the bird daemon to start
# but will not cause anything else to happen.
#
# Please refer to the documentation in the bird-doc package or BIRD User's
# Guide on http://bird.network.cz/ for more information on configuring BIRD and
# adding routing protocols.

#log syslog all;
log "/var/log/bird.log" all;

# Change this into your BIRD router ID. It's a world-wide unique identification
# of your router, usually one of router's IPv4 addresses.
router id 10.226.11.22;

# The Kernel protocol is not a real routing protocol. Instead of communicating
# with other routers in the network, it performs synchronization of BIRD's
# routing tables with the OS kernel.
protocol kernel {
    debug { states };
    scan time 10;
    learn;
    persist;
    import none;   # kernel to bird map
    export all;    # Actually insert routes into the kernel routing table
}

# The Direct protocol generates routes for the networks directly connected to
# the listed interfaces; here that is the simulated container subnet on veth2.
protocol direct {
    interface "veth2";
}

# The Device protocol is not a real routing protocol. It doesn't generate any
# routes and it only serves as a module for getting information about network
# interfaces from the kernel.
protocol device {
}

# BGP peering with the RR
protocol bgp peer_10_226_11_27 {
    debug { states };
    local as 64512;
    neighbor 10.226.11.27 as 64512;
    source address 10.226.11.22;
    direct;
    password "passwd";
    export all;
    import all;
}
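As an RR client, 10.226.11.22 only needs a BGP peering with the RR. The configuration shows that it does not know peer_10_226_11_27 is an RR; in other words, the RR does not need to tell its clients that it is an RR.
Host C as an RR client
Its /etc/bird/bird.conf is as follows: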
# This is a minimal configuration file, which allows the bird daemon to start
# but will not cause anything else to happen.
#
# Please refer to the documentation in the bird-doc package or BIRD User's
# Guide on http://bird.network.cz/ for more information on configuring BIRD and
# adding routing protocols.

log "/var/log/bird.log" all;

# Change this into your BIRD router ID. It's a world-wide unique identification
# of your router, usually one of router's IPv4 addresses.
router id 10.226.11.21;

# The Kernel protocol is not a real routing protocol. Instead of communicating
# with other routers in the network, it performs synchronization of BIRD's
# routing tables with the OS kernel.
protocol kernel {
    learn;
    persist;
    scan time 10;
    import none;
    export all;    # Actually insert routes into the kernel routing table
}

# The Device protocol is not a real routing protocol. It doesn't generate any
# routes and it only serves as a module for getting information about network
# interfaces from the kernel.
protocol device {
}

# The Direct protocol generates routes for the networks directly connected to
# the listed interfaces; here that is the simulated container subnet on veth2.
protocol direct {
    interface "veth2";
}

# BGP peering with the RR
protocol bgp peer_10_226_11_27 {
    debug { states };
    local as 64512;
    neighbor 10.226.11.27 as 64512;
    source address 10.226.11.21;
    direct;
    password "passwd";
    export all;
    import all;
}
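As an RR client, 10.226.11.21 likewise only needs a BGP peering with the RR, and its configuration does not mark peer_10_226_11_27 as an RR either.
Checking status
Log in to host A (10.226.11.27) and check its network state
- Check the BGP peer state
Established means the BGP session with the peer is up and the two sides can exchange routes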
root@10_226_11_27:/work/code# birdc show protocol
BIRD 1.5.0 ready.
name proto table state since info
kernel1 Kernel master up 16:06:17
direct1 Direct master up 16:06:17
device1 Device master up 16:06:17
peer_10_226_11_21 BGP master up 16:06:21 Established
peer_10_226_11_22 BGP master up 16:06:22 Established
- BIRD routing table
root@10_226_11_27:/work/code# birdc show route
BIRD 1.5.0 ready.
# Route imported from protocol direct
192.168.227.0/24 dev veth2 [direct1 16:06:17] * (240)
# Route learned from bgp peer_10_226_11_21
192.168.221.0/24 via 10.226.11.21 on eth0 [peer_10_226_11_21 16:06:21] * (100) [i]
# Route learned from bgp peer_10_226_11_22
192.168.222.0/24 via 10.226.11.22 on eth0 [peer_10_226_11_22 16:18:14] * (100) [i]
As shown, the RR has learned the container subnets of the other two hosts.
- Detailed BIRD routing table
root@10_226_11_27:/work/code# birdc show route all
BIRD 1.5.0 ready.
192.168.227.0/24 dev veth2 [direct1 16:06:17] * (240)
    Type: device unicast univ
192.168.221.0/24 via 10.226.11.21 on eth0 [peer_10_226_11_21 16:06:21] * (100) [i]
    Type: BGP unicast univ
    BGP.origin: IGP
    BGP.as_path:
    BGP.next_hop: 10.226.11.21
    BGP.local_pref: 100
192.168.222.0/24 via 10.226.11.22 on eth0 [peer_10_226_11_22 16:18:14] * (100) [i]
    Type: BGP unicast univ
    BGP.origin: IGP
    BGP.as_path:
    BGP.next_hop: 10.226.11.22
    BGP.local_pref: 100
- Kernel routing table
root@10_226_11_27:/work/code# ip route show
default via 10.226.8.1 dev eth0
# Connected route of the host's eth0 interface
10.226.8.0/22 dev eth0 proto kernel scope link src 10.226.11.27
# Routes learned from bird
192.168.221.0/24 via 10.226.11.21 dev eth0 proto bird
192.168.222.0/24 via 10.226.11.22 dev eth0 proto bird
# Connected route of the host's veth2 interface (the simulated container subnet)
192.168.227.0/24 dev veth2 proto kernel scope link src 192.168.227.101
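- Check the BIRD routing table on host 10.226.11.22
This host has learned the container subnets of the other two hosts: 192.168.221.0/24 and 192.168.227.0/24
- Check the BIRD routing table on host 10.226.11.21
This host has learned the container subnets of the other two hosts: 192.168.222.0/24 and 192.168.227.0/24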
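Each of the checks above compares the prefixes listed by `birdc show route` with the ones a host is expected to know. A minimal sketch of that comparison follows. It assumes the one-line-per-route layout of the BIRD 1.5.0 output shown earlier; the sample text is the RR's output from this article, and `parse_routes` and `missing_prefixes` are names invented for this example.

```python
import re

# prefix, then anything, then "[<protocol name> <time>]"
ROUTE_RE = re.compile(r"^(\d+\.\d+\.\d+\.\d+/\d+)\s.*?\[(\S+)\s")


def parse_routes(text):
    """Map each prefix in 'birdc show route' output to the protocol it came from."""
    routes = {}
    for line in text.splitlines():
        match = ROUTE_RE.match(line.strip())
        if match:
            routes[match.group(1)] = match.group(2)
    return routes


def missing_prefixes(text, expected):
    """Return the expected prefixes that do not appear in the output."""
    return sorted(set(expected) - set(parse_routes(text)))


sample = """\
BIRD 1.5.0 ready.
192.168.227.0/24 dev veth2 [direct1 16:06:17] * (240)
192.168.221.0/24 via 10.226.11.21 on eth0 [peer_10_226_11_21 16:06:21] * (100) [i]
192.168.222.0/24 via 10.226.11.22 on eth0 [peer_10_226_11_22 16:18:14] * (100) [i]
"""

print(parse_routes(sample))
print(missing_prefixes(sample, ["192.168.221.0/24", "192.168.222.0/24"]))
```

On a client, the same functions would be fed that host's own `birdc show route` output and the two remote container subnets.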
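Network test verification
- From container 192.168.227.102, ping container 192.168.221.102
- From container 192.168.221.102, ping 192.168.222.102
The three container networks can reach each other, which achieves the goal of the experiment.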
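The ping tests can be laid out as a plan per host. This sketch only builds the command strings; it assumes the simulated container on each host lives in the namespace `netns1` created earlier, and `ping_plan` is a hypothetical helper, not something from the original experiment.

```python
# Host IP -> address of the simulated container on that host
containers = {
    "10.226.11.27": "192.168.227.102",
    "10.226.11.22": "192.168.222.102",
    "10.226.11.21": "192.168.221.102",
}


def ping_plan(containers, ns="netns1", count=3):
    """For each host, list the pings to run from its namespace to the other containers."""
    plan = {}
    for host, own_ip in containers.items():
        plan[host] = [
            f"ip netns exec {ns} ping -c {count} {ip}"
            for ip in containers.values()
            if ip != own_ip
        ]
    return plan


for host, commands in ping_plan(containers).items():
    print(host)
    for cmd in commands:
        print("  " + cmd)
```

Running every listed command as root on its host covers all six directions between the three container subnets.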
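Experiment conclusions
- Implemented BGP RR mode with bird on three hosts
- The three hosts learned each other's routes through BGP
- The experiment gives us a deeper understanding of the bird program used in calico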