k8s Migration: 岁月云 Field Notes

The new system runs Rocky Linux 9.5; the virtual machines of the old system run CentOS 7.

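1 Target servers

1.1 Disable swap

swapoff -a
# Comment out the swap entry in /etc/fstab so that swap stays off after a reboot
vi /etc/fstab
#/dev/mapper/rl-swap     none                    swap    defaults        0 0
# Verify: every value in the Swap row should be 0
free -h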
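
The fstab edit above can also be done without opening an editor. The following is a minimal sketch, assuming swap entries are ordinary uncommented lines whose third field is swap; the function name comment_out_swap is made up for illustration and is not part of the original notes.

```shell
# comment_out_swap FILE
# Prefix every active swap entry in FILE with '#'.
# A backup of the original is kept as FILE.bak.
comment_out_swap() {
  local fstab="$1"
  # Match lines that do not start with '#' or whitespace and whose
  # third whitespace-separated field is exactly "swap".
  sed -E -i.bak \
    's/^([^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' \
    "$fstab"
}
```

On a target server this would be called as root with comment_out_swap /etc/fstab, followed by free -h to confirm the result.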

1.2 Disable the firewall

This is done only to reduce maintenance overhead.

systemctl stop firewalld
systemctl disable firewalld
systemctl status firewalld

1.3 Disable SELinux

# Temporary: SELinux is enabled again after a reboot
setenforce 0
# Permanent
vi /etc/selinux/config
# Change SELINUX=enforcing to SELINUX=disabled
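
The permanent change can be scripted as well. This is only a sketch: set_selinux_mode is a hypothetical helper name, and it assumes the config file already contains a line beginning with SELINUX=.

```shell
# set_selinux_mode FILE MODE
# Rewrite the SELINUX= line in FILE; MODE must be a valid SELinux mode.
set_selinux_mode() {
  local config="$1" mode="$2"
  case "$mode" in
    enforcing|permissive|disabled) ;;
    *) echo "invalid SELinux mode: $mode" >&2; return 2 ;;
  esac
  # '^SELINUX=' does not match the SELINUXTYPE= line, so that line is left alone.
  sed -E -i.bak "s/^SELINUX=.*/SELINUX=${mode}/" "$config"
}
```

For the setup described here the call would be set_selinux_mode /etc/selinux/config disabled; the new mode takes effect after a reboot.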

1.4 Change the hostname

hostnamectl set-hostname master7

1.5 Add hosts entries

vi /etc/hosts
10.101.10.6 master6
10.101.10.7 master7
10.101.10.8 master8
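
When the same entries have to be added on several machines, an idempotent helper avoids duplicate lines. A minimal sketch; add_host_entry is an illustrative name, and the duplicate check only looks at the hostname, not at the IP.

```shell
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless NAME is already mapped in it.
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  # NAME must appear as a whole word after whitespace to count as present.
  if ! grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file"; then
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}
```

With the addresses listed above, the calls would look like add_host_entry /etc/hosts 10.101.10.7 master7, repeated for master6 and master8.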

1.6 Configure ip_forward

# Setup: load the bridge netfilter module, which provides the net.bridge.* keys
modprobe br_netfilter
# If net.ipv4.ip_forward is 0, pod IPs cannot be forwarded
sysctl -w net.ipv4.ip_forward=1
sysctl -w net.bridge.bridge-nf-call-iptables=1
sysctl -w net.bridge.bridge-nf-call-ip6tables=1
# Reload /etc/sysctl.conf (values set with -w above are not written to it)
sysctl -p
# Check
sysctl -a | grep net.ipv4.ip_forward
sysctl -a | grep net.bridge.bridge-nf-call-iptables
sysctl -a | grep net.bridge.bridge-nf-call-ip6tables
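
Values set with sysctl -w are lost on reboot. One way to keep them is a drop-in file under /etc/sysctl.d. The sketch below only writes such a file; the helper name write_k8s_sysctl and the file name used in the usage note are assumptions, not something the original setup prescribes.

```shell
# write_k8s_sysctl FILE
# Write the three kernel settings from this section to FILE.
write_k8s_sysctl() {
  local target="$1"
  cat > "$target" <<'EOF'
# Kernel settings needed so that pod traffic is forwarded
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
}
```

A possible use is write_k8s_sysctl /etc/sysctl.d/99-kubernetes.conf followed by sysctl --system. The net.bridge.* keys only exist while br_netfilter is loaded, so the module name would also need to be listed in a file under /etc/modules-load.d to be loaded at boot.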

1.7 Time synchronization

sudo dnf install chrony
sudo systemctl start chronyd
sudo systemctl enable chronyd
# Edit the configuration
vi /etc/chrony.conf
# Add the following entries
pool ntp1.aliyun.com iburst
pool ntp2.aliyun.com iburst
server ntp1.aliyun.com iburst
server ntp2.aliyun.com iburst
server ntp3.aliyun.com iburst
server ntp4.aliyun.com iburst
server ntp5.aliyun.com iburst
server ntp7.aliyun.com iburst
# Restart so that the new time sources take effect
sudo systemctl restart chronyd
# Sync immediately
sudo chronyc -a makestep
# Check the time status
timedatectl status

1.8 Add the rancher user

useradd rancher
usermod -aG docker rancher
echo 123456 | passwd --stdin rancher
cat /etc/group | grep docker

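2 Source server

The new nodes are added from the original master node, so the steps in this section are run on the source server.

2.1 Passwordless login

# Run on the original master node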
su - rancher
ssh-copy-id rancher@master7
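
The key has to reach every new node, not just master7. A small loop keeps this repeatable; this is a sketch, and both the helper name copy_key_to_hosts and the SSH_COPY_ID override variable (handy for dry runs) are made up for illustration.

```shell
# copy_key_to_hosts USER HOST...
# Run ssh-copy-id for USER on every HOST; report failures but keep going.
copy_key_to_hosts() {
  local user="$1" host
  shift
  for host in "$@"; do
    # SSH_COPY_ID can be set to another command, e.g. echo, for a dry run.
    "${SSH_COPY_ID:-ssh-copy-id}" "${user}@${host}" \
      || echo "key copy failed for ${host}" >&2
  done
}
```

Run as the rancher user on the original master, for example copy_key_to_hosts rancher master6 master7 master8.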

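2.2 Install the new rke

# Note: get.rke2.io installs RKE2, which is a separate distribution. The rke CLI used in
# the rest of these notes (rke up with cluster.yml, v1.6.5) is published as a binary
# on the rancher/rke GitHub releases page.
curl -sfL https://get.rke2.io | sh -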

2.3 Add nodes

rke manages the addition and removal of k8s nodes: update cluster.yml, then run rke up --update-only --config cluster.yml. Because this step adds etcd members, schedule it for an idle period.
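
For reference, a node entry in cluster.yml has roughly the shape printed by the sketch below. The address, user, and role keys are standard rke node fields; the helper name render_node and the choice of all three roles for the new node are assumptions, not taken from the original cluster.yml.

```shell
# render_node ADDRESS USER
# Print a node stanza to paste under the "nodes:" key of cluster.yml.
render_node() {
  local address="$1" user="$2"
  cat <<EOF
  - address: ${address}
    user: ${user}
    role: [controlplane, etcd, worker]
EOF
}
```

For example, render_node master7 rancher prints the stanza for master7; after it has been pasted under nodes:, rke up --update-only --config cluster.yml picks up the new node.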

2.4 Install kubectl

Install the kubectl version that matches the cluster.

curl -LO https://dl.k8s.io/release/v1.30.7/bin/linux/amd64/kubectl
chmod +x kubectl
cp -a kubectl /usr/bin
cd /root
mkdir -p .kube
cp /home/rancher/kube_config_cluster.yml /root/.kube/config
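
Before copying the binary into /usr/bin it is worth checking the download. The Kubernetes download site publishes a kubectl.sha256 file next to each binary; the sketch below compares a file against such a checksum file. It assumes GNU sha256sum is available, and verify_sha256 is an illustrative name.

```shell
# verify_sha256 FILE SUMFILE
# Succeed only if the SHA-256 of FILE equals the hash stored in SUMFILE.
# SUMFILE is expected to contain just the hex digest.
verify_sha256() {
  local file="$1" sumfile="$2"
  # sha256sum --check expects "<hash>  <filename>" (two spaces).
  echo "$(cat "$sumfile")  $file" | sha256sum --check --status
}
```

After downloading kubectl.sha256 from the same URL path as the binary, verify_sha256 kubectl kubectl.sha256 returns a non-zero status when the file is corrupt or incomplete.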

3 Issues encountered

3.1 Docker version incompatibility

su - rancher
rke up --update-only --config cluster.yml

After the command ran, it reported the error below. The Rancher website also documents this error: Failed to set up SSH tunneling for host [xxx.xxx.xxx.xxx]: Can't retrieve Docker Info

WARN[0000] Failed to set up SSH tunneling for host [master6]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [master6:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain 
WARN[0000] Removing host [master6] from node lists      
INFO[0000] [network] No hosts added existing cluster, skipping port check 

Yet the following command, run on the source server, succeeds:

ssh -i ~/.ssh/id_rsa rancher@master6

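Comparing the Docker versions suggests that the Docker version is the likely cause.

# Target server
[root@master6 ~]# docker --version
Docker version 27.4.0, build bde2b89
# Source server
[root@master1 ~]# docker --version
Docker version 19.03.8, build afacb8b

The newest Docker is not necessarily the best choice. The rke release in use is v1.6.5, and it reports the error below during installation, which means Docker 27.4.1 is not currently supported. The Docker version therefore has to be rolled back.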

[rancher@master8 ~]$ rke up --config cluster.yml
INFO[0000] Running RKE version: v1.6.5                  
INFO[0000] Initiating Kubernetes cluster                
INFO[0000] [certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates 
INFO[0000] [certificates] Generating Kubernetes API server certificates 
INFO[0000] [certificates] Generating admin certificates and kubeconfig 
INFO[0000] [certificates] Generating kube-etcd-master6 certificate and key 
INFO[0000] [certificates] Generating kube-etcd-master7 certificate and key 
INFO[0000] [certificates] Generating kube-etcd-master8 certificate and key 
INFO[0000] Successfully Deployed state file at [./cluster.rkestate] 
INFO[0000] Building Kubernetes cluster                  
INFO[0000] [dialer] Setup tunnel for host [master7]     
INFO[0000] [dialer] Setup tunnel for host [master8]     
INFO[0000] [dialer] Setup tunnel for host [master6]     
FATA[0001] Unsupported Docker version found [27.4.1] on host [master8], supported versions are [1.13.x 17.03.x 17.06.x 17.09.x 18.06.x 18.09.x 19.03.x 20.10.x 23.0.x 24.0.x 25.0.x 26.0.x 26.1.x 27.0.x 27.1.x 27.2.x] 
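
The supported list in the FATA line can be turned into a quick pre-flight check, so an unsupported version is caught before rke up is run. A minimal sketch: the list is copied from the v1.6.5 log output above, and the function name docker_version_supported is made up for illustration.

```shell
# Version patterns copied from the rke v1.6.5 error message above.
SUPPORTED_DOCKER="1.13.x 17.03.x 17.06.x 17.09.x 18.06.x 18.09.x 19.03.x 20.10.x 23.0.x 24.0.x 25.0.x 26.0.x 26.1.x 27.0.x 27.1.x 27.2.x"

# docker_version_supported VERSION
# Succeed if VERSION starts with one of the supported major.minor prefixes.
docker_version_supported() {
  local version="$1" pattern
  for pattern in $SUPPORTED_DOCKER; do
    # "27.2.x" becomes the literal prefix "27.2." followed by anything.
    case "$version" in
      "${pattern%x}"*) return 0 ;;
    esac
  done
  return 1
}
```

On a node, the installed version could be passed in with docker_version_supported "$(docker version --format '{{.Server.Version}}')". With the versions seen in these notes, 27.4.1 is rejected while 27.2.1 and 19.03.8 pass.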

Reset the Docker environment

systemctl disable docker
sudo systemctl stop docker.socket
systemctl stop docker
dnf remove docker-ce docker-ce-cli containerd.io docker-compose-plugin -y
# Delete the Docker data
sudo rm -rf /var/lib/docker
sudo rm -rf /var/lib/containerd
rm -rf /home/docker
# Clean up leftover files; for a reinstall these two steps can be skipped
sudo rm -rf /etc/docker
sudo rm -rf /etc/systemd/system/docker.service.d
# List the available Docker versions
sudo yum list docker-ce --showduplicates | sort -r
# Install a specific Docker version
yum install docker-ce-27.2.1-1.el9 docker-ce-cli-27.2.1-1.el9 containerd.io -y
# Change the Docker data path
vi /lib/systemd/system/docker.service
# Reload the edited unit file, then start Docker
systemctl daemon-reload
systemctl start docker
systemctl enable docker

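3.2 rke cannot download images

Even after /etc/docker/daemon.json has been changed, rke up --config cluster.yml still fails to pull its images. Pulling the required images by hand on each node, as shown below, lets rke up --config cluster.yml continue.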

docker pull rancher/rke-tools:v0.1.105
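
When several images are involved, a small wrapper makes it clear which pulls failed. This is a sketch only; pull_images is a hypothetical helper, and the image names passed to it have to come from the actual error output.

```shell
# pull_images IMAGE...
# Pull every IMAGE with docker; return non-zero if any pull failed.
pull_images() {
  local image failed=0
  for image in "$@"; do
    if ! docker pull "$image"; then
      echo "failed to pull: $image" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

For the image named above the call is pull_images rancher/rke-tools:v0.1.105, run on each node before retrying rke up.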

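A screenshot taken during the run (not preserved here) showed that some of the Rancher-related images are quite large, listed as 16.GB, while other images were still downloading.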

3.3 canal installation fails

calico-kube-controllers also fails to install, but fixing the problem below resolves both.

# Run this to see the detailed error log
kubectl describe pod canal-5vznx -n kube-system
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  32m                   default-scheduler  Successfully assigned kube-system/canal-5vznx to master7
  Normal   Pulling    27m (x4 over 32m)     kubelet            Pulling image "rancher/calico-cni:v3.28.1-rancher1"
  Warning  Failed     25m (x4 over 31m)     kubelet            Error: ErrImagePull
  Warning  Failed     24m (x7 over 31m)     kubelet            Error: ImagePullBackOff
  Warning  Failed     11m (x7 over 31m)     kubelet            Failed to pull image "rancher/calico-cni:v3.28.1-rancher1": rpc error: code = Canceled desc = context canceled
  Normal   BackOff    2m44s (x77 over 31m)  kubelet            Back-off pulling image "rancher/calico-cni:v3.28.1-rancher1"
# So pull the images by hand
docker pull rancher/calico-cni:v3.28.1-rancher1
docker pull rancher/mirrored-calico-node:v3.28.1
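
The image names to pull can be extracted from the event log rather than read off by eye. A minimal sketch that filters kubectl describe output for the "Failed to pull image" events; failed_images is an illustrative name.

```shell
# failed_images
# Read `kubectl describe pod` output on stdin and print, once each,
# every image named in a "Failed to pull image" event.
failed_images() {
  sed -n 's/.*Failed to pull image "\([^"]*\)".*/\1/p' | sort -u
}
```

A possible use is kubectl describe pod canal-5vznx -n kube-system | failed_images, with each printed name then passed to docker pull on the affected node.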

3.5 kuboard installation fails

The symptom is the same: the image cannot be pulled. Here the cause is that kuboard needs an image pull secret in order to download its image from the local harbor registry.

Events:
  Type     Reason                           Age                From               Message
  ----     ------                           ----               ----               -------
  Normal   Scheduled                        46s                default-scheduler  Successfully assigned kube-system/kuboard-559bccdc6-zf67z to master6
  Normal   BackOff                          18s (x2 over 44s)  kubelet            Back-off pulling image "10.101.10.2:8081/mid/eipwork/kuboard:latest"
  Warning  Failed                           18s (x2 over 44s)  kubelet            Error: ImagePullBackOff
  Warning  FailedToRetrieveImagePullSecret  3s (x5 over 46s)   kubelet            Unable to retrieve some image pull secrets (regcred); attempting to pull the image may not succeed.
  Normal   Pulling                          3s (x3 over 45s)   kubelet            Pulling image "10.101.10.2:8081/mid/eipwork/kuboard:latest"
  Warning  Failed                           3s (x3 over 45s)   kubelet            Failed to pull image "10.101.10.2:8081/mid/eipwork/kuboard:latest": Error response from daemon: unauthorized: unauthorized to access repository: mid/eipwork/kuboard, action: pull: unauthorized to access repository: mid/eipwork/kuboard, action: pull
  Warning  Failed                           3s (x3 over 45s)   kubelet            Error: ErrImagePull
kubectl create secret docker-registry regcred \
  --docker-server=http://<harbor-ip>:<port> \
  --docker-username=<username> \
  --docker-password=<password> \
  --docker-email=<email> \
  -n kube-system

Next, retrieve the kuboard token:

echo $(kubectl -n kube-system get secret $(kubectl -n kube-system get secret | grep kuboard-user | awk '{print $1}') -o go-template='{{.data.token}}' | base64 -d)

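3.6 kuboard cannot obtain a token

In the past, running the command above was all it took, but this time kuboard did not create the corresponding secret. Inspecting the service account confirms that it has no secret attached. This matches the behavior of Kubernetes v1.24 and later, which no longer create a token Secret automatically for each ServiceAccount.

kubectl get serviceaccount kuboard-user -n kube-system -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"name":"kuboard-user","namespace":"kube-system"}}
  creationTimestamp: "2024-12-21T07:24:12Z"
  name: kuboard-user
  namespace: kube-system
  resourceVersion: "3491"
  uid: 7d46c0a1-07e9-4cb2-ad99-00b7e6091151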

The fix is to create the secret by hand. After that, the command shown earlier reads the token from the secret, and the token can be used to log in to the kuboard web page.

# This command creates a new token Secret
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: kuboard-user-token
  namespace: kube-system
  annotations:
    kubernetes.io/service-account.name: kuboard-user
type: kubernetes.io/service-account-token
EOF
# Associate the newly created Secret with the ServiceAccount
kubectl patch serviceaccount kuboard-user -n kube-system --patch '{"secrets":[{"name":"kuboard-user-token"}]}'
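
Once the Secret exists, its token can also be read by name instead of searching the secret list with grep and awk. A minimal sketch; read_sa_token is a hypothetical helper name, and it assumes the Secret is called kuboard-user-token as in the manifest above.

```shell
# read_sa_token NAMESPACE SECRET
# Print the decoded service account token stored in SECRET.
read_sa_token() {
  local namespace="$1" secret="$2"
  # .data.token is base64-encoded, so decode it before printing.
  kubectl -n "$namespace" get secret "$secret" -o jsonpath='{.data.token}' \
    | base64 -d
}
```

For the account in these notes the call would be read_sa_token kube-system kuboard-user-token. If the output is empty, the token controller has not populated the Secret yet, which usually means the service-account annotation does not match an existing ServiceAccount.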

