Environment overview
In Kubernetes, scaling nodes in and out is usually automated by the public cloud provider, but when scaling in, the pods running on a node that is being removed are not reclaimed automatically by the cloud, so some manual work is still needed. I am using a self-built environment here because the procedure is the same on a public or a private cloud; the public cloud merely wraps it. My cluster has three nodes: one master and two worker nodes. First, take a look at my Kubernetes environment.
[root@master ~]# kubectl get node -o wide
NAME     STATUS   ROLES                  AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION         CONTAINER-RUNTIME
master   Ready    control-plane,master   8d    v1.23.0   10.211.55.5   <none>        CentOS Linux 8 (Core)   4.18.0-80.el8.x86_64   docker://23.0.0
node1    Ready    <none>                 8d    v1.23.0   10.211.55.6   <none>        CentOS Linux 8 (Core)   4.18.0-80.el8.x86_64   docker://23.0.0
node2    Ready    <none>                 8d    v1.23.0   10.211.55.7   <none>        CentOS Linux 8 (Core)   4.18.0-80.el8.x86_64   docker://23.0.0
What I am going to do is delete the node2 node. Since node2 had no pods on it, I first created some manually. They were created like this:
[root@master ~]# kubectl create deployment nginx-v5 --image=nginx
deployment.apps/nginx-v5 created
[root@master ~]# kubectl expose deployment nginx-v5 --port=80 --type=NodePort
I do not actually need the Service here, so the expose step is optional; the Deployments only exist to give the node something to evict when it is drained, so whether or not NodePort is used makes no difference. You can see that the nginx pods I created have been scheduled onto node2.
[root@master ~]# kubectl get pod -A -o wide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS         AGE    IP               NODE     NOMINATED NODE   READINESS GATES
default       nginx-85b98978db-brlfd                     1/1     Running   1 (6d23h ago)    7d2h   10.244.166.135   node1    <none>           <none>
default       nginx-v1-7d48f885fb-225sg                  1/1     Running   2 (6d23h ago)    8d     10.244.166.136   node1    <none>           <none>
default       nginx-v2-5f45d8768c-8bm48                  1/1     Running   1 (6d23h ago)    7d2h   10.244.166.137   node1    <none>           <none>
default       nginx-v3-7599d6fb5d-zpdr6                  1/1     Running   2 (6d23h ago)    8d     10.244.166.138   node1    <none>           <none>
default       nginx-v4-6989b5cbbf-72jfd                  1/1     Running   0                11m    10.244.104.3     node2    <none>           <none>
default       nginx-v5-8454d48d76-w66rm                  1/1     Running   0                20s    10.244.104.4     node2    <none>           <none>
kube-system   calico-kube-controllers-64cc74d646-jkfkj   1/1     Running   5 (6d23h ago)    8d     10.244.219.69    master   <none>           <none>
kube-system   calico-node-8wwhz                          1/1     Running   2 (6d23h ago)    8d     10.211.55.7      node2    <none>           <none>
kube-system   calico-node-pncp5                          1/1     Running   2 (6d23h ago)    8d     10.211.55.6      node1    <none>           <none>
kube-system   calico-node-tssn2                          1/1     Running   2 (6d23h ago)    8d     10.211.55.5      master   <none>           <none>
kube-system   coredns-6d8c4cb4d-2mz79                    1/1     Running   4 (6d23h ago)    8d     10.244.219.70    master   <none>           <none>
kube-system   coredns-6d8c4cb4d-8p8ld                    1/1     Running   4 (6d23h ago)    8d     10.244.219.71    master   <none>           <none>
kube-system   etcd-master                                1/1     Running   2 (6d23h ago)    8d     10.211.55.5      master   <none>           <none>
kube-system   kube-apiserver-master                      1/1     Running   4 (6d23h ago)    8d     10.211.55.5      master   <none>           <none>
kube-system   kube-controller-manager-master             1/1     Running   18 (6d23h ago)   8d     10.211.55.5      master   <none>           <none>
kube-system   kube-proxy-4w7k2                           1/1     Running   3 (6d23h ago)    8d     10.211.55.7      node2    <none>           <none>
kube-system   kube-proxy-7xgll                           1/1     Running   2 (6d23h ago)    8d     10.211.55.6      node1    <none>           <none>
kube-system   kube-proxy-b2ghj                           1/1     Running   2 (6d23h ago)    8d     10.211.55.5      master   <none>           <none>
kube-system   kube-scheduler-master                      1/1     Running   19 (73m ago)     8d     10.211.55.5      master   <none>           <none>
Deleting the node
Before deleting a node we want to avoid disrupting the applications running on it, so we first drain the node: the workloads on the node to be deleted are evicted and rescheduled elsewhere automatically, and only then do we delete the node. Note that draining is not the same as tainting: kubectl drain first cordons the node (marks it unschedulable) and then evicts the pods on it.
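If you only want to stop new pods from landing on the node without evicting anything yet, cordoning works as a standalone step (a minimal sketch; the full drain follows below):

# Mark node2 unschedulable; pods already on it keep running
kubectl cordon node2
# Undo it if you change your mind
kubectl uncordon node2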
[root@master ~]# kubectl drain node2 --delete-local-data --force --ignore-daemonsets
Flag --delete-local-data has been deprecated, This option is deprecated and will be deleted. Use --delete-emptydir-data.
node/node2 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-8wwhz, kube-system/kube-proxy-4w7k2
evicting pod default/nginx-v5-8454d48d76-w66rm
evicting pod default/nginx-v4-6989b5cbbf-72jfd
pod/nginx-v5-8454d48d76-w66rm evicted
pod/nginx-v4-6989b5cbbf-72jfd evicted
node/node2 drained
[root@master ~]# kubectl get pod -A -o wide
NAMESPACE     NAME                                       READY   STATUS    RESTARTS         AGE    IP               NODE     NOMINATED NODE   READINESS GATES
default       nginx-85b98978db-brlfd                     1/1     Running   1 (6d23h ago)    7d3h   10.244.166.135   node1    <none>           <none>
default       nginx-v1-7d48f885fb-225sg                  1/1     Running   2 (6d23h ago)    8d     10.244.166.136   node1    <none>           <none>
default       nginx-v2-5f45d8768c-8bm48                  1/1     Running   1 (6d23h ago)    7d3h   10.244.166.137   node1    <none>           <none>
default       nginx-v3-7599d6fb5d-zpdr6                  1/1     Running   2 (6d23h ago)    8d     10.244.166.138   node1    <none>           <none>
default       nginx-v4-6989b5cbbf-nflhw                  1/1     Running   0                12s    10.244.166.140   node1    <none>           <none>
default       nginx-v5-8454d48d76-8jfq5                  1/1     Running   0                12s    10.244.166.139   node1    <none>           <none>
kube-system   calico-kube-controllers-64cc74d646-jkfkj   1/1     Running   5 (6d23h ago)    8d     10.244.219.69    master   <none>           <none>
kube-system   calico-node-8wwhz                          1/1     Running   2 (6d23h ago)    8d     10.211.55.7      node2    <none>           <none>
kube-system   calico-node-pncp5                          1/1     Running   2 (6d23h ago)    8d     10.211.55.6      node1    <none>           <none>
kube-system   calico-node-tssn2                          1/1     Running   2 (6d23h ago)    8d     10.211.55.5      master   <none>           <none>
kube-system   coredns-6d8c4cb4d-2mz79                    1/1     Running   4 (6d23h ago)    8d     10.244.219.70    master   <none>           <none>
kube-system   coredns-6d8c4cb4d-8p8ld                    1/1     Running   4 (6d23h ago)    8d     10.244.219.71    master   <none>           <none>
kube-system   etcd-master                                1/1     Running   2 (6d23h ago)    8d     10.211.55.5      master   <none>           <none>
kube-system   kube-apiserver-master                      1/1     Running   4 (6d23h ago)    8d     10.211.55.5      master   <none>           <none>
kube-system   kube-controller-manager-master             1/1     Running   18 (6d23h ago)   8d     10.211.55.5      master   <none>           <none>
kube-system   kube-proxy-4w7k2                           1/1     Running   3 (6d23h ago)    8d     10.211.55.7      node2    <none>           <none>
kube-system   kube-proxy-7xgll                           1/1     Running   2 (6d23h ago)    8d     10.211.55.6      node1    <none>           <none>
kube-system   kube-proxy-b2ghj                           1/1     Running   2 (6d23h ago)    8d     10.211.55.5      master   <none>           <none>
kube-system   kube-scheduler-master                      1/1     Running   19 (84m ago)     8d     10.211.55.5      master   <none>           <none>
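As the warning in the output notes, --delete-local-data is deprecated. On newer kubectl releases the same drain would be written with the replacement flag (a sketch of the equivalent command):

kubectl drain node2 --delete-emptydir-data --force --ignore-daemonsets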
After draining node2 there are no application pods left on it; what remains are only Kubernetes components (the DaemonSet-managed calico-node and kube-proxy pods), so the existing workloads are not affected. Now that node2 is empty, all that is left is to remove it from the cluster.
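To double-check that only DaemonSet pods remain on node2 before deleting it, the pod list can be filtered by node with a field selector (a minimal sketch):

# List every pod currently scheduled on node2
kubectl get pod -A -o wide --field-selector spec.nodeName=node2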
[root@master ~]# kubectl delete nodes node2
node "node2" deleted
[root@master ~]# kubectl get node
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   8d    v1.23.0
node1    Ready    <none>                 8d    v1.23.0
With that, node2 has been completely removed from the cluster and the node-deletion procedure is done. Below is an extension to the topic.
Extension
This extension covers re-adding the deleted node2 to the cluster. I did nothing to node2 after removing it, so let's try joining it directly and, if that fails, see what needs attention. Before adding the node we need to create a new token on the master, because a bootstrap token is only valid for 24 hours; create the token first, then run the join command on node2.
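If you are not sure whether a usable token still exists, the current tokens and their expiry can be listed first, and a token with a custom lifetime can be created if needed (a minimal sketch; the default-TTL token I actually used is created below):

# Show existing bootstrap tokens and when they expire
kubeadm token list
# Create a short-lived token and print the matching join command
kubeadm token create --ttl 2h --print-join-command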
[root@master ~]# kubeadm token create --print-join-command
kubeadm join 10.211.55.5:6443 --token olvh4t.rzflkeyrmceemscc --discovery-token-ca-cert-hash sha256:b748add9c4d2077777c1ff3c283cceec928f504647ca704106da8e887151b8f7
[root@node2 ~]# kubeadm join 10.211.55.5:6443 --token olvh4t.rzflkeyrmceemscc --discovery-token-ca-cert-hash sha256:b748add9c4d2077777c1ff3c283cceec928f504647ca704106da8e887151b8f7
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 23.0.0. Latest validated version: 20.10
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
	[ERROR Port-10250]: Port 10250 is in use
	[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
When node2 tried to join, the preflight checks reported that port 10250 is already in use and that the old kubelet configuration and CA files still exist. So on node2 we run kubeadm reset, restart docker and kubelet, and delete the files left over from the previous join.
[root@node2 ~]# kubeadm reset
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0215 11:57:14.531663   28038 removeetcdmember.go:80] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
[root@node2 ~]# systemctl stop kubelet
[root@node2 ~]# systemctl stop docker
Warning: Stopping docker.service, but it can still be activated by: docker.socket
[root@node2 ~]# rm -rf /var/lib/cni/
[root@node2 ~]# rm -rf /var/lib/kubelet/*
[root@node2 ~]# rm -rf /etc/cni/
[root@node2 ~]# systemctl start docker
[root@node2 ~]# systemctl start kubelet
[root@node2 ~]# kubeadm join 10.211.55.5:6443 --token olvh4t.rzflkeyrmceemscc --discovery-token-ca-cert-hash sha256:b748add9c4d2077777c1ff3c283cceec928f504647ca704106da8e887151b8f7
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 23.0.0. Latest validated version: 20.10
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
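The reset output above also points out that iptables and IPVS rules are not cleaned automatically. I did not need to in this case, but if stale rules cause trouble after rejoining, they can be flushed by hand on node2 (a sketch following the hints in the reset output):

# Flush the iptables rules left behind by kube-proxy
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
# If kube-proxy ran in IPVS mode, clear the IPVS tables as well
ipvsadm --clear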
Here we removed the CNI and kubelet state, restarted the services, and joined the cluster again. node2 reports that the join succeeded; now check on the master whether it really did.
[root@master ~]# kubectl get node
NAME     STATUS     ROLES                  AGE   VERSION
master   Ready      control-plane,master   8d    v1.23.0
node1    Ready      <none>                 8d    v1.23.0
node2    NotReady   <none>                 20s   v1.23.0
[root@master ~]# kubectl get node
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   8d    v1.23.0
node1    Ready    <none>                 8d    v1.23.0
node2    Ready    <none>                 21s   v1.23.0
Right after joining, node2 shows NotReady; that is fine, just wait a moment and it becomes Ready. If there are still problems after the join, check whether the network plugin on node2 is healthy, and use kubectl describe to see the details. Commonly used Kubernetes commands are listed here: https://www.wulaoer.org/?p=2736 . Only now is the Kubernetes node deletion (and re-addition) procedure really complete. That's all for this one.
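If node2 stays NotReady for longer than expected, these two checks usually point at the cause (a minimal sketch):

# Node conditions and recent events
kubectl describe node node2
# Is the CNI pod (calico-node) on node2 running?
kubectl get pod -n kube-system -o wide --field-selector spec.nodeName=node2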