AskOverflow.Dev

Windowlicker
Asked: 2020-08-12 01:03:58 +0800 CST

Removing a control node from the cluster kills the apiserver

When I have a Kubernetes cluster with multiple control nodes and delete one of them, the whole API server appears to become unavailable.

In this setup I wanted to scale down from two control nodes to one, but ended up with an unusable cluster:

$ kubectl get nodes
NAME      STATUS   ROLES    AGE     VERSION
master1   Ready    master   5d20h   v1.18.6
worker1   Ready    <none>   5d19h   v1.18.6
master2   Ready    master   19h     v1.18.6
$ kubectl drain master2 --ignore-daemonsets
node/master2 cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/calico-node-hns7p, kube-system/kube-proxy-vk6t7
node/master2 drained
$ kubectl get nodes
NAME      STATUS                     ROLES    AGE     VERSION
master1   Ready                      master   5d20h   v1.18.6
worker1   Ready                      <none>   5d20h   v1.18.6
master2   Ready,SchedulingDisabled   master   19h     v1.18.6
$ kubectl delete node master2
node "master2" deleted
$ kubectl get nodes
NAME      STATUS   ROLES    AGE     VERSION
master1   Ready    master   5d20h   v1.18.6
worker1   Ready    <none>   5d20h   v1.18.6
$ ssh master2
$ sudo kubeadm reset
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
W0811 10:24:49.750898    7159 reset.go:99] [reset] Unable to fetch the kubeadm-config ConfigMap from cluster: failed to get node registration: failed to get corresponding node: nodes "master2" not found
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0811 10:24:51.487912    7159 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
$ exit
$ kubectl get nodes
Error from server: etcdserver: request timed out
$ kubectl cluster-info

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
The connection to the server master1:6443 was refused - did you specify the right host or port?

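What am I missing here? Put differently, how does removing a control plane node differ from removing a worker node? Pointers appreciated.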

kubernetes

1 Answer

Best Answer
Matt
2020-08-13T00:58:19+08:00

You have two master nodes, which also means you have two etcd replicas.

In the etcd documentation you can read:

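It is recommended to have an odd number of members in a cluster. An odd-size cluster tolerates the same number of failures as an even-size cluster but with fewer nodes. The difference can be seen by comparing even and odd sized clusters: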

Cluster Size    Majority    Failure Tolerance
1               1           0
2               2           0
3               2           1

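As you can see, an etcd cluster of size 2 requires all replicas to be working and cannot tolerate any failure. That is why an odd number of etcd replicas is strongly recommended.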
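
The figures in the table follow from the majority rule: a cluster of n members needs floor(n/2) + 1 of them to agree, and it tolerates n minus that many failures. A small sketch that reproduces the rows (the function names `majority` and `tolerance` are illustrative and not part of etcd):

```shell
#!/bin/sh
# Majority (quorum) needed for an etcd cluster of $1 members.
majority() { echo $(( $1 / 2 + 1 )); }

# Number of member failures a cluster of $1 members can survive.
tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

# Print one row per cluster size: size, majority, failure tolerance.
for n in 1 2 3 4 5; do
  printf '%-15s %-11s %s\n' "$n" "$(majority "$n")" "$(tolerance "$n")"
done
```

For four members the tolerance is still 1, the same as for three, which is the point the etcd documentation makes about even sizes.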

So I believe you now understand why your cluster went down.
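
Applied to the question: the `kubeadm reset` log shows that it could not read the cluster configuration once the Node object had been deleted, so master2's etcd member was most likely still registered when its data directory was wiped. The member on master1 was then one of two and could not reach a majority. A minimal sketch of removing the member explicitly while both members are still healthy, before resetting the node, might look like the following. It assumes a kubeadm stacked-etcd layout: the pod name `etcd-master1`, the endpoint `https://127.0.0.1:2379`, the certificate paths under `/etc/kubernetes/pki/etcd`, and etcd member names matching node names are kubeadm defaults and are assumptions here. `etcdctl_on_master1`, `member_id_for`, and `remove_etcd_member` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch only: pod name, endpoint, and certificate paths are assumed
# kubeadm defaults and may differ on a given cluster.

# Run etcdctl inside the etcd static pod on the surviving control node.
etcdctl_on_master1() {
  kubectl -n kube-system exec etcd-master1 -- etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    "$@"
}

# Print the ID of the member named $1. Reads `etcdctl member list`
# output on stdin (default format: "ID, status, name, peer URLs, ...").
member_id_for() {
  awk -F', ' -v name="$1" '$3 == name { print $1 }'
}

# Remove the etcd member named $1 while the cluster still has quorum.
remove_etcd_member() {
  id=$(etcdctl_on_master1 member list | member_id_for "$1")
  if [ -z "$id" ]; then
    echo "no etcd member named $1" >&2
    return 1
  fi
  etcdctl_on_master1 member remove "$id"
}

# Usage (not executed here): remove_etcd_member master2
```

Once the member is gone, etcd on master1 is a one-member cluster with a majority of 1, and master2 can be drained, reset, and deleted. Running `kubeadm reset` on master2 before `kubectl delete node master2` should have a similar effect, since reset can then still read the cluster configuration and remove its local etcd member itself.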

Also have a look at the Kubernetes documentation on kubeadm: high availability topology.
