AskOverflow.Dev

AskOverflow.Dev Logo AskOverflow.Dev Logo

AskOverflow.Dev Navigation

  • 主页
  • 系统&网络
  • Ubuntu
  • Unix
  • DBA
  • Computer
  • Coding
  • LangChain

Mobile menu

Close
  • 主页
  • 系统&网络
    • 最新
    • 热门
    • 标签
  • Ubuntu
    • 最新
    • 热门
    • 标签
  • Unix
    • 最新
    • 标签
  • DBA
    • 最新
    • 标签
  • Computer
    • 最新
    • 标签
  • Coding
    • 最新
    • 标签
主页 / user-563121

caprica's questions

Martin Hope
caprica
Asked: 2020-03-08 03:11:41 +0800 CST

EKS 集群节点在授权失败大约 30 分钟后从 Ready 变为 NotReady

  • 1

我正在使用 eksctl 在 EKS/AWS 上设置集群。

按照 EKS 文档中的指南,我对几乎所有内容都使用默认值。

集群创建成功,我从集群更新了 Kubernetes 配置,我可以成功运行各种 kubectl 命令——例如“kubectl get nodes”显示节点处于“就绪”状态。

我没有碰任何其他东西,我有一个开箱即用的干净集群,没有进行任何其他更改,到目前为止,一切似乎都按预期工作。我不向它部署任何应用程序,我只是不理会它。

问题是在一段相对较短的时间后(大约在集群创建后 30 分钟),节点从“Ready”变为“NotReady”并且它永远不会恢复。

事件日志显示了这一点(我编辑了 IP):

LAST SEEN   TYPE     REASON                    OBJECT        MESSAGE
22m         Normal   Starting                  node/ip-[x]   Starting kubelet.
22m         Normal   NodeHasSufficientMemory   node/ip-[x]   Node ip-[x] status is now: NodeHasSufficientMemory
22m         Normal   NodeHasNoDiskPressure     node/ip-[x]   Node ip-[x] status is now: NodeHasNoDiskPressure
22m         Normal   NodeHasSufficientPID      node/ip-[x]   Node ip-[x] status is now: NodeHasSufficientPID
22m         Normal   NodeAllocatableEnforced   node/ip-[x]   Updated Node Allocatable limit across pods
22m         Normal   RegisteredNode            node/ip-[x]   Node ip-[x] event: Registered Node ip-[x] in Controller
22m         Normal   Starting                  node/ip-[x]   Starting kube-proxy.
21m         Normal   NodeReady                 node/ip-[x]   Node ip-[x] status is now: NodeReady
7m34s       Normal   NodeNotReady              node/ip-[x]   Node ip-[x] status is now: NodeNotReady

集群中其他节点的相同事件。

连接到实例并检查 /var/log/messages 会在节点进入 NotReady 的同时显示这一点:

Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.259207    3896 kubelet_node_status.go:385] Error updating node status, will retry: error getting node "ip-[x]": Unauthorized
Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.385044    3896 kubelet_node_status.go:385] Error updating node status, will retry: error getting node "ip-[x]": Unauthorized
Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.621271    3896 reflector.go:270] object-"kube-system"/"aws-node-token-bdxwv": Failed to watch *v1.Secret: the server has asked for the client to provide credentials (get secrets)
Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.621320    3896 reflector.go:270] object-"kube-system"/"coredns": Failed to watch *v1.ConfigMap: the server has asked for the client to provide credentials (get configmaps)
Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.638850    3896 reflector.go:270] k8s.io/client-go/informers/factory.go:133: Failed to watch *v1beta1.RuntimeClass: the server has asked for the client to provide credentials (get runtimeclasses.node.k8s.io)
Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.707074    3896 reflector.go:270] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to watch *v1.Pod: the server has asked for the client to provide credentials (get pods)
Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.711386    3896 reflector.go:270] object-"kube-system"/"coredns-token-67fzd": Failed to watch *v1.Secret: the server has asked for the client to provide credentials (get secrets)
Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.714899    3896 reflector.go:270] object-"kube-system"/"kube-proxy-config": Failed to watch *v1.ConfigMap: the server has asked for the client to provide credentials (get configmaps)
Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.720884    3896 kubelet_node_status.go:385] Error updating node status, will retry: error getting node "ip-[x]": Unauthorized
Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.868003    3896 kubelet_node_status.go:385] Error updating node status, will retry: error getting node "ip-[x]": Unauthorized
Mar  7 10:40:37 ip-[X] kubelet: E0307 10:40:37.868067    3896 controller.go:125] failed to ensure node lease exists, will retry in 200ms, error: Get https://[X]/apis/coordination.k8s.io/v1beta1/namespaces/kube-node-lease/leases/ip-[x]?timeout=10s: write tcp 192.168.91.167:50866->34.249.27.158:443: use of closed network connection
Mar  7 10:40:38 ip-[X] kubelet: E0307 10:40:38.017157    3896 kubelet_node_status.go:385] Error updating node status, will retry: error getting node "ip-[x]": Unauthorized
Mar  7 10:40:38 ip-[X] kubelet: E0307 10:40:38.017182    3896 kubelet_node_status.go:372] Unable to update node status: update node status exceeds retry count
Mar  7 10:40:38 ip-[X] kubelet: E0307 10:40:38.200053    3896 controller.go:125] failed to ensure node lease exists, will retry in 400ms, error: Unauthorized
Mar  7 10:40:38 ip-[X] kubelet: E0307 10:40:38.517193    3896 reflector.go:270] object-"kube-system"/"kube-proxy": Failed to watch *v1.ConfigMap: the server has asked for the client to provide credentials (get configmaps)
Mar  7 10:40:38 ip-[X] kubelet: E0307 10:40:38.729756    3896 controller.go:125] failed to ensure node lease exists, will retry in 800ms, error: Unauthorized
Mar  7 10:40:38 ip-[X] kubelet: E0307 10:40:38.752267    3896 reflector.go:126] object-"kube-system"/"aws-node-token-bdxwv": Failed to list *v1.Secret: Unauthorized
Mar  7 10:40:38 ip-[X] kubelet: E0307 10:40:38.824988    3896 reflector.go:126] object-"kube-system"/"coredns": Failed to list *v1.ConfigMap: Unauthorized
Mar  7 10:40:38 ip-[X] kubelet: E0307 10:40:38.899566    3896 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.RuntimeClass: Unauthorized
Mar  7 10:40:38 ip-[X] kubelet: E0307 10:40:38.963756    3896 reflector.go:126] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Unauthorized
Mar  7 10:40:38 ip-[X] kubelet: E0307 10:40:38.963822    3896 reflector.go:126] object-"kube-system"/"kube-proxy-config": Failed to list *v1.ConfigMap: Unauthorized

身份验证器组件的 CloudWatch 日志显示其中许多消息:

time="2020-03-07T10:40:37Z" level=warning msg="access denied" arn="arn:aws:iam::[ACCOUNT_ID]]:role/AmazonSSMRoleForInstancesQuickSetup" client="127.0.0.1:50132" error="ARN is not mapped: arn:aws:iam::[ACCOUNT_ID]:role/amazonssmroleforinstancesquicksetup" method=POST path=/authenticate

我通过 IAM 控制台确认该角色确实存在。

显然,由于这些身份验证失败,该节点正在报告 NotReady。

这是一些在大约 30 分钟后超时的身份验证令牌,如果是这样,是否不应该自动请求新的令牌?还是我应该设置其他东西?

我很惊讶由 eksctl 创建的新集群会显示此问题。

我错过了什么?

kubernetes amazon-eks
  • 3 个回答
  • 4276 Views

Sidebar

Stats

  • 问题 205573
  • 回答 270741
  • 最佳答案 135370
  • 用户 68524
  • 热门
  • 回答
  • Marko Smith

    新安装后 postgres 的默认超级用户用户名/密码是什么?

    • 5 个回答
  • Marko Smith

    SFTP 使用什么端口?

    • 6 个回答
  • Marko Smith

    命令行列出 Windows Active Directory 组中的用户?

    • 9 个回答
  • Marko Smith

    什么是 Pem 文件,它与其他 OpenSSL 生成的密钥文件格式有何不同?

    • 3 个回答
  • Marko Smith

    如何确定bash变量是否为空?

    • 15 个回答
  • Martin Hope
    Tom Feiner 如何按大小对 du -h 输出进行排序 2009-02-26 05:42:42 +0800 CST
  • Martin Hope
    Noah Goodrich 什么是 Pem 文件,它与其他 OpenSSL 生成的密钥文件格式有何不同? 2009-05-19 18:24:42 +0800 CST
  • Martin Hope
    Brent 如何确定bash变量是否为空? 2009-05-13 09:54:48 +0800 CST
  • Martin Hope
    cletus 您如何找到在 Windows 中打开文件的进程? 2009-05-01 16:47:16 +0800 CST

热门标签

linux nginx windows networking ubuntu domain-name-system amazon-web-services active-directory apache-2.4 ssh

Explore

  • 主页
  • 问题
    • 最新
    • 热门
  • 标签
  • 帮助

Footer

AskOverflow.Dev

关于我们

  • 关于我们
  • 联系我们

Legal Stuff

  • Privacy Policy

Language

  • Pt
  • Server
  • Unix

© 2023 AskOverflow.DEV All Rights Reserve