AskOverflow.Dev

bulletshot60's questions

bulletshot60
Asked: 2024-03-14 23:08:23 +0800 CST

Terraform `aws_eks_node_group` nodes are Ready, but creation never completes

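I have a Terraform setup in which I create a new launch template and a node group. Without the launch template, everything works fine. With the launch template, the nodes become Ready, but the node group never finishes creating.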

main.tf

...

resource "aws_launch_template" "this" {
  block_device_mappings {
    device_name = "/dev/xvda"

    ebs {
      volume_type           = var.block_device_mappings.type
      volume_size           = var.block_device_mappings.size
      iops                  = var.block_device_mappings.iops
      kms_key_id            = var.block_device_mappings.kms_key_id
      encrypted             = var.block_device_mappings.encrypted
      delete_on_termination = var.block_device_mappings.delete_on_termination
    }
  }

  user_data = base64encode(templatefile("${path.module}/user_data.tpl", {
    cluster_endpoint           = var.cluster_endpoint
    certificate_authority_data = var.certificate_authority_data
    bootstrap_extra_args       = "--use-max-pods false"
    cluster_name               = var.cluster_name
  }))
}

resource "aws_eks_node_group" "this" {
  cluster_name    = var.cluster_name
  node_group_name = var.node_group_name
  node_role_arn   = var.node_group_arn
  instance_types  = [var.instance_type]
  subnet_ids = [
    for subnet in var.subnets : subnet.id
  ]
  capacity_type = var.capacity_type

  scaling_config {
    desired_size = var.desired_capacity
    max_size     = var.max_capacity
    min_size     = var.min_capacity
  }

  update_config {
    max_unavailable = 1
  }

  labels = var.node_group_labels

  dynamic "taint" {
    for_each = toset(var.node_group_taints)

    content {
      key    = taint.value.key
      value  = taint.value.value
      effect = taint.value.effect
    }
  }

  launch_template {
    id      = aws_launch_template.this.id
    version = aws_launch_template.this.latest_version
  }
}

...

user_data.tpl

MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="/:/+++"

--/:/+++
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash

/etc/eks/bootstrap.sh --apiserver-endpoint '${cluster_endpoint}' --b64-cluster-ca '${certificate_authority_data}' ${bootstrap_extra_args} '${cluster_name}'

--/:/+++--

kubectl get nodes

NAME                                          STATUS   ROLES    AGE   VERSION
ip-192-168-1-128.us-west-1.compute.internal   Ready    <none>   13m   v1.29.0-eks-5e0fdde
ip-192-168-1-140.us-west-1.compute.internal   Ready    <none>   13m   v1.29.0-eks-5e0fdde
ip-192-168-1-157.us-west-1.compute.internal   Ready    <none>   13m   v1.29.0-eks-5e0fdde

kubectl describe node ip-192-168-1-128.us-west-1.compute.internal

Name:               ip-192-168-1-128.us-west-1.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m5.4xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-west-1
                    failure-domain.beta.kubernetes.io/zone=us-west-1a
                    k8s.io/cloud-provider-aws=cff041cdc91d38d182baa77beef8bf9f
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-192-168-1-128.us-west-1.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=m5.2xlarge
                    topology.kubernetes.io/region=us-west-1
                    topology.kubernetes.io/zone=us-west-1a
Annotations:        alpha.kubernetes.io/provided-node-ip: 192.168.1.128
                    csi.volume.kubernetes.io/nodeid: {"csi.tigera.io":"ip-192-168-1-128.us-gov-west-1.compute.internal"}
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 192.168.1.128/24
                    projectcalico.org/IPv4VXLANTunnelAddr: 10.42.7.192
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 14 Mar 2024 10:40:54 -0400
Taints:             <none>
Unschedulable:      false
Lease:
  HolderIdentity:  ip-192-168-1-128.us-west-1.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Thu, 14 Mar 2024 10:54:21 -0400
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Thu, 14 Mar 2024 10:41:24 -0400   Thu, 14 Mar 2024 10:41:24 -0400   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Thu, 14 Mar 2024 10:52:09 -0400   Thu, 14 Mar 2024 10:40:54 -0400   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Thu, 14 Mar 2024 10:52:09 -0400   Thu, 14 Mar 2024 10:40:54 -0400   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Thu, 14 Mar 2024 10:52:09 -0400   Thu, 14 Mar 2024 10:40:54 -0400   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Thu, 14 Mar 2024 10:52:09 -0400   Thu, 14 Mar 2024 10:41:18 -0400   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:   192.168.1.128
  InternalDNS:  ip-192-168-1-128.us-west-1.compute.internal
  Hostname:     ip-192-168-1-128.us-west-1.compute.internal
Capacity:
  cpu:                16
  ephemeral-storage:  20959212Ki
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             64333324Ki
  pods:               110
Allocatable:
  cpu:                15890m
  ephemeral-storage:  18242267924
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             61334028Ki
  pods:               110
System Info:
  Machine ID:                 ec2821bfac66895c1abc29a47021fe76
  System UUID:                ec2821bf-ac66-895c-1abc-29a47021fe76
  Boot ID:                    356d15db-1436-4c45-af1e-6a668eddd8e0
  Kernel Version:             5.10.210-201.852.amzn2.x86_64
  OS Image:                   Amazon Linux 2
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.7.11
  Kubelet Version:            v1.29.0-eks-5e0fdde
  Kube-Proxy Version:         v1.29.0-eks-5e0fdde
ProviderID:                   aws:///us-west-1a/i-0874068c9ab354407
Non-terminated Pods:          (6 in total)
  Namespace                   Name                                 CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                                 ------------  ----------  ---------------  -------------  ---
  calico-apiserver            calico-apiserver-5f98fdb745-cf4xg    0 (0%)        0 (0%)      0 (0%)           0 (0%)         12m
  calico-system               calico-node-6c98k                    0 (0%)        0 (0%)      0 (0%)           0 (0%)         13m
  calico-system               calico-typha-695fb789b5-sfq4n        0 (0%)        0 (0%)      0 (0%)           0 (0%)         13m
  calico-system               csi-node-driver-qtczs                0 (0%)        0 (0%)      0 (0%)           0 (0%)         13m
  ionic-system                tigera-operator-967f9fc76-tghqf      0 (0%)        0 (0%)      0 (0%)           0 (0%)         15m
  kube-system                 kube-proxy-cnlnc                     100m (0%)     0 (0%)      0 (0%)           0 (0%)         13m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests   Limits
  --------           --------   ------
  cpu                100m (0%)  0 (0%)
  memory             0 (0%)     0 (0%)
  ephemeral-storage  0 (0%)     0 (0%)
  hugepages-1Gi      0 (0%)     0 (0%)
  hugepages-2Mi      0 (0%)     0 (0%)
Events:
  Type     Reason                   Age                From                   Message
  ----     ------                   ----               ----                   -------
  Normal   Starting                 13m                kube-proxy             
  Normal   Synced                   13m                cloud-node-controller  Node synced successfully
  Normal   Starting                 13m                kubelet                Starting kubelet.
  Warning  InvalidDiskCapacity      13m                kubelet                invalid capacity 0 on image filesystem
  Normal   NodeHasSufficientMemory  13m (x2 over 13m)  kubelet                Node ip-192-168-1-128.us-west-1.compute.internal status is now: NodeHasSufficientMemory
  Normal   NodeHasNoDiskPressure    13m (x2 over 13m)  kubelet                Node ip-192-168-1-128.us-west-1.compute.internal status is now: NodeHasNoDiskPressure
  Normal   NodeHasSufficientPID     13m (x2 over 13m)  kubelet                Node ip-192-168-1-128.us-west-1.compute.internal status is now: NodeHasSufficientPID
  Normal   NodeAllocatableEnforced  13m                kubelet                Updated Node Allocatable limit across pods
  Normal   RegisteredNode           13m                node-controller        Node ip-192-168-1-128.us-west-1.compute.internal event: Registered Node ip-192-168-1-128.us-west-1.compute.internal in Controller
  Normal   NodeReady                13m                kubelet                Node ip-192-168-1-128.us-west-1.compute.internal status is now: NodeReady

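Even more confusingly, if I remove `--apiserver-endpoint '${cluster_endpoint}' --b64-cluster-ca '${certificate_authority_data}'` from the .tpl file, everything works fine, except that the max pod count is wrong (it drops to 58 because of the instance type).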
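For context, the figure of 58 is consistent with the ENI-based pod limit that the EKS AMI applies when `--use-max-pods` is left at its default. Below is a minimal sketch of that arithmetic, assuming the usual VPC CNI formula (without prefix delegation) and the published ENI limits of 4 interfaces with 15 IPv4 addresses each for m5.2xlarge and 8 interfaces with 30 addresses each for m5.4xlarge. The function name `max_pods` is illustrative only and is not part of any AWS tooling.

```shell
#!/usr/bin/env bash
# Default EKS pod limit under the AWS VPC CNI (no prefix delegation):
#   max_pods = ENIs * (IPv4 addresses per ENI - 1) + 2
# max_pods is a hypothetical helper name used only for this sketch.
max_pods() {
  local enis="$1" ips_per_eni="$2"
  echo $(( enis * (ips_per_eni - 1) + 2 ))
}

# ENI limits are assumptions taken from the EC2 instance type tables.
max_pods 4 15   # m5.2xlarge -> 58
max_pods 8 30   # m5.4xlarge -> 234
```

With `--use-max-pods false` the kubelet falls back to its own default of 110, which matches the `pods: 110` capacity in the `kubectl describe node` output above.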

Notes:

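  • We use Calico instead of the AWS node CNI. This is a project requirement, so I am stuck with it.
  • The only oddity that stands out so far is that the taints are populated when I run this without the above arguments and are not populated when I run it with them, but that may be a red herring.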
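One way to compare the two cases is to check whether the nodes carry the labels that EKS managed node groups normally apply, such as `eks.amazonaws.com/nodegroup`. None appear in the `kubectl describe node` output above. Whether that absence is related to the node group never completing is an assumption, not something confirmed here. A small sketch of such a check follows; `has_nodegroup_label` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Reads `kubectl describe node` (or `kubectl get node --show-labels`) output
# on stdin and reports whether the managed node group label is present.
# has_nodegroup_label is a hypothetical helper, not part of kubectl.
has_nodegroup_label() {
  if grep -qF 'eks.amazonaws.com/nodegroup='; then
    echo "present"
  else
    echo "missing"
  fi
}

# Usage against a live cluster (node name taken from the output above):
#   kubectl describe node ip-192-168-1-128.us-west-1.compute.internal | has_nodegroup_label
```

Running it once with and once without the two bootstrap arguments would show whether the labels follow the same pattern as the taints.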

Any suggestions are appreciated.

amazon-web-services