I cannot add a CNI to the Kubernetes master node; the CNI plugin has no access to some files or folders. The logs of both Calico and Flannel say that certain files or folders are not accessible (in this post I only refer to Calico).
I see the same problem with kubectl, kubeadm and kubelet in versions v1.19.4 and v1.19.3. The Docker version is 19.03.13-ce, using overlay2 on an ext4 filesystem and systemd as the cgroup driver. Swap is disabled.
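For reference, these settings can be double-checked with something like the following (a rough sketch; the format strings assume a reasonably recent Docker CLI, and the expected values for this machine are shown as comments):
docker version --format '{{.Server.Version}}'            # 19.03.13
docker info --format '{{.Driver}} / {{.CgroupDriver}}'   # overlay2 / systemd
findmnt -n -o FSTYPE /                                   # ext4
swapon --show                                            # empty output = swap disabled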
The only thing I found on Stack Overflow that goes in this direction is: Kubernetes Cluster with Calico - Containers are not come up & failed with FailedCreatePodSandBox
In a first step I set up the cluster with kubeadm (using the CIDR for Calico):
# kubeadm init --apiserver-advertise-address=192.168.178.33 --pod-network-cidr=192.168.0.0/16
This works fine, and the kubelet log shows the message that a CNI is required. After that I apply the Calico CNI:
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
After waiting a while, the master node stays in the following state:
❯ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-5c6f6b67db-zdksz 0/1 ContainerCreating 0 7m47s
kube-system calico-node-sc42z 0/1 CrashLoopBackOff 5 7m47s
kube-system coredns-f9fd979d6-4zrcj 0/1 ContainerCreating 0 8m11s
kube-system coredns-f9fd979d6-wf9r2 0/1 ContainerCreating 0 8m11s
kube-system etcd-hs-0 1/1 Running 0 8m20s
kube-system kube-apiserver-hs-0 1/1 Running 0 8m20s
kube-system kube-controller-manager-hs-0 1/1 Running 0 8m20s
kube-system kube-proxy-t6ngd 1/1 Running 0 8m11s
kube-system kube-scheduler-hs-0 1/1 Running 0 8m20s
To me, the information I get from the following command:
kubectl describe pods calico-node-sc42z --namespace kube-system
is contradictory in itself: the calico-node pod has the volumes mounted, but the pod cannot access them (see Volumes and Events).
❯ kubectl describe pods calico-node-sc42z --namespace kube-system
Name: calico-node-sc42z
Namespace: kube-system
Priority: 2000001000
Priority Class Name: system-node-critical
Node: hs-0/192.168.178.48
Start Time: Sat, 14 Nov 2020 00:58:36 +0100
Labels: controller-revision-hash=5f678767
k8s-app=calico-node
pod-template-generation=1
Annotations: <none>
Status: Running
IP: 192.168.178.48
IPs:
IP: 192.168.178.48
Controlled By: DaemonSet/calico-node
Init Containers:
upgrade-ipam:
Container ID: docker://29c6cf8b73ecb98ee18169db0f6ffe8b141a8a6e10b2c839fc5bf05177f066ac
Image: calico/cni:v3.16.5
Image ID: docker-pullable://calico/cni@sha256:e05d0ee834c2004e8e7c4ee165a620166cd16e3cb8204a06eb52e5300b46650b
Port: <none>
Host Port: <none>
Command:
/opt/cni/bin/calico-ipam
-upgrade
State: Terminated
Reason: Completed
Exit Code: 0
Started: Sat, 14 Nov 2020 00:58:48 +0100
Finished: Sat, 14 Nov 2020 00:58:48 +0100
Ready: True
Restart Count: 0
Environment Variables from:
kubernetes-services-endpoint ConfigMap Optional: true
Environment:
KUBERNETES_NODE_NAME: (v1:spec.nodeName)
CALICO_NETWORKING_BACKEND: <set to the key 'calico_backend' of config map 'calico-config'> Optional: false
Mounts:
/host/opt/cni/bin from cni-bin-dir (rw)
/var/lib/cni/networks from host-local-net-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-tzhr4 (ro)
install-cni:
Container ID: docker://4435863e0d2f3ab4535aa6ca49ff95d889e71614861f3c7c0e4213d8c333f4db
Image: calico/cni:v3.16.5
Image ID: docker-pullable://calico/cni@sha256:e05d0ee834c2004e8e7c4ee165a620166cd16e3cb8204a06eb52e5300b46650b
Port: <none>
Host Port: <none>
Command:
/opt/cni/bin/install
State: Terminated
Reason: Completed
Exit Code: 0
Started: Sat, 14 Nov 2020 00:58:49 +0100
Finished: Sat, 14 Nov 2020 00:58:49 +0100
Ready: True
Restart Count: 0
Environment Variables from:
kubernetes-services-endpoint ConfigMap Optional: true
Environment:
CNI_CONF_NAME: 10-calico.conflist
CNI_NETWORK_CONFIG: <set to the key 'cni_network_config' of config map 'calico-config'> Optional: false
KUBERNETES_NODE_NAME: (v1:spec.nodeName)
CNI_MTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
SLEEP: false
Mounts:
/host/etc/cni/net.d from cni-net-dir (rw)
/host/opt/cni/bin from cni-bin-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-tzhr4 (ro)
flexvol-driver:
Container ID: docker://ca03f59013c1576a4a605a6d737af78ec3e859376aa11a301e56f0ffdacbc8db
Image: calico/pod2daemon-flexvol:v3.16.5
Image ID: docker-pullable://calico/pod2daemon-flexvol@sha256:7b20fd9cc36c7196dd24d56cc1e89ac573c634856ee020334b0b30cf5b8a3d3b
Port: <none>
Host Port: <none>
State: Terminated
Reason: Completed
Exit Code: 0
Started: Sat, 14 Nov 2020 00:58:56 +0100
Finished: Sat, 14 Nov 2020 00:58:56 +0100
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/host/driver from flexvol-driver-host (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-tzhr4 (ro)
Containers:
calico-node:
Container ID: docker://96bbc7f4adf1d5cb9a927aedc18e16da7b5ed4b0ff1290179a8dd4a51c115ab8
Image: calico/node:v3.16.5
Image ID: docker-pullable://calico/node@sha256:43c145b2bd837611d8d41e70631a8f2cc2b97b5ca9d895d66ffddd414dab83c5
Port: <none>
Host Port: <none>
State: Running
Started: Sat, 14 Nov 2020 01:04:51 +0100
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Sat, 14 Nov 2020 01:03:41 +0100
Finished: Sat, 14 Nov 2020 01:04:51 +0100
Ready: False
Restart Count: 5
Requests:
cpu: 250m
Liveness: exec [/bin/calico-node -felix-live -bird-live] delay=10s timeout=1s period=10s #success=1 #failure=6
Readiness: exec [/bin/calico-node -felix-ready -bird-ready] delay=0s timeout=1s period=10s #success=1 #failure=3
Environment Variables from:
kubernetes-services-endpoint ConfigMap Optional: true
Environment:
DATASTORE_TYPE: kubernetes
WAIT_FOR_DATASTORE: true
NODENAME: (v1:spec.nodeName)
CALICO_NETWORKING_BACKEND: <set to the key 'calico_backend' of config map 'calico-config'> Optional: false
CLUSTER_TYPE: k8s,bgp
IP: autodetect
CALICO_IPV4POOL_IPIP: Always
CALICO_IPV4POOL_VXLAN: Never
FELIX_IPINIPMTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
FELIX_VXLANMTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
FELIX_WIREGUARDMTU: <set to the key 'veth_mtu' of config map 'calico-config'> Optional: false
CALICO_DISABLE_FILE_LOGGING: true
FELIX_DEFAULTENDPOINTTOHOSTACTION: ACCEPT
FELIX_IPV6SUPPORT: false
FELIX_LOGSEVERITYSCREEN: info
FELIX_HEALTHENABLED: true
Mounts:
/lib/modules from lib-modules (ro)
/run/xtables.lock from xtables-lock (rw)
/sys/fs/ from sysfs (rw)
/var/lib/calico from var-lib-calico (rw)
/var/run/calico from var-run-calico (rw)
/var/run/nodeagent from policysync (rw)
/var/run/secrets/kubernetes.io/serviceaccount from calico-node-token-tzhr4 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
lib-modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType:
var-run-calico:
Type: HostPath (bare host directory volume)
Path: /var/run/calico
HostPathType:
var-lib-calico:
Type: HostPath (bare host directory volume)
Path: /var/lib/calico
HostPathType:
xtables-lock:
Type: HostPath (bare host directory volume)
Path: /run/xtables.lock
HostPathType: FileOrCreate
sysfs:
Type: HostPath (bare host directory volume)
Path: /sys/fs/
HostPathType: DirectoryOrCreate
cni-bin-dir:
Type: HostPath (bare host directory volume)
Path: /opt/cni/bin
HostPathType:
cni-net-dir:
Type: HostPath (bare host directory volume)
Path: /etc/cni/net.d
HostPathType:
host-local-net-dir:
Type: HostPath (bare host directory volume)
Path: /var/lib/cni/networks
HostPathType:
policysync:
Type: HostPath (bare host directory volume)
Path: /var/run/nodeagent
HostPathType: DirectoryOrCreate
flexvol-driver-host:
Type: HostPath (bare host directory volume)
Path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
HostPathType: DirectoryOrCreate
calico-node-token-tzhr4:
Type: Secret (a volume populated by a Secret)
SecretName: calico-node-token-tzhr4
Optional: false
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: :NoSchedule op=Exists
:NoExecute op=Exists
CriticalAddonsOnly op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 6m52s default-scheduler Successfully assigned kube-system/calico-node-sc42z to hs-0
Normal Pulling 6m51s kubelet Pulling image "calico/cni:v3.16.5"
Normal Pulled 6m40s kubelet Successfully pulled image "calico/cni:v3.16.5" in 10.618669742s
Normal Started 6m40s kubelet Started container upgrade-ipam
Normal Created 6m40s kubelet Created container upgrade-ipam
Normal Created 6m39s kubelet Created container install-cni
Normal Pulled 6m39s kubelet Container image "calico/cni:v3.16.5" already present on machine
Normal Started 6m39s kubelet Started container install-cni
Normal Pulling 6m38s kubelet Pulling image "calico/pod2daemon-flexvol:v3.16.5"
Normal Started 6m32s kubelet Started container flexvol-driver
Normal Created 6m32s kubelet Created container flexvol-driver
Normal Pulled 6m32s kubelet Successfully pulled image "calico/pod2daemon-flexvol:v3.16.5" in 6.076268177s
Normal Pulling 6m31s kubelet Pulling image "calico/node:v3.16.5"
Normal Pulled 6m19s kubelet Successfully pulled image "calico/node:v3.16.5" in 12.051211859s
Normal Created 6m19s kubelet Created container calico-node
Normal Started 6m19s kubelet Started container calico-node
Warning Unhealthy 5m32s (x5 over 6m12s) kubelet Readiness probe failed: calico/node is not ready: BIRD is not ready: Failed to stat() nodename file: stat /var/lib/calico/nodename: no such file or directory
Warning Unhealthy 109s (x23 over 6m9s) kubelet Liveness probe failed: calico/node is not ready: bird/confd is not live: exit status 1
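The last two events point at /var/lib/calico/nodename missing on the host. One way to narrow down the file-access angle is to look at the hostPath directories from the Volumes section directly on the node (a rough sketch of what to check):
# On the node hs-0, inspect the hostPath directories the pod mounts:
ls -ld /var/lib/calico /var/run/calico /opt/cni/bin /etc/cni/net.d
ls -l /var/lib/calico/nodename   # the file the readiness probe stat()s
ls -l /etc/cni/net.d             # install-cni should have written 10-calico.conflist here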
In addition I have the logs of calico-node, but I do not see how to benefit from this extra information: unfortunately I do not know whether the datastore refers to the file system, which would mean it is the error I already know about, or whether it is something additional.
❯ kubectl logs calico-node-sc42z -n kube-system -f
2020-11-14 01:42:55.536 [INFO][8] startup/startup.go 376: Early log level set to info
2020-11-14 01:42:55.536 [INFO][8] startup/startup.go 392: Using NODENAME environment for node name
2020-11-14 01:42:55.536 [INFO][8] startup/startup.go 404: Determined node name: hs-0
2020-11-14 01:42:55.539 [INFO][8] startup/startup.go 436: Checking datastore connection
2020-11-14 01:43:25.539 [INFO][8] startup/startup.go 451: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: i/o timeout
2020-11-14 01:43:56.540 [INFO][8] startup/startup.go 451: Hit error connecting to datastore - retry error=Get "https://10.96.0.1:443/api/v1/nodes/foo": dial tcp 10.96.0.1:443: i/o timeout
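The error itself is about reaching 10.96.0.1:443 (presumably the kubernetes Service ClusterIP of the default kubeadm service CIDR) rather than about a file, so one thing worth checking from the node is whether that address is reachable at all (a sketch, assuming curl is available on the node):
curl -k https://10.96.0.1:443/version     # an i/o timeout here means the Service IP is unreachable from the node
kubectl get endpoints kubernetes          # shows the real API server address behind that ClusterIP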
Maybe someone can give me a hint on how to solve this problem or where to read up on the topic. Regards, Kokos Bot.
This could be because Calico's default pod CIDR conflicts with the host CIDR. I got that impression from your
--apiserver-advertise-address=192.168.178.33
If that is the case, it is worth trying kubeadm init with a different pod CIDR, for example --pod-network-cidr=20.96.0.0/12.
To do a fresh install again, it is best to run
kubeadm reset
once before the change above. Before executing it, please be aware of what the kubeadm reset command affects (read here).
Reference - https://stackoverflow.com/questions/60742165/kubernetes-calico-replicaset
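As a sketch of the suggested sequence (the CIDR value is just the example from above; Calico's own IP pool may also need to match it):
# Tear down the current control plane - this wipes the existing cluster state:
sudo kubeadm reset
# Re-initialize with a pod CIDR that does not overlap the 192.168.178.0/24 host network:
sudo kubeadm init --apiserver-advertise-address=192.168.178.33 --pod-network-cidr=20.96.0.0/12
# Re-apply Calico; CALICO_IPV4POOL_CIDR in calico.yaml may need to be set to the same CIDR:
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml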