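In a k3s cluster (with multiple control-plane nodes) that has Rancher Longhorn installed, I see the following warning for pods whose PVC uses the default longhorn storage class (output of kubectl get events, wrapped for readability):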
LAST SEEN TYPE REASON OBJECT
113s Warning FailedMount pod/grafana-6756f6587b-rv2xj
MESSAGE
MountVolume.SetUp failed for volume "pvc-7b6d12e3-132d-4af1-99c0-920ac5af0687" :
rpc error:
code = Aborted
desc = no Pending workload pods for volume pvc-7b6d12e3-132d-4af1-99c0-920ac5af0687
to be mounted: map[Running:[grafana-6756f6587b-rv2xj]]
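What does this recurring warning mean, and what needs to be done to resolve it?
As far as I can tell from inside the pod's container, the volume is mounted. I have already tried restarting and recreating the pod, but the warning persists. The Longhorn dashboard does not show any problems either.
See the PVC and Deployment resource definitions below: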
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
  labels:
    app: grafana
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: grafana-pvc
This is not expected behavior. It is a bug in recent versions of Longhorn, triggered by a kubelet restart (in our case, a rolling deployment of RKE2). A fix is in progress: https://github.com/longhorn/longhorn/issues/8072
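If you want to see what the message is reporting, here is a rough sketch that groups the pods using a given claim by their phase. It mirrors the map[Running:[...]] part of the event, which I read as "pods that use this volume, grouped by phase, with none of them Pending". That reading is my assumption from the message text, not something I checked in the Longhorn source. The function name pods_by_phase_for_claim is hypothetical, and the input is assumed to be the parsed JSON from kubectl get pods -n monitoring -o json.

```python
from collections import defaultdict


def pods_by_phase_for_claim(pod_list, claim_name):
    """Group the names of pods that mount `claim_name` by status.phase.

    `pod_list` is assumed to be the parsed output of
    `kubectl get pods -n <namespace> -o json` (a dict with an "items" list).
    """
    grouped = defaultdict(list)
    for pod in pod_list.get("items", []):
        volumes = pod.get("spec", {}).get("volumes", [])
        # A pod "uses" the claim if any of its volumes references it by name.
        uses_claim = any(
            (v.get("persistentVolumeClaim") or {}).get("claimName") == claim_name
            for v in volumes
        )
        if uses_claim:
            phase = pod.get("status", {}).get("phase", "Unknown")
            grouped[phase].append(pod["metadata"]["name"])
    return dict(grouped)


# Hand-written sample shaped like the situation in the question.
sample = {
    "items": [
        {
            "metadata": {"name": "grafana-6756f6587b-rv2xj"},
            "spec": {
                "volumes": [
                    {
                        "name": "grafana-pv",
                        "persistentVolumeClaim": {"claimName": "grafana-pvc"},
                    }
                ]
            },
            "status": {"phase": "Running"},
        }
    ]
}
print(pods_by_phase_for_claim(sample, "grafana-pvc"))
# prints {'Running': ['grafana-6756f6587b-rv2xj']}
```

If the only pod listed is already Running and the volume is mounted inside the container, as in the question, that matches the harmless-but-noisy behavior described in the linked issue.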