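I have a small k3s cluster at home that hosts a few websites and local applications. For the most part I've been able to use it to host all sorts of services, but the LetsEncrypt side has never worked well for me. When it does work, I don't know why, and I'm afraid to touch it in case it breaks again.
Right now two sites in the cluster have TLS and have been running fine, for years even. Now I'm adding a third site and I'm hitting the same error I've hit in the past, and I still don't know the cause. I'm hoping someone here can explain what I'm missing.
The error seems to be that when the magic URL is requested (how do I determine which URL is being tried?), the checker gets HTML back instead of the expected response, though it's not clear why that happens, or even which service is building that response.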
I0820 17:28:23.398812 1 service.go:43] cert-manager/challenges/http01/selfCheck/http01/ensureService "msg"="found one existing HTTP01 solver Service for challenge resource" "dnsName"="REDACTED.tld" "related_resource_kind"="Service" "related_resource_name"="cm-acme-http-solver-9qp5z" "related_resource_namespace"="my-namespace" "related_resource_version"="v1" "resource_kind"="Challenge" "resource_name"="tls-REDACTED" "resource_namespace"="my-namespace" "resource_version"="v1" "type"="HTTP-01"
E0820 17:28:23.585197 1 sync.go:190] cert-manager/challenges "msg"="propagation check failed" "error"="did not get expected response when querying endpoint, expected \"REDACTED.REDACTED\" but got: <!DOCTYPE html PUBLIC \"-... (truncated)" "dnsName"="REDACTED.tld" "resource_kind"="Challenge" "resource_name"="tls-REDACTED" "resource_namespace"="my-namespace" "resource_version"="v1" "type"="HTTP-01"
I0820 17:28:33.585741 1 pod.go:59] cert-manager/challenges/http01/selfCheck/http01/ensurePod "msg"="found one existing HTTP01 solver pod" "dnsName"="REDACTED.tld" "related_resource_kind"="Pod" "related_resource_name"="cm-acme-http-solver-5gnhm" "related_resource_namespace"="my-namespace" "related_resource_version"="v1" "resource_kind"="Challenge" "resource_name"="tls-REDACTED" "resource_namespace"="my-namespace" "resource_version"="v1" "type"="HTTP-01"
I0820 17:28:33.585898 1 service.go:43] cert-manager/challenges/http01/selfCheck/http01/ensureService "msg"="found one existing HTTP01 solver Service for challenge resource" "dnsName"="REDACTED.tld" "related_resource_kind"="Service" "related_resource_name"="cm-acme-http-solver-9qp5z" "related_resource_namespace"="my-namespace" "related_resource_version"="v1" "resource_kind"="Challenge" "resource_name"="tls-REDACTED" "resource_namespace"="my-namespace" "resource_version"="v1" "type"="HTTP-01"
E0820 17:28:33.600402 1 sync.go:190] cert-manager/challenges "msg"="propagation check failed" "error"="did not get expected response when querying endpoint, expected \"REDACTED.REDACTED\" but got: <!DOCTYPE html PUBLIC \"-... (truncated)" "dnsName"="REDACTED.tld" "resource_kind"="Challenge" "resource_name"="tls-REDACTED" "resource_namespace"="my-namespace" "resource_version"="v1" "type"="HTTP-01"
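On the question of which URL is being tried: ACME HTTP-01 challenges are always served under the fixed path `/.well-known/acme-challenge/<token>` on plain HTTP (RFC 8555, section 8.3), and the expected body is the key authorization, which matches the `token.thumbprint` shape of the `REDACTED.REDACTED` value in the log. Below is a minimal Python sketch that reproduces roughly what the propagation check does. It is not cert-manager's own code; the function names are made up for illustration, and it assumes the token and key can be read from the Challenge resource (for example with `kubectl describe challenge -n my-namespace`).

```python
from urllib.request import urlopen

# Fixed path prefix for HTTP-01 challenges (RFC 8555, section 8.3).
ACME_PATH = "/.well-known/acme-challenge/"


def challenge_url(domain: str, token: str) -> str:
    """Build the URL that an HTTP-01 check requests for this domain and token."""
    return f"http://{domain}{ACME_PATH}{token}"


def classify_response(body: str, expected: str) -> str:
    """Return 'ok', 'html' or 'mismatch' for a response body.

    'html' corresponds to the failure in the log above, where some other
    server answered with a web page instead of the key authorization.
    """
    text = body.strip()
    if text == expected:
        return "ok"
    if text.lower().startswith(("<!doctype", "<html")):
        return "html"
    return "mismatch"


def self_check(domain: str, token: str, expected: str, timeout: float = 10.0) -> str:
    """Fetch the challenge URL and classify the body.

    Note: urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    with urlopen(challenge_url(domain, token), timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return classify_response(body, expected)
```

Running `self_check` both from a machine outside the network and from inside the cluster should show whether the two vantage points reach the same server. A result of `ok` outside and `html` inside points at name resolution or routing rather than at the solver pod.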
The whole application consists of a Python web app, a Python worker, Redis, and Nginx, which adds up to a lot of YAML. I've included the parts I think are relevant to this case:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: my-namespace
spec:
  replicas: 1
  revisionHistoryLimit: 0
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:alpine
          imagePullPolicy: IfNotPresent
          resources:
            requests:
              memory: 32Mi
              cpu: 125m
            limits:
              memory: 32Mi
              cpu: 125m
          ports:
            - containerPort: 80
          volumeMounts:
            - name: public
              mountPath: "/usr/share/nginx/html"
      volumes:
        - name: public
          persistentVolumeClaim:
            claimName: public
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
  namespace: my-namespace
  annotations:
    kubernetes.io/ingress.class: traefik
    cert-manager.io/cluster-issuer: letsencrypt
    acme.cert-manager.io/http01-edit-in-place: 'true'
spec:
  rules:
    - host: REDACTED.tld
      http:
        paths:
          - path: /static
            pathType: Prefix
            backend:
              service:
                name: nginx
                port:
                  number: 8000
          - path: /media
            pathType: Prefix
            backend:
              service:
                name: nginx
                port:
                  number: 8000
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web
                port:
                  number: 8000
  tls:
    - secretName: tls
      hosts:
        - REDACTED.tld
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: my-namespace
spec:
  selector:
    app: nginx
  ports:
    - port: 8000
      targetPort: 80
      name: tcp
---
apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: my-namespace
spec:
  selector:
    app: web
  ports:
    - port: 8000
      targetPort: 8000
      name: gunicorn
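I'm very comfortable with "bare metal" Linux administration, but Kubernetes still feels too "magic" to me. The above was built by following the K3s Rocks tutorial and, as mentioned, it somehow works for the other two domains, although they didn't work at first either and then somehow started working later.
I'd be very grateful for any advice on this problem, and for any suggestions on how the above could or should be done better.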
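Thanks to @DavidW's comment, I was able to find the problem. CertManager did indeed create an ingress pointing to the following:
When I requested that URL from a remote network, it responded with the appropriate value. So that part was working, but the process that tests the response before the certificate request is sent to LetsEncrypt was still failing.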
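As it happens, because this service runs in my house, I have a local name server that allows local access to the remote services. That way, when I query my-domain.tld from my home office, I get 192.168.x.x instead of the public IP. This is my workaround for my router not being able to handle requests from inside the LAN to its own external IP. However, that name server had no record for my-domain.tld, so while requests for my-domain.tld from outside my house correctly resolved the test URL, CertManager's requests were probably getting some kind of default page that my router returns when its external IP is accessed from the internal network. In my case, the fix was to add a CNAME record for this new domain to the local DNS, pointing at the k8s master node.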
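A quick way to spot this kind of split-DNS gap is to check what a hostname resolves to from the machine (or pod) doing the check, and whether the result is a LAN address or a public one. The following is a small Python sketch of that idea using only the standard library; the function names are hypothetical, and it assumes it is run from the same network vantage point as the failing check, since the answer depends entirely on which resolver is consulted.

```python
import ipaddress
import socket


def is_lan_address(ip: str) -> bool:
    """True for private-range addresses such as 192.168.x.x or 10.x.x.x."""
    return ipaddress.ip_address(ip).is_private


def resolve_all(hostname: str) -> list[str]:
    """Return the distinct addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, 80, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def describe_resolution(hostname: str) -> dict[str, str]:
    """Map each resolved address to 'LAN' or 'public'."""
    return {
        ip: ("LAN" if is_lan_address(ip) else "public")
        for ip in resolve_all(hostname)
    }
```

With the setup described above, a domain that has a local record would be reported as `LAN`, while a newly added domain with no local record would come back as `public`, which is the case that ends up at the router's own page.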