无法通过 Docker 在本地运行 Hyperkube (kubernetes)

Question

laimison

Asked: 2020-02-10 16:19:11 +0800 CST2020-02-10 16:19:11 +0800 CST 2020-02-10 16:19:11 +0800 CST

调试 Kubernetes 中的 DNS 解析问题

772

我在 Ubuntu 18.04 上使用 Kubespray 构建了一个 Kubernetes 集群，并且面临 DNS 问题，因此基本上容器无法通过它们的主机名进行通信。

有效的事情：

通过 IP 地址进行容器通信
互联网正在从容器中工作
能够解决kubernetes.default

Kubernetes 主控：

root@k8s-1:~# cat /etc/resolv.conf | grep -v ^\\#
nameserver 127.0.0.53
search home
root@k8s-1:~#

荚：

root@k8s-1:~# kubectl exec dnsutils cat /etc/resolv.conf
nameserver 169.254.25.10
search default.svc.cluster.local svc.cluster.local cluster.local home
options ndots:5
root@k8s-1:~#

CoreDNS pod 是健康的：

root@k8s-1:~# kubectl get pods --namespace=kube-system -l k8s-app=kube-dns        
NAME                       READY   STATUS    RESTARTS   AGE
coredns-58687784f9-8rmlw   1/1     Running   0          35m
coredns-58687784f9-hp8hp   1/1     Running   0          35m
root@k8s-1:~#

CoreDNS pod 的日志：

root@k8s-1:~# kubectl describe pods --namespace=kube-system -l k8s-app=kube-dns | tail -n 2
  Normal   Started           35m                 kubelet, k8s-2     Started container coredns
  Warning  DNSConfigForming  12s (x33 over 35m)  kubelet, k8s-2     Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 4.2.2.1 4.2.2.2 208.67.220.220

root@k8s-1:~# kubectl logs --namespace=kube-system coredns-58687784f9-8rmlw
.:53
2020-02-09T22:56:14.390Z [INFO] plugin/reload: Running configuration MD5 = b9d55fc86b311e1d1a0507440727efd2
2020-02-09T22:56:14.391Z [INFO] CoreDNS-1.6.0
2020-02-09T22:56:14.391Z [INFO] linux/amd64, go1.12.7, 0a218d3
CoreDNS-1.6.0
linux/amd64, go1.12.7, 0a218d3
root@k8s-1:~#

root@k8s-1:~# kubectl logs --namespace=kube-system coredns-58687784f9-hp8hp
.:53
2020-02-09T22:56:20.388Z [INFO] plugin/reload: Running configuration MD5 = b9d55fc86b311e1d1a0507440727efd2
2020-02-09T22:56:20.388Z [INFO] CoreDNS-1.6.0
2020-02-09T22:56:20.388Z [INFO] linux/amd64, go1.12.7, 0a218d3
CoreDNS-1.6.0
linux/amd64, go1.12.7, 0a218d3
root@k8s-1:~#

CoreDNS 似乎暴露了：

root@k8s-1:~# kubectl get svc --namespace=kube-system | grep coredns
coredns                ClusterIP   10.233.0.3      <none>        53/UDP,53/TCP,9153/TCP   37m
root@k8s-1:~#

root@k8s-1:~# kubectl get ep coredns --namespace=kube-system
NAME      ENDPOINTS                                                  AGE
coredns   10.233.64.2:53,10.233.65.3:53,10.233.64.2:53 + 3 more...   37m
root@k8s-1:~#

这些是我有问题的 pod - 所有集群都因为这个问题而受到影响：

root@k8s-1:~# kubectl get pods -o wide -n default
NAME                     READY   STATUS    RESTARTS   AGE   IP            NODE    NOMINATED NODE   READINESS GATES
busybox                  1/1     Running   0          17m   10.233.66.7   k8s-3   <none>           <none>
dnsutils                 1/1     Running   0          50m   10.233.66.5   k8s-3   <none>           <none>
nginx-86c57db685-p8zhc   1/1     Running   0          43m   10.233.64.3   k8s-1   <none>           <none>
nginx-86c57db685-st7rw   1/1     Running   0          47m   10.233.66.6   k8s-3   <none>           <none>
root@k8s-1:~#

能够通过 IP 地址使用 DNS 和容器访问互联网：

root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping 10.233.64.3"
PING 10.233.64.3 (10.233.64.3) 56(84) bytes of data.
64 bytes from 10.233.64.3: icmp_seq=1 ttl=62 time=0.481 ms
64 bytes from 10.233.64.3: icmp_seq=2 ttl=62 time=0.551 ms
...

root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping google.com"
PING google.com (172.217.21.174) 56(84) bytes of data.
64 bytes from fra07s64-in-f174.1e100.net (172.217.21.174): icmp_seq=1 ttl=61 time=77.9 ms
...

root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping kubernetes.default"
PING kubernetes.default.svc.cluster.local (10.233.0.1) 56(84) bytes of data.
64 bytes from kubernetes.default.svc.cluster.local (10.233.0.1): icmp_seq=1 ttl=64 time=0.030 ms
64 bytes from kubernetes.default.svc.cluster.local (10.233.0.1): icmp_seq=2 ttl=64 time=0.069 ms
...

实际问题：

root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping nginx-86c57db685-p8zhc"
ping: nginx-86c57db685-p8zhc: Name or service not known
command terminated with exit code 2
root@k8s-1:~#

root@k8s-1:~# kubectl exec -it nginx-86c57db685-st7rw -- sh -c "ping dnsutils"
ping: dnsutils: Name or service not known
command terminated with exit code 2
root@k8s-1:~#

oot@k8s-1:~# kubectl exec -ti busybox -- nslookup nginx-86c57db685-p8zhc
Server:     169.254.25.10
Address:    169.254.25.10:53

** server can't find nginx-86c57db685-p8zhc.default.svc.cluster.local: NXDOMAIN

*** Can't find nginx-86c57db685-p8zhc.svc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.home: No answer
*** Can't find nginx-86c57db685-p8zhc.default.svc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.svc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.cluster.local: No answer
*** Can't find nginx-86c57db685-p8zhc.home: No answer

command terminated with exit code 1
root@k8s-1:~#

我是否遗漏了什么或如何使用主机名修复容器之间的通信？

非常感谢

更新

更多检查：

root@k8s-1:~# kubectl exec -ti dnsutils -- nslookup kubernetes.default
Server:     169.254.25.10
Address:    169.254.25.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.233.0.1

我创建了 StatefulSet：

kubectl apply -f https://raw.githubusercontent.com/kubernetes/website/master/content/en/examples/application/web/web.yaml

我可以 ping 服务“nginx”：

root@k8s-1:~/kplay# k exec dnsutils -it nslookup nginx
Server:     169.254.25.10
Address:    169.254.25.10#53

Name:   nginx.default.svc.cluster.local
Address: 10.233.66.8
Name:   nginx.default.svc.cluster.local
Address: 10.233.64.3
Name:   nginx.default.svc.cluster.local
Address: 10.233.65.5
Name:   nginx.default.svc.cluster.local
Address: 10.233.66.6

使用 FQDN 时还可以联系 statefulset 成员

root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-0.nginx.default.svc.cluster.local
Server:     169.254.25.10
Address:    169.254.25.10#53

Name:   web-0.nginx.default.svc.cluster.local
Address: 10.233.65.5

root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-1.nginx.default.svc.cluster.local
Server:     169.254.25.10
Address:    169.254.25.10#53

Name:   web-1.nginx.default.svc.cluster.local
Address: 10.233.66.8

但不只使用主机名：

root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-0
Server:     169.254.25.10
Address:    169.254.25.10#53

** server can't find web-0: NXDOMAIN

command terminated with exit code 1
root@k8s-1:~/kplay# k exec dnsutils -it nslookup web-1
Server:     169.254.25.10
Address:    169.254.25.10#53

** server can't find web-1: NXDOMAIN

command terminated with exit code 1
root@k8s-1:~/kplay#

它们都生活在同一个命名空间中：

root@k8s-1:~/kplay# k get pods -n default
NAME                     READY   STATUS    RESTARTS   AGE
busybox                  1/1     Running   22         22h
dnsutils                 1/1     Running   22         22h
nginx-86c57db685-p8zhc   1/1     Running   0          22h
nginx-86c57db685-st7rw   1/1     Running   0          22h
web-0                    1/1     Running   0          11m
web-1                    1/1     Running   0          10m

另一个确认我能够 ping 服务的测试：

kubectl create deployment --image nginx some-nginx
kubectl scale deployment --replicas 2 some-nginx
kubectl expose deployment some-nginx --port=12345 --type=NodePort

root@k8s-1:~/kplay# k exec dnsutils -it nslookup some-nginx
Server:     169.254.25.10
Address:    169.254.25.10#53

Name:   some-nginx.default.svc.cluster.local
Address: 10.233.63.137

最后的想法

有趣的事实，但也许这就是 Kubernetes 应该如何工作的？如果想单独访问某个 pod，我可以访问服务主机名和 statefulset 成员。至少在我的 k8s 使用中（可能适用于每个人），如果不是 statefulset，则到达单个 pod 似乎不是很重要。

1 个回答

Voted

Mark Watney · Answer 1 · 2020-02-12T01:00:37+08:00

我建议您遵循这一点，以便我们可以隔离您的 CoreDNS 中可能存在的问题，并且您可以看到它工作正常。

至少在我的 k8s 使用中（可能适用于每个人），如果不是 statefulset，则到达单个 pod 似乎不是很重要。

可以使用 DNS 记录访问 pod，但正如您所说，这在常规 K8s 实现中并不是很重要。

启用后，Pod 会被分配一个 DNS A 记录，格式为 pod-ip-address.my-namespace.pod.cluster.local.

例如， 1.2.3.4 在命名空间中具有 IPdefault 且 DNS 名称为的 Podcluster.local 将具有一个条目： 1-2-3-4.default.pod.cluster.local。资源

例子

$ kubectl get pods -o wide
NAME         READY   STATUS    RESTARTS   AGE     IP          NODE                                 NOMINATED NODE   READINESS GATES
dnsutils     1/1     Running   20         20h     10.28.2.3   gke-lab-default-pool-87c6b085-wcp8   <none>           <none>
sample-pod   1/1     Running   0          2m11s   10.28.2.4   gke-lab-default-pool-87c6b085-wcp8   <none>           <none>

$ kubectl exec -ti dnsutils -- nslookup 10-28-2-4.default.pod.cluster.local
Server:     10.31.240.10
Address:    10.31.240.10#53

Name:   10-28-2-4.default.pod.cluster.local
Address: 10.28.2.4

有趣的事实，但也许这就是 Kubernetes 应该如何工作的？

是的，您的 CoreDNS 正在按预期工作，并且您描述的一切都是预期的。

调试 Kubernetes 中的 DNS 解析问题

新安装后 postgres 的默认超级用户用户名/密码是什么？

SFTP 使用什么端口？

命令行列出 Windows Active Directory 组中的用户？

什么是 Pem 文件，它与其他 OpenSSL 生成的密钥文件格式有何不同？

如何确定bash变量是否为空？

调试 Kubernetes 中的 DNS 解析问题

1 个回答

相关问题