Repeated "proxyconnect tcp: dial tcp 127.0.0.1:1082: connect: connection refused" while setting up a Ray cluster: where is the proxy configured for K8s?
I am following the Ray Cluster quickstart instructions:
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
# Install both CRDs and KubeRay operator v1.1.1.
helm install kuberay-operator kuberay/kuberay-operator --version 1.1.1
# Confirm that the operator is running in the namespace `default`.
kubectl get pods
# NAME READY STATUS RESTARTS AGE
# kuberay-operator-7fbdbf8c89-pt8bk 1/1 Running 0 27s
At step 2, the pod went into the ErrImagePull status. Here is the actual output:
(base) ➜ ~ helm install kuberay-operator kuberay/kuberay-operator --version 1.0.0
NAME: kuberay-operator
LAST DEPLOYED: Fri Jul 26 08:56:30 2024
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
(base) ➜ ~ kubectl get pods
NAME READY STATUS RESTARTS AGE
kuberay-operator-5d64d88fdb-shrkv 0/1 ErrImagePull 0 10s
(base) ➜ ~ kubectl describe pod kuberay-operator-5d64d88fdb-shrkv
Name: kuberay-operator-5d64d88fdb-shrkv
Namespace: default
Priority: 0
Service Account: kuberay-operator
Node: kind-control-plane/172.23.0.2
Start Time: Fri, 26 Jul 2024 08:56:31 +0800
Labels: app.kubernetes.io/component=kuberay-operator
app.kubernetes.io/instance=kuberay-operator
app.kubernetes.io/name=kuberay-operator
pod-template-hash=5d64d88fdb
.....
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 22s default-scheduler Successfully assigned default/kuberay-operator-5d64d88fdb-shrkv to kind-control-plane
Normal BackOff 21s kubelet Back-off pulling image "kuberay/operator:v1.0.0"
Warning Failed 21s kubelet Error: ImagePullBackOff
Normal Pulling 6s (x2 over 21s) kubelet Pulling image "kuberay/operator:v1.0.0"
Warning Failed 6s (x2 over 21s) kubelet Failed to pull image "kuberay/operator:v1.0.0": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/kuberay/operator:v1.0.0"
: failed to resolve reference "docker.io/kuberay/operator:v1.0.0": failed to do request: Head "https://registry-1.docker.io/v2/kuberay/operator/manifests/v1.0.0": proxyconnect tcp: dial tcp 127.0.0.1:1082: connect: connection refused
Warning Failed 6s (x2 over 21s) kubelet Error: ErrImagePull
The puzzling part is this message: proxyconnect tcp: dial tcp 127.0.0.1:1082: connect: connection refused
I tried the following, but could not find any proxy configuration:
(base) ➜ ~ echo $HTTP_PROXY
(base) ➜ ~ echo $HTTPS_PROXY
(base) ➜ ~ cat /etc/environment
PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
(base) ➜ ~ cat /etc/docker/daemon.json
{
"registry-mirrors": [
"https://5wxalzzb.mirror.aliyuncs.com",
"https://hub-mirror.c.163.com",
"https://mirror.iscas.ac.cn",
"https://docker.m.daocloud.io"
],
"runtimes": {
"nvidia": {
"args": [],
"path": "nvidia-container-runtime"
}
}
}
To work around the pod problem, I started an HTTP proxy on local port 1082, with no AuthZ or AuthN, and then reinstalled kuberay/operator, but the events showed the same proxy error message.
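This is probably caused by a misconfigured proxy, or by no proxy server actually listening on port 1082 in the place where the pull happens. Note that the node is kind-control-plane, so the image is most likely pulled by the container runtime inside the kind node container, where 127.0.0.1 is that container's own loopback address rather than the host's. That would explain why starting a proxy on the host made no difference.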
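To confirm whether anything is listening on the proxy port, a quick check like the following may help. It is only a minimal sketch: it assumes bash, because it relies on bash's /dev/tcp redirection, and port_open is a hypothetical helper name, not part of any tool mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a TCP connection to host:port can be opened.
# Relies on bash's /dev/tcp pseudo-device; other shells will report failure.
port_open() {
    # The subshell opens the socket on fd 3 and closes it again on exit.
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# 127.0.0.1:1082 is the proxy address taken from the error message.
if port_open 127.0.0.1 1082; then
    echo "something is listening on 127.0.0.1:1082"
else
    echo "nothing is listening on 127.0.0.1:1082"
fi
```

The result only describes the machine or container where the script runs, so it is worth running it in the same place as the failing pull, not just on the host.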
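Notes:
Temporarily turn off every proxy in your environment to see whether one of them is causing the problem.
Check the k8s logs for error messages related to network connectivity or proxy problems.
If you cannot resolve the proxy issue, try pulling the image from a different registry to rule out a registry-specific problem.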
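For the first note, it can help to list every place a proxy might be set. The checks in the question only cover the upper-case variables, /etc/environment and /etc/docker/daemon.json. The sketch below also looks at lower-case variables and a few other files. The helper names list_proxy_vars and files_mentioning_proxy are hypothetical, and the file list is an assumption about where Docker and containerd proxy settings commonly live, so adjust it to your system.

```shell
#!/bin/sh
# Print proxy variables from the current environment, upper- or lower-case.
list_proxy_vars() {
    env | grep -i -E '^(http|https|all|no)_proxy=' || true
}

# Print each readable regular file among the arguments that mentions "proxy".
files_mentioning_proxy() {
    for f in "$@"; do
        if [ -f "$f" ] && [ -r "$f" ] && grep -q -i 'proxy' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}

list_proxy_vars
# Candidate locations (assumed, not exhaustive); missing files are skipped.
files_mentioning_proxy \
    "$HOME/.docker/config.json" \
    /etc/docker/daemon.json \
    /etc/systemd/system/docker.service.d/*.conf \
    /etc/systemd/system/containerd.service.d/*.conf
```

If the host turns up nothing, the same variable check may be worth repeating inside the node, for example with docker exec kind-control-plane env, assuming the kind node runs as a Docker container named after the node shown in the output above.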