Questions tagged [coreos] (server)

Olaf Mandel
Asked: 2020-06-09 06:23:34 +0800 CST

Causes of EXT4 filesystem corruption in a Hyper-V guest

  • 3

We have now had the second corruption of an ext4 partition within a relatively short time, and ext4 is supposed to be very reliable. Since this is a virtual machine, and the host providing the resources sees no disk errors, power losses, or the like, I would like to rule out hardware faults for the time being.

So I am wondering whether we have an unusual enough setup (a CoreOS guest under a Hyper-V host), an unusual enough workload (Docker containers for Nginx, GitLab, Redmine, MediaWiki, MariaDB), or simply a misconfiguration. Any comments/suggestions are welcome.

The original error message (in the second case) was:

Jun 05 02:00:50 localhost kernel: EXT4-fs error (device sda9): ext4_lookup:1595: inode #8347255: comm git: deleted inode referenced: 106338109
Jun 05 02:00:50 localhost kernel: Aborting journal on device sda9-8.
Jun 05 02:00:50 localhost kernel: EXT4-fs (sda9): Remounting filesystem read-only

At that point an e2fsck run found a lot of errors (I did not think to keep a log of it) and put roughly 357 MB into lost+found, on a 2 TB partition holding about 512 GB of data. The OS still boots afterwards, so the lost pieces appear to be in user data or in the Docker containers.
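
For reference, a minimal sketch of the kind of offline check described above, assuming the filesystem is unmounted (e.g. from a rescue system) and the device name matches:

$ sudo e2fsck -fv /dev/sda9 2>&1 | tee e2fsck.log   # force a full check, and keep a log this time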

Here are some more details about the affected system:

$ uname -srm
Linux 4.19.123-coreos x86_64
$ sudo tune2fs -l /dev/sda9
tune2fs 1.45.5 (07-Jan-2020)
Filesystem volume name:   ROOT
Last mounted on:          /sysroot
Filesystem UUID:          04ab23af-a14f-48c8-af59-6ca97b3263bc
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent 64bit flex_bg inline_data sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    user_xattr acl
Filesystem state:         clean
Errors behavior:          Remount read-only
Filesystem OS type:       Linux
Inode count:              533138816
Block count:              536263675
Reserved block count:     21455406
Free blocks:              391577109
Free inodes:              532851311
First block:              0
Block size:               4096
Fragment size:            4096
Group descriptor size:    64
Reserved GDT blocks:      15
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         32576
Inode blocks per group:   1018
Flex block group size:    16
Filesystem created:       Tue Sep 11 00:02:46 2018
Last mount time:          Fri Jun  5 15:40:01 2020
Last write time:          Fri Jun  5 15:40:01 2020
Mount count:              3
Maximum mount count:      -1
Last checked:             Fri Jun  5 08:14:10 2020
Check interval:           0 (<none>)
Lifetime writes:          79 GB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               128
Journal inode:            8
Default directory hash:   half_md4
Directory Hash Seed:      595db5c2-beda-4f32-836f-ee025416b0f1
Journal backup:           inode blocks

Update:

And some more details about the host setup (a guest-side log check is sketched after this list):

  • Hyper-V Server 2016 is used
  • The disk is backed by a virtual disk file (as opposed to a physical disk)
  • The disk is set to dynamic (i.e. growing)
  • There are several snapshots/restore points of the VM. I am not sure whether this switches the disk image from dynamic to differencing(?)
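
A hedged guest-side check that can be left running to catch storage-level errors early (hv_storvsc is the Hyper-V storage driver; the grep pattern is only a suggestion):

$ sudo journalctl -k | grep -iE 'ext4|hv_storvsc|i/o error'
$ sudo journalctl -k -f    # follow the kernel log live while the workload runs
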
filesystems hyper-v corruption ext4 coreos
  • 1 Answer
  • 1937 Views
Alexander Presber
Asked: 2020-05-05 07:21:49 +0800 CST

Using HAProxy in Docker with host networking

  • 0

When running HAProxy in a Docker container, we can only see (and forward) the original client IPs if we run the container with the --net=host option described here.

Our question: is this advisable from a security standpoint? Does it make it easier for an attacker to exploit HAProxy vulnerabilities? Or is this common practice?
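
For context, a minimal sketch of the setup being discussed (the image tag is an assumption; the config path follows the official haproxy image convention, adjust as needed):

$ docker run -d --name haproxy --net=host \
    -v /etc/haproxy/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro \
    haproxy:2.0

With --net=host the container shares the host network namespace, so no -p port mappings are involved and client source addresses arrive unchanged.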

security docker haproxy docker-swarm coreos
  • 1 Answer
  • 415 Views
mr.zog
Asked: 2020-03-26 16:38:02 +0800 CST

Can I use the Ansible systemd module to power off CoreOS systems?

  • 0

I just need a simple playbook or ad-hoc syntax to shut down a group of CoreOS hosts.

Something tells me I may need to use the shell module, which really wouldn't be terrible.

The correct way to shut down a CoreOS host is "systemctl poweroff".
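
As a hedged sketch of the ad-hoc route (the inventory group name coreos is an assumption; the async flags let Ansible fire the command and disconnect before the host goes down):

$ ansible coreos -b -B 60 -P 0 -m shell -a "systemctl poweroff"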

systemd ansible coreos
  • 1 Answer
  • 146 Views
Ishpeck
Asked: 2019-04-13 09:02:17 +0800 CST

Where do I install an NSS module on CoreOS?

  • 1

I have a custom NSS module that I wrote, and I normally install it by copying the library like this...

cp libnss_mymodule.so.0 /lib64/

...and then adding my module to /etc/nsswitch.conf...

$ grep mymodule /etc/nsswitch.conf
passwd: mymodule files usrfiles sss systemd
group: mymodule files usrfiles sss systemd

This works for me on CentOS 7 but not on CoreOS, because /lib64 is on a read-only filesystem. Where on CoreOS can I put the shared object library so that it is visible to nsdispatch()?

EDIT: I tried putting the file in /opt/me/lib64 and adding that directory to the LD_LIBRARY_PATH environment variable. It did not seem to help.
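
One hedged way to see where glibc actually looks for the module during a lookup (the user and module names below are placeholders; older glibc may use open() rather than openat()):

$ strace -f -e trace=openat getent passwd someuser 2>&1 | grep libnss_mymodule

Each failed openat() in the output shows a path that was searched and is therefore a candidate location.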

coreos
  • 2 Answers
  • 124 Views
Brian
Asked: 2018-08-01 11:20:07 +0800 CST

CoreOS bare metal install: login fails and some Ignition directives are ignored

  • 0

I am new to coreos and have spent the last few days wrestling with the differences between outdated and newer documentation, trying to figure out how to install coreos on a bare metal system with two network interfaces and four HDDs. I have attempted the install several times, but I am left with a system that does not let me log in from the console, nor remotely via ssh.

Here are the basic steps I used:

  1. Burned the coreos release 1800.4.0 ISO image to a CD.

  2. Created an Ignition config yaml file, converted it to json with ct, and copied it onto a USB flash drive.

  3. Plugged the flash drive into my bare metal system and booted from the CD ISO.

  4. After the initial system booted, I entered the following commands from the console:

    sudo su
    ping google.com             # verify networking
    lsblk                       # verify my flash drive is at /dev/sde1
    mkdir /mnt/sde1
    mount /dev/sde1 /mnt/sde1
    coreos-install -d /dev/sda -i /mnt/sde1/ignition.json

  5. After the install completed, I removed the CD and the flash drive and rebooted.

The system boots and presents a localhost login prompt on the console. Logging in on the console with the user specified in my Ignition file fails; the password is not accepted. Logging in remotely via ssh does not recognize the password either. (The password hash in the Ignition file was created with the "openssl passwd -1" command. The ssh_authorized_keys value was created with the "ssh-keygen -t rsa" command.)

Also, the static network addresses specified in the Ignition config are ignored and DHCP appears to be used instead.
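
A hedged way to check from the console whether networkd picked up the static unit at all (the interface name is taken from the config below):

$ networkctl list
$ networkctl status eno1    # should show the configured Address/Gateway if the unit was applied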

Here is my config yaml, before conversion to json:

# This config is meant to be consumed by the config transpiler, which will
# generate the corresponding Ignition config. Do not pass this config directly
# to instances of Container Linux.

storage:
  files:
    - filesystem: "root"
      path:       "/etc/hostname"
      mode:       0644
      contents:
        inline: coreos1
  disks:
    - device: /dev/sda
      wipe_table: true
      partitions:
       - label: root1
         type_guid: be9067b9-ea49-4f15-b4f6-f36f8c9e1818
         number: 1
         size: 120GiB
       - label: reserve1
         type_guid: a19d880f-05fc-4d3b-a006-743f0f84911e
         number: 2
    - device: /dev/sdb
      wipe_table: true
      partitions:
       - label: root2
         type_guid: be9067b9-ea49-4f15-b4f6-f36f8c9e1818
         number: 1
         size: 120GiB
       - label: reserve2
         type_guid: a19d880f-05fc-4d3b-a006-743f0f84911e
         number: 2
    - device: /dev/sdc
      wipe_table: true
      partitions:
       - label: store1
         type_guid: a19d880f-05fc-4d3b-a006-743f0f84911e
    - device: /dev/sdd
      wipe_table: true
      partitions:
       - label: store2
         type_guid: a19d880f-05fc-4d3b-a006-743f0f84911e
  raid:
    - name: "root_array"
      level: "raid1"
      devices:
        - "/dev/sda1"
        - "/dev/sdb1"
    - name: "reserve_array"
      level: "raid1"
      devices:
        - "/dev/sda2"
        - "/dev/sdb2"
    - name: "store_array"
      level: "raid0"
      devices:
        - "/dev/sdc1"
        - "/dev/sdd1"
  filesystems:
    - name: "ROOT"
      mount:
        device: "/dev/md/root_array"
        format: "ext4"
        label: "ROOT"
    - name: "RESERVE"
      mount:
        device: "/dev/md/reserve_array"
        format: "ext4"
        label: "RESERVE"
    - name: "STORE"
      mount:
        device: "/dev/md/store_array"
        format: "ext4"
        label: "STORE"
networkd:
  units:
    - name: static.network
      contents: |
        [Match]
        Name=eno1
        [Network]
        DNS= *snipped*
        Address=10.0.0.178/24
        Gateway=10.0.0.1
    - name: 00-enp2s0.network
      contents: |
        [Match]
        Name=enp2s0
        [Network]
        DNS= *snipped*
        Address=10.0.0.179/24
        Gateway=10.0.0.1
passwd:
  users:
    - name: "user1"
      password_hash: "$1$Fe8..."
      ssh_authorized_keys:
        - ssh-rsa AAAAB3N...
      groups:
        - "sudo"
        - "docker"

After spending a day searching the internet for more clues about what I am doing wrong, I seem to have run out of other suggestions to try.

If you have any experience with coreos, please tell me what I might be doing wrong. My goal is to install coreos on bare metal hardware with two NICs and four HDDs in RAID arrays, and to be able to log in at the specified static addresses.

coreos
  • 1 Answer
  • 311 Views
Shadowraze
Asked: 2017-04-05 09:01:34 +0800 CST

Unexpected end of stream found in cloud-config

  • 0

So here is my cloud-config:

#cloud-config
coreos:
  etcd2:
    discovery: "https://discovery.etcd.io/tocken"
    advertise-client-urls: "http://$private_ipv4:2379"
    initial-advertise-peer-urls: "http://$private_ipv4:2380"
    listen-client-urls: "http://0.0.0.0:2379,http://0.0.0.0:4001"
    listen-peer-urls: "http://$private_ipv4:2380,http://$private_ipv4:7001"

  flannel:
    interface: $private_ipv4

  units:
    - name: etcd2.service
      command: start
    - name: flanneld.service
      drop-ins:
        - name: 50-network-config.conf
          content: |
            [Service]
            ExecStartPre=/usr/bin/etcdctl set /coreos.com/network/config '{ "Network": "10.1.0.0/16" }'
      command: start
    - name: sshd.socket
      command: restart
      runtime: true
      content: |
        [Unit]
        Description=OpenSSH server daemon
        Conflicts=sshd.service

        [Socket]
        ListenStream=65321
        FreeBind=true
        Accept=yes

        [Install]
        WantedBy=sockets.target
    - name: kubelet.service
      command: restart
      runtime: true
      content: |
        [Service]
        Environment=KUBELET_VERSION=v1.6.1_coreos.0
        ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
        ExecStart=/usr/lib/coreos/kubelet-wrapper \
          --api-servers=http://127.0.0.1:8080 \
          --allow-privileged=true \
          --config=/etc/kubernetes/manifests \
          --hostname-override=$private_ipv4 \
          --cluster-dns=10.13.0.10 \
          --cluster-domain=cluster.local
        Restart=always
        RestartSec=10

        [Install]
        WantedBy=multi-user.target

users:
  - name: admin
    ssh-authorized-keys:
      - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCuCXgeT7kQfSikcU1BWRyMFi8izN+1WHPNopaaXQV2xune6nKOHN8yhGVRKaE9iQHY+6jSjxWd5SY9CEyWlIST5dxfffRkWZiuJISHAxl6+E+fI0kNsUG2AXTXuJnXBQllqkgsggfBJ+5BxNf35IyfILTqkDu99ZNBNbeTNSPJmbYgMs71fWB2TiGx8ugsZrIOzqbcEfu9KNTD+RszrLaCRAZNl1sANEk7N7ZIUaIIlBBxmaPWW1voXor4AP/SAnHMEouX25ZlruL7nCEH9BybVYT8xFVEBl0fJIoj/c1TYkk/80P7JLJg0pIAxMCWqy0NzBwEcXbef1yIlO6meDuZ Kirill@NOUTKIR
    groups:
     - "sudo"
    shell: /bin/bash
write_files:
  - path: "/etc/ssh/sshd_config"
    permissions: 0600
    owner: root:root
    content: |
         HostKey /etc/ssh/ssh_host_rsa_key
         HostKey /etc/ssh/ssh_host_dsa_key
         HostKey /etc/ssh/ssh_host_ecdsa_key
         HostKey /etc/ssh/ssh_host_ed25519_key
         UsePrivilegeSeparation yes
         KeyRegenerationInterval 3600
         ServerKeyBits 1024
         SyslogFacility AUTH
         LogLevel INFO
         LoginGraceTime 120
         PermitRootLogin no
         StrictModes yes
         RSAAuthentication yes
         PubkeyAuthentication yes
         IgnoreRhosts yes
         RhostsRSAAuthentication no
         HostbasedAuthentication no
         PermitEmptyPasswords no
         ChallengeResponseAuthentication no
         X11Forwarding yes
         X11DisplayOffset 10
         PrintMotd no
         PrintLastLog yes
         TCPKeepAlive yes
         AcceptEnv LANG LC_*
         Subsystem sftp /usr/lib/openssh/sftp-server
         UsePAM yes
         AllowUsers admin
         PasswordAuthentication no
  - path: "/etc/kubernetes/manifests/kube-apiserver.yaml
#    permissions: ??
#    owner: ??
    content: |
         apiVersion: v1
         kind: Pod
         metadata:
           name: kube-apiserver
           namespace: kube-system
         spec:
           hostNetwork: true
           containers:
           - name: kube-apiserver
             image: quay.io/coreos/hyperkube:v1.6.1_coreos.0
             command:
             - /hyperkube
             - apiserver
             - --bind-address=0.0.0.0
             - --etcd-servers=http://<master private IP>:2379,http://<node1 private IP>:2379,http://<node2 private IP>:2379
             - --allow-privileged=true
             - --service-cluster-ip-range=10.13.0.0/24
             - --secure-port=443
             - --advertise-address=<master private IP>
             - --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota
#    - --tls-cert-file=/etc/kubernetes/ssl/apiserver.pem
#    - --tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
#    - --client-ca-file=/etc/kubernetes/ssl/ca.pem
             - --service-account-key-file=/etc/kubernetes/ssl/apiserver-key.pem
             - --runtime-config=extensions/v1beta1=true,extensions/v1beta1/networkpolicies=true
             ports:
             - containerPort: 443
               hostPort: 443
               name: https
             - containerPort: 8080
               hostPort: 8080
               name: local
#    volumeMounts:
#    - mountPath: /etc/kubernetes/ssl
#      name: ssl-certs-kubernetes
#      readOnly: true
#    - mountPath: /etc/ssl/certs
#      name: ssl-certs-host
#      readOnly: true
#  volumes:
#  - hostPath:
#      path: /etc/kubernetes/ssl
#    name: ssl-certs-kubernetes
#  - hostPath:
#      path: /usr/share/ca-certificates
#    name: ssl-certs-host
  - path: /etc/kubernetes/manifests/kube-proxy.yaml
#    permissions: ??
#    owner: ??
    content: |
         apiVersion: v1
         kind: Pod
         metadata:
           name: kube-proxy
           namespace: kube-system
         spec:
           hostNetwork: true
           containers:
           - name: kube-proxy
             image: quay.io/coreos/hyperkube:v1.6.1_coreos.0
             command:
             - /hyperkube
             - proxy
             - --master=http://127.0.0.1:8080
             - --proxy-mode=iptables
             securityContext:
               privileged: true
#             volumeMounts:
#             - mountPath: /etc/ssl/certs
#               name: ssl-certs-host
#               readOnly: true
#           volumes:
#           - hostPath:
#               path: /usr/share/ca-certificates
#             name: ssl-certs-host
  - path: /etc/kubernetes/manifests/kube-controller-manager.yaml
#    permissions: ??
#    owner: ??
    content: |
         apiVersion: v1
         kind: Pod
         metadata:
           name: kube-controller-manager
           namespace: kube-system
         spec:
           hostNetwork: true
           containers:
           - name: kube-controller-manager
             image: quay.io/coreos/hyperkube:v1.6.1_coreos.0
             command:
             - /hyperkube
             - controller-manager
             - --master=http://127.0.0.1:8080
             - --leader-elect=true
#             - --service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
#             - --root-ca-file=/etc/kubernetes/ssl/ca.pem
             livenessProbe:
               httpGet:
                 host: 127.0.0.1
                 path: /healthz
                 port: 10252
               initialDelaySeconds: 15
               timeoutSeconds: 1
#             volumeMounts:
#             - mountPath: /etc/kubernetes/ssl
#               name: ssl-certs-kubernetes
#               readOnly: true
#             - mountPath: /etc/ssl/certs
#               name: ssl-certs-host
#               readOnly: true
#           volumes:
#           - hostPath:
#               path: /etc/kubernetes/ssl
#             name: ssl-certs-kubernetes
#           - hostPath:
#               path: /usr/share/ca-certificates
#             name: ssl-certs-host
  - path: /etc/kubernetes/manifests/kube-scheduler.yaml
#    permissions: ??
#    owner: ??
    content: |
         apiVersion: v1
         kind: Pod
         metadata:
           name: kube-scheduler
           namespace: kube-system
         spec:
           hostNetwork: true
           containers:
           - name: kube-scheduler
             image: quay.io/coreos/hyperkube:v1.6.1_coreos.0
             command:
             - /hyperkube
             - scheduler
             - --master=http://127.0.0.1:8080
             - --leader-elect=true
             livenessProbe:
               httpGet:
                 host: 127.0.0.1
                 path: /healthz
                 port: 10251
               initialDelaySeconds: 15
               timeoutSeconds: 1

Has anyone run into this before? I have already lost 4 hours to googling and trying things.

P.S.: the error refers to a preceding line.
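
"Unexpected end of stream" is typically a YAML parse error (for example an unterminated quoted scalar), so a hedged local check before handing the file to the instance can help; this assumes PyYAML is installed, and the file name is a placeholder:

$ python -c 'import yaml, sys; yaml.safe_load(open(sys.argv[1]))' user-data.yaml

If the parser raises an exception, its message points at the offending line and column.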

coreos
  • 1 Answer
  • 1558 Views
gogstad
Asked: 2017-03-16 07:47:49 +0800 CST

Systemd units: [Install] vs command: start (cloud-config)

  • 1

I am configuring systemd in a cloud-config file for CoreOS. If I understand correctly, I have two ways of starting a unit at boot:

Alternative 1, using an [Install] section (as described in the DigitalOcean guide):

- name: initialize_data
  content: |
    [Unit]
    Description=Run a command

    [Service]
    Type=oneshot
    ExecStart=/usr/bin/mkdir /foo

    [Install]
    WantedBy=multi-user.target

Alternative 2, dropping the [Install] section and using command: start:

- name: initialize_data
  command: start
  content: |
    [Unit]
    Description=Run a command

    [Service]
    Type=oneshot
    ExecStart=/usr/bin/mkdir /foo

Are there any drawbacks to starting the unit with command: start? I know I cannot control which unit it will be started after, but what else? Will it respect [Unit] directives such as Requires= and After=?
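
A hedged way to inspect, on a booted host, what ordering and dependencies the unit actually ended up with (the unit name is taken from the snippets above; the .service suffix is an assumption):

$ systemctl show -p Requires,After,WantedBy initialize_data.service
$ systemd-analyze critical-chain initialize_data.service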

systemd cloud-config coreos
  • 1 Answer
  • 554 Views
iameli
Asked: 2016-10-12 13:51:08 +0800 CST

Can't get any external traffic to the second Elastic IP on an EC2 instance

  • 1

I have an EC2 instance running CoreOS 1068.9.0.

The server runs a very simple Hello World HTTP server.

> curl http://52.43.128.34/
Hello, world!

I have an Elastic IP assigned to a network interface on the same subnet. It has the public IP address 54.190.35.220 and the private IP address 10.8.0.104. Both the instance's security group and the security group on the network interface allow TCP port 80 traffic from 0.0.0.0/0.

I attached the network interface to the instance. The CoreOS logs seem to show that the new IP was added just fine, and it shows up as eth1 in ifconfig.

But other machines in the subnet cannot reach the HTTP server, and neither can the outside world. Connections fail both inside and outside the VPC: other machines cannot reach it with curl http://10.8.0.104/, and external machines cannot reach it with curl http://54.190.35.220/.

What gives?

EDIT: more information (a few guest-side checks are sketched after this list):

  • The server should be listening on all interfaces. Also, I see the same behavior with the SSH server, so I don't think it is a problem with the HTTP server itself.
  • The network ACLs are wide open.
  • The security groups are all open to traffic on ports 80 and 22 from 0.0.0.0/0.
  • The subnet's route table:

    Route Table: rtb-b6449fd1
    ----------------------------
    Destination   | Target
    ----------------------------
    10.8.0.0/16   | local
    0.0.0.0/0     | igw-91eb56f5
    172.31.0.0/16 | pcx-f6f64e9f
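
Since both the private and the public address fail, a few hedged guest-side checks for the usual multi-interface pitfall (return traffic leaving via the default route on eth0); interface names follow the question:

$ ip addr show eth1           # does eth1 actually carry 10.8.0.104?
$ ip route show               # is there only one default route, via eth0?
$ ip rule show                # any policy-routing rules for the second interface?
$ sudo ss -lntp | grep ':80'  # confirm the server listens on 0.0.0.0:80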
    
amazon-ec2 coreos amazon-elastic-ip
  • 1 Answer
  • 566 Views
ufk
Asked: 2016-09-24 15:37:39 +0800 CST

kubelet service with etcd2 TLS cannot connect to 127.0.0.1:8080 - getsockopt: connection refused

  • 0

I have CoreOS stable v1122.2.0 installed.

I have configured etcd2 with TLS and it works properly. Following https://github.com/coreos/etcd/tree/master/hack/tls-setup I created the certificates using a subdomain I created for my server, instead of a specific IP address, so that calico TLS would work.

etcd2 and calico-node are configured and working properly. Now I want to configure Kubernetes. I used the instructions at https://coreos.com/kubernetes/docs/latest/deploy-master.html, and for now I am configuring only a single coreos server.

When I start the kubelet and run journalctl -f -u kubelet, I get the following messages:

 Sep 23 23:30:11 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:11.495381    1473 reflector.go:205] pkg/kubelet/kubelet.go:286: Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3Dcoreos-2.tux-in.com&resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
 Sep 23 23:30:11 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:11.889187    1473 reflector.go:205] pkg/kubelet/kubelet.go:267: Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
 Sep 23 23:30:12 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:12.292061    1473 reflector.go:205] pkg/kubelet/config/apiserver.go:43: Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3Dcoreos-2.tux-in.com&resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
 Sep 23 23:30:12 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:12.307222    1473 event.go:207] Unable to write event: 'Post http://127.0.0.1:8080/api/v1/namespaces/default/events: dial tcp 127.0.0.1:8080: getsockopt: connection refused' (may retry after sleeping)
 Sep 23 23:30:12 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:12.495982    1473 reflector.go:205] pkg/kubelet/kubelet.go:286: Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3Dcoreos-2.tux-in.com&resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
 Sep 23 23:30:12 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:12.889756    1473 reflector.go:205] pkg/kubelet/kubelet.go:267: Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
 Sep 23 23:30:13 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:13.292671    1473 reflector.go:205] pkg/kubelet/config/apiserver.go:43: Failed to list *api.Pod: Get http://127.0.0.1:8080/api/v1/pods?fieldSelector=spec.nodeName%3Dcoreos-2.tux-in.com&resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
 Sep 23 23:30:13 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:13.496732    1473 reflector.go:205] pkg/kubelet/kubelet.go:286: Failed to list *api.Node: Get http://127.0.0.1:8080/api/v1/nodes?fieldSelector=metadata.name%3Dcoreos-2.tux-in.com&resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
 Sep 23 23:30:13 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:13.589335    1473 kubelet.go:1938] Failed creating a mirror pod for "kube-apiserver-coreos-2.tux-in.com_kube-system(9b41319800532574b4c4ac760c920bee)": Post http://127.0.0.1:8080/api/v1/namespaces/kube-system/pods: dial tcp 127.0.0.1:8080: getsockopt: connection refused
 Sep 23 23:30:13 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:13.890294    1473 reflector.go:205] pkg/kubelet/kubelet.go:267: Failed to list *api.Service: Get http://127.0.0.1:8080/api/v1/services?resourceVersion=0: dial tcp 127.0.0.1:8080: getsockopt: connection refused
 Sep 23 23:30:13 coreos-2.tux-in.com kubelet-wrapper[1473]: I0923 23:30:13.979257    1473 docker_manager.go:2289] checking backoff for container "kube-apiserver" in pod "kube-apiserver-coreos-2.tux-in.com"
 Sep 23 23:30:13 coreos-2.tux-in.com kubelet-wrapper[1473]: I0923 23:30:13.980071    1473 docker_manager.go:2303] Back-off 2m40s restarting failed container=kube-apiserver pod=kube-apiserver-coreos-2.tux-in.com_kube-system(9b41319800532574b4c4ac760c920bee)
 Sep 23 23:30:13 coreos-2.tux-in.com kubelet-wrapper[1473]: E0923 23:30:13.980144    1473 pod_workers.go:183] Error syncing pod 9b41319800532574b4c4ac760c920bee, skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=kube-apiserver pod=kube-apiserver-coreos-2.tux-in.com_kube-system(9b41319800532574b4c4ac760c920bee)"

This is my /var/lib/coreos-install/user_data file:

 #cloud-config

 hostname: "coreos-2.tux-in.com"
 write_files:
  - path: "/etc/ssl/etcd/ca.pem"
    permissions: "0666"
    owner: "etcd:etcd"
    content: |
     ...
  - path: "/etc/ssl/etcd/etcd1.pem"
    permissions: "0666"
    owner: "etcd:etcd"
    content: |
     ...
  - path: "/etc/ssl/etcd/etcd1-key.pem"
    permissions: "0666"
    owner: "etcd:etcd"
    content: |
     ...
  - path: "/etc/kubernetes/ssl/ca.pem"
    permissions: "0600"
    owner: "root:root"
    content: |
     ...
  - path: "/etc/kubernetes/ssl/apiserver.pem"
    permissions: "0600"
    owner: "root:root"
    content: |
     ...
  - path: "/etc/kubernetes/ssl/apiserver-key.pem"
    permissions: "0600"
    owner: "root:root"
    content: |
     ...
  - path: "/etc/kubernetes/cni/net.d/10-calico.conf"
    content: |
      {
          "name": "calico",
          "type": "flannel",
          "delegate": {
              "type": "calico",
              "etcd_endpoints": "https://coreos-2.tux-in.com:2379",
              "log_level": "none",
              "log_level_stderr": "info",
              "hostname": "coreos-2.tux-in.com",
              "policy": {
                  "type": "k8s",
                  "k8s_api_root": "http://127.0.0.1:8080/api/v1/"
              }
          }
      }
  - path: "/etc/kubernetes/manifests/policy-controller.yaml"
    content: |
     apiVersion: v1
      kind: Pod
      metadata:
        name: calico-policy-controller
        namespace: calico-system
      spec:
        hostNetwork: true
        containers:
          # The Calico policy controller.
          - name: k8s-policy-controller
            image: calico/kube-policy-controller:v0.2.0
            env:
              - name: ETCD_ENDPOINTS
                value: "https://coreos-2.tux-in.com:2379"
              - name: K8S_API
                value: "http://127.0.0.1:8080"
              - name: LEADER_ELECTION
                value: "true"
          # Leader election container used by the policy controller.
          - name: leader-elector
            image: quay.io/calico/leader-elector:v0.1.0
            imagePullPolicy: IfNotPresent
            args:
              - "--election=calico-policy-election"
              - "--election-namespace=calico-system"
              - "--http=127.0.0.1:4040"
  - path: "/etc/kubernetes/manifests/kube-scheduler.yaml"
    content: |
      apiVersion: v1
      kind: Pod
      metadata:
        name: kube-scheduler
        namespace: kube-system
      spec:
        hostNetwork: true
        containers:
        - name: kube-scheduler
          image: quay.io/coreos/hyperkube:v1.3.6_coreos.0
          command:
          - /hyperkube
          - scheduler
          - --master=http://127.0.0.1:8080
          - --leader-elect=true
          livenessProbe:
            httpGet:
              host: 127.0.0.1
              path: /healthz
              port: 10251
            initialDelaySeconds: 15
            timeoutSeconds: 1
  - path: "/etc/kubernetes/manifests/kube-controller-manager.yaml"
    content: |
      apiVersion: v1
      kind: Pod
      metadata:
        name: kube-controller-manager
        namespace: kube-system
      spec:
        hostNetwork: true
        containers:
        - name: kube-controller-manager
          image: quay.io/coreos/hyperkube:v1.3.6_coreos.0
          command:
          - /hyperkube
          - controller-manager
          - --master=http://127.0.0.1:8080
          - --leader-elect=true
          - --service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
          - --root-ca-file=/etc/kubernetes/ssl/ca.pem
          livenessProbe:
            httpGet:
              host: 127.0.0.1
              path: /healthz
              port: 10252
            initialDelaySeconds: 15
            timeoutSeconds: 1
          volumeMounts:
          - mountPath: /etc/kubernetes/ssl
            name: ssl-certs-kubernetes
            readOnly: true
          - mountPath: /etc/ssl/certs
            name: ssl-certs-host
            readOnly: true
        volumes:
        - hostPath:
            path: /etc/kubernetes/ssl
          name: ssl-certs-kubernetes
        - hostPath:
            path: /usr/share/ca-certificates
          name: ssl-certs-host
  - path: "/etc/kubernetes/manifests/kube-proxy.yaml"
    content: |
      apiVersion: v1
      kind: Pod
      metadata:
        name: kube-proxy
        namespace: kube-system
      spec:
        hostNetwork: true
        containers:
        - name: kube-proxy
          image: quay.io/coreos/hyperkube:v1.3.6_coreos.0
          command:
          - /hyperkube
          - proxy
          - --master=http://127.0.0.1:8080
          - --proxy-mode=iptables
          securityContext:
            privileged: true
          volumeMounts:
          - mountPath: /etc/ssl/certs
            name: ssl-certs-host
            readOnly: true
        volumes:
        - hostPath:
            path: /usr/share/ca-certificates
          name: ssl-certs-host
  - path: "/etc/kubernetes/manifests/kube-apiserver.yaml"
    content: |
      apiVersion: v1
      kind: Pod
      metadata:
        name: kube-apiserver
        namespace: kube-system
      spec:
        hostNetwork: true
        containers:
        - name: kube-apiserver
          image: quay.io/coreos/hyperkube:v1.3.6_coreos.0
          command:
          - /hyperkube
          - apiserver
          - --bind-address=0.0.0.0
          - --etcd-servers=https://coreos-2.tux-in.com:2379
          - --allow-privileged=true
          - --service-cluster-ip-range=10.0.0.0/24
          - --secure-port=443
          - --advertise-address=coreos-2.tux-in.com
          - --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,ResourceQuota
          - --tls-cert-file=/etc/kubernetes/ssl/apiserver.pem
          - --tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem
          - --client-ca-file=/etc/kubernetes/ssl/ca.pem
          - --service-account-key-file=/etc/kubernetes/ssl/apiserver-key.pem
          - --runtime-config=extensions/v1beta1=true,extensions/v1beta1/networkpolicies=true
          ports:
          - containerPort: 443
            hostPort: 443
            name: https
          - containerPort: 8080
            hostPort: 8080
            name: local
          volumeMounts:
          - mountPath: /etc/kubernetes/ssl
            name: ssl-certs-kubernetes
            readOnly: true
          - mountPath: /etc/ssl/certs
            name: ssl-certs-host
            readOnly: true
        volumes:
        - hostPath:
            path: /etc/kubernetes/ssl
          name: ssl-certs-kubernetes
        - hostPath:
            path: /usr/share/ca-certificates
          name: ssl-certs-host
 ssh_authorized_keys:
          - ...
 coreos:
   etcd2:
     # generate a new token for each unique cluster from https://discovery.etcd.io/new?size=3
     # specify the initial size of your cluster with ?size=X
     discovery: ...
     advertise-client-urls: https://coreos-2.tux-in.com:2379,https://coreos-2.tux-in.com:4001
     initial-advertise-peer-urls: https://coreos-2.tux-in.com:2380
     # listen on both the official ports and the legacy ports
     # legacy ports can be omitted if your application doesn't depend on them
     listen-client-urls: https://0.0.0.0:2379,https://0.0.0.0:4001
     listen-peer-urls: https://coreos-2.tux-in.com:2380
   flannel:
     etcd_endpoints: "https://coreos-2.tux-in.com:2379"
     etcd_cafile: /etc/ssl/etcd/ca.pem
     etcd_certfile: /etc/ssl/etcd/etcd1.pem
     etcd_keyfile: /etc/ssl/etcd/etcd1-key.pem
   update:
     reboot-strategy: etcd-lock
   units:
     - name: 00-enp4s0.network
       runtime: true
       content: |
        [Match]
        Name=enp4s0

        [Network]
        Address=10.79.218.2/24
        Gateway=10.79.218.232
        DNS=8.8.8.8
     - name: var-lib-rkt.mount
       enable: true
       command: start
       content: |
         [Mount]
         What=/dev/disk/by-uuid/daca9515-5040-4f1d-ac0b-b69de3b91343
         Where=/var/lib/rkt
         Type=btrfs
         Options=loop,discard
     - name: etcd2.service
       command: start
       drop-ins:
        - name: 30-certs.conf
          content: |
           [Service]
           Environment="ETCD_CERT_FILE=/etc/ssl/etcd/etcd1.pem"
           Environment="ETCD_KEY_FILE=/etc/ssl/etcd/etcd1-key.pem"
           Environment="ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/ca.pem"
           Environment="ETCD_CLIENT_CERT_AUTH=true"
           Environment="ETCD_PEER_CERT_FILE=/etc/ssl/etcd/etcd1.pem"
           Environment="ETCD_PEER_KEY_FILE=/etc/ssl/etcd/etcd1-key.pem"
           Environment="ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/ca.pem"
           Environment="ETCD_PEER_CLIENT_CERT_AUTH=true"
     - name: flanneld.service
       command: start
       drop-ins:
        - name: 50-network-config.conf
          content: |
           [Service]
           ExecStartPre=/usr/bin/etcdctl --ca-file=/etc/ssl/etcd/ca.pem --cert-file=/etc/ssl/etcd/etcd1.pem --key-file=/etc/ssl/etcd/etcd1-key.pem --endpoint=https://coreos-2.tux-in.com:2379 set /coreos.com/network/config '{"Network":"10.1.0.0/16", "Backend": {"Type": "vxlan"}}'
     - name: calico-node.service
       command: start
       content: |
        [Unit]
        Description=Calico per-host agent
        Requires=network-online.target
        After=network-online.target

        [Service]
        Slice=machine.slice
        Environment=CALICO_DISABLE_FILE_LOGGING=true
        Environment=HOSTNAME=coreos-2.tux-in.com
        Environment=IP=10.79.218.2
        Environment=FELIX_FELIXHOSTNAME=coreos-2.tux-in.com
        Environment=CALICO_NETWORKING=false
        Environment=NO_DEFAULT_POOLS=true
        Environment=ETCD_ENDPOINTS=https://coreos-2.tux-in.com:2379
        Environment=ETCD_AUTHORITY=coreos-2.tux-in.com:2379
        Environment=ETCD_SCHEME=https
        Environment=ETCD_CA_CERT_FILE=/etc/ssl/etcd/ca.pem
        Environment=ETCD_CERT_FILE=/etc/ssl/etcd/etcd1.pem
        Environment=ETCD_KEY_FILE=/etc/ssl/etcd/etcd1-key.pem
        ExecStart=/usr/bin/rkt run --volume=resolv-conf,kind=host,source=/etc/resolv.conf,readOnly=true \
        --volume=etcd-tls-certs,kind=host,source=/etc/ssl/etcd,readOnly=true --inherit-env --stage1-from-dir=stage1-fly.aci \
        --volume=modules,kind=host,source=/lib/modules,readOnly=false \
        --mount=volume=modules,target=/lib/modules \
        --trust-keys-from-https quay.io/calico/node:v0.19.0 \
        --mount=volume=etcd-tls-certs,target=/etc/ssl/etcd \
        --mount=volume=resolv-conf,target=/etc/resolv.conf

        KillMode=mixed
        Restart=always
        TimeoutStartSec=0

        [Install]
        WantedBy=multi-user.target
     - name: kubelet.service
       command: start
       content: |
        [Service]
        ExecStartPre=/usr/bin/mkdir -p /etc/kubernetes/manifests
        ExecStartPre=/usr/bin/mkdir -p /var/log/containers

        Environment=KUBELET_VERSION=v1.3.7_coreos.0
        Environment="RKT_OPTS=--volume var-log,kind=host,source=/var/log \
          --mount volume=var-log,target=/var/log \
          --volume dns,kind=host,source=/etc/resolv.conf \
          --mount volume=dns,target=/etc/resolv.conf"

        ExecStart=/usr/lib/coreos/kubelet-wrapper \
          --api-servers=http://127.0.0.1:8080 \
          --network-plugin-dir=/etc/kubernetes/cni/net.d \
          --network-plugin=cni \
          --register-schedulable=false \
          --allow-privileged=true \
          --config=/etc/kubernetes/manifests \
          --hostname-override=coreos-2.tux-in.com \
          --cluster-dns=8.8.8.8 \
          --cluster-domain=tux-in.com
        Restart=always
        RestartSec=10
        [Install]
        WantedBy=multi-user.target

Is 127.0.0.1:8080 supposed to be opened by the kube-apiserver (launched by the kubelet)? What am I missing here?

Thanks!
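
A few hedged checks for whether the static apiserver pod ever came up and bound the insecure port (the container name filter is an assumption based on the manifest above):

$ curl -sf http://127.0.0.1:8080/healthz && echo ok    # does the insecure API port answer?
$ sudo ss -lntp | grep ':8080'                         # is anything listening on 8080 at all?
$ docker ps -a | grep apiserver                        # did the static pod's container start, or is it crash-looping?
$ docker logs $(docker ps -aq --filter name=apiserver | head -n1)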

kubernetes coreos
  • 1 Answer
  • 2034 Views
ufk
Asked: 2016-09-15 11:50:17 +0800 CST

"EtcdException: Could not get the list of servers" when running the calico rkt container on coreos

  • 1

I have two coreos stable v1122.2.0 machines, each with etcd2 configured with TLS.

I created the certificates using https://github.com/coreos/etcd/tree/master/hack/tls-setup.

Now I am trying to configure calico-node to run with rkt on my coreos master node.

I have the following in my cloud-config configuration:

write_files:
 - path: "/etc/kubernetes/cni/net.d/10-calico.conf"
   content: |
     {
     "name": "calico",
     "type": "flannel",
     "delegate": {
         "type": "calico",
         "etcd_endpoints": "https://10.79.218.2:2379,https://10.79.218.3:2379",
         "log_level": "none",
         "log_level_stderr": "info",
         "hostname": "10.79.218.2",
         "policy": {
             "type": "k8s",
             "k8s_api_root": "http://127.0.0.1:8080/api/v1/"
             }
         }
     }
 - path: "/etc/kubernetes/manifests/policy-controller.yaml"
   content: |
    apiVersion: v1
     kind: Pod
     metadata:
       name: calico-policy-controller
       namespace: calico-system
     spec:
       hostNetwork: true
       containers:
         # The Calico policy controller.
         - name: k8s-policy-controller
           image: calico/kube-policy-controller:v0.2.0
           env:
             - name: ETCD_ENDPOINTS
               value: "https://10.79.218.2:2379,https://10.79.218.3:2379"
             - name: K8S_API
               value: "http://127.0.0.1:8080"
             - name: LEADER_ELECTION
               value: "true"
         # Leader election container used by the policy controller.
         - name: leader-elector
           image: quay.io/calico/leader-elector:v0.1.0
           imagePullPolicy: IfNotPresent
           args:
             - "--election=calico-policy-election"
             - "--election-namespace=calico-system"
             - "--http=127.0.0.1:4040"
...
units:
 - name: calico-node.service
   enable: true
   command: start
   content: |
    [Unit]
    Description=Calico per-host agent
    Requires=network-online.target
    After=network-online.target

    [Service]
    Slice=machine.slice
    Environment=CALICO_DISABLE_FILE_LOGGING=true
    Environment=HOSTNAME=10.79.218.2
    Environment=IP=10.79.218.2
    Environment=FELIX_FELIXHOSTNAME=10.79.218.2
    Environment=CALICO_NETWORKING=false
    Environment=NO_DEFAULT_POOLS=true
    Environment=ETCD_ENDPOINTS=https://10.79.218.2:2379,https://10.79.218.3:2379
    ExecStart=/usr/bin/rkt run --inherit-env --stage1-from-dir=stage1-fly.aci \
   --volume=modules,kind=host,source=/lib/modules,readOnly=false \
   --mount=volume=modules,target=/lib/modules \
   --trust-keys-from-https quay.io/calico/node:v0.19.0

   KillMode=mixed
   Restart=always
   TimeoutStartSec=0

   [Install]
   WantedBy=multi-user.target

Please ignore the whitespace indentation.. I don't think I copied/pasted it properly :)

When I try to start the calico-node service, I get the following errors:

Sep 14 05:45:17 localhost systemd[1]: Started Calico per-host agent.
Sep 14 05:45:17 localhost rkt[1644]: image: using image from file /usr/lib64/rkt/stage1-images/stage1-fly.aci
Sep 14 05:45:18 localhost rkt[1644]: image: using image from local store for image name quay.io/calico/node:v0.19.0
Sep 14 05:45:25 localhost rkt[1644]: Traceback (most recent call last):
Sep 14 05:45:25 localhost rkt[1644]:   File "startup.py", line 292, in <module>
Sep 14 05:45:25 localhost rkt[1644]:     client = IPAMClient()
Sep 14 05:45:25 localhost rkt[1644]:   File "/usr/lib/python2.7/site-packages/pycalico/datastore.py", line 228, in __init__
Sep 14 05:45:25 localhost rkt[1644]:     "%s" % (ETCD_CA_CERT_FILE_ENV, etcd_ca))
Sep 14 05:45:25 localhost rkt[1644]: pycalico.datastore_errors.DataStoreError: Invalid ETCD_CA_CERT_FILE. Certificate Authority cert is required and m
Sep 14 05:45:25 localhost rkt[1644]: Calico node failed to start
Sep 14 05:45:25 localhost systemd[1]: calico-node.service: Main process exited, code=exited, status=1/FAILURE
Sep 14 05:45:25 localhost systemd[1]: calico-node.service: Unit entered failed state.
Sep 14 05:45:25 localhost systemd[1]: calico-node.service: Failed with result 'exit-code'.
Sep 14 05:45:25 localhost systemd[1]: calico-node.service: Service hold-off time over, scheduling restart.
Sep 14 05:45:25 localhost systemd[1]: Stopped Calico per-host agent.
Sep 14 05:45:25 localhost systemd[1]: Started Calico per-host agent.
Sep 14 05:45:25 localhost rkt[1714]: image: using image from file /usr/lib64/rkt/stage1-images/stage1-fly.aci
Sep 14 05:45:26 localhost rkt[1714]: image: using image from local store for image name quay.io/calico/node:v0.19.0
Sep 14 05:45:28 localhost rkt[1714]: Traceback (most recent call last):
Sep 14 05:45:28 localhost rkt[1714]:   File "startup.py", line 292, in <module>
Sep 14 05:45:28 localhost rkt[1714]:     client = IPAMClient()
Sep 14 05:45:28 localhost rkt[1714]:   File "/usr/lib/python2.7/site-packages/pycalico/datastore.py", line 228, in __init__
Sep 14 05:45:28 localhost rkt[1714]:     "%s" % (ETCD_CA_CERT_FILE_ENV, etcd_ca))
Sep 14 05:45:28 localhost rkt[1714]: pycalico.datastore_errors.DataStoreError: Invalid ETCD_CA_CERT_FILE. Certificate Authority cert is required and m


So I get Invalid ETCD_CA_CERT_FILE. I never really told calico which keys to use.. so I guess I am missing some configuration.

I have the following etcd-related keys in /etc/ssl/etcd:

8 -rw-------. 1 etcd etcd 1050 Sep 14 05:45 ca.pem
8 -rw-------. 1 etcd etcd  289 Sep 14 05:45 etcd1-key.pem
8 -rw-------. 1 etcd etcd 1058 Sep 14 05:45 etcd1.pem
8 -rw-------. 1 etcd etcd  227 Sep 12 03:49 server1-key.pem
8 -rw-------. 1 etcd etcd  822 Sep 12 03:49 server1.pem

I tried adding Environment=ETCD_CA_CERT_FILE=/etc/ssl/etcd/ca.pem to the calico-node systemd file, but I get exactly the same result.

Any ideas?

UPDATE

So I tried running calico manually instead of via systemd. I also exported all the environment variables that calico needs:

export CALICO_DISABLE_FILE_LOGGING=true
export HOSTNAME=10.79.218.2
export IP=10.79.218.2
export FELIX_FELIXHOSTNAME=10.79.218.2
export CALICO_NETWORKING=false
export NO_DEFAULT_POOLS=true
export ETCD_ENDPOINTS=https://10.79.218.2:2379,https://10.79.218.3:2379
export ETCD_AUTHORITY=10.79.218.2:2379
export ETCD_SCHEME=https
export ETCD_CA_CERT_FILE=/etc/ssl/etcd/ca.pem
export ETCD_CERT_FILE=/etc/ssl/etcd/etcd1.pem
export ETCD_KEY_FILE=/etc/ssl/etcd/etcd1-key.pem

When I try to run the calico container with the following command:

/usr/bin/rkt run --inherit-env --stage1-from-dir=stage1-fly.aci \
 --volume=modules,kind=host,source=/lib/modules,readOnly=false \
 --mount=volume=modules,target=/lib/modules \
 --trust-keys-from-https quay.io/calico/node:v0.19.0

I get:

image: using image from file /usr/lib64/rkt/stage1-images/stage1-fly.aci
image: using image from local store for image name quay.io/calico/node:v0.19.0
Traceback (most recent call last):
  File "startup.py", line 292, in <module>
   client = IPAMClient()
  File "/usr/lib/python2.7/site-packages/pycalico/datastore.py", line 221, in __init__
    ETCD_CERT_FILE_ENV, etcd_cert))
pycalico.datastore_errors.DataStoreError: Cannot read ETCD_KEY_FILE and/or ETCD_CERT_FILE. Both must be readable file paths. Values provided: ETCD_KEY_FILE=/etc/ssl/etcd/etcd1-key.pem, ETCD_CERT_FILE=/etc/ssl/etcd/etcd1.pem

I changed the file permissions of the certificate files to 666, but that does not solve the problem. And I know these certificates are valid, because etcd TLS works fine. So what am I missing?

UPDATE 2

It turns out I was missing mounting the certificate directory into the calico container.

So now I am running the calico container with:

/usr/bin/rkt run --volume etcd-ssl,kind=host,source=/etc/ssl/etcd/,readOnly=true --inherit-env --stage1-from-dir=stage1-fly.aci  --volume=modules,kind=host,source=/lib/modules,readOnly=false  --mount=volume=modules,target=/lib/modules  --trust-keys-from-https quay.io/calico/node:v0.19.0 --mount volume=etcd-ssl,target=/etc/ssl/etcd

I get the following output:

image: using image from file /usr/lib64/rkt/stage1-images/stage1-fly.aci
image: using image from local store for image name quay.io/calico/node:v0.19.0
Traceback (most recent call last):
  File "startup.py", line 292, in <module>
client = IPAMClient()
  File "/usr/lib/python2.7/site-packages/pycalico/datastore.py", line 246, in __init__
allow_reconnect=True)
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 204, in __init__
set(self.machines))
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 299, in machines
return self.machines
  File "/usr/lib/python2.7/site-packages/etcd/client.py", line 301, in machines
    raise etcd.EtcdException("Could not get the list of servers, "
etcd.EtcdException: Could not get the list of servers, maybe you provided the wrong host(s) to connect to?
Calico node failed to start

I am getting closer.. but still no solution.

UPDATE 3

I tried setting ETCD_ENDPOINTS to just the etcd server on this coreos machine by running export ETCD_ENDPOINTS=https://10.79.218.2:2379, and now when I try to run the calico rkt image I get:

image: using image from file /usr/lib64/rkt/stage1-images/stage1-fly.aci
image: using image from local store for image name quay.io/calico/node:v0.19.0
Traceback (most recent call last):
  File "startup.py", line 295, in <module>
main()
  File "startup.py", line 251, in main
warn_if_hostname_conflict(ip)
  File "startup.py", line 192, in warn_if_hostname_conflict
current_ipv4, _ = client.get_host_bgp_ips(hostname)
  File "/usr/lib/python2.7/site-packages/pycalico/datastore.py", line 132, in wrapped
"running?" % (fn.__name__, e.message))
pycalico.datastore_errors.DataStoreError: get_host_bgp_ips: Error accessing etcd (Connection to etcd failed due to SSLError(CertificateError("hostname '10.79.218.2' doesn't match u'etcd'",),)).  Is etcd running?
Calico node failed to start
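
The final SSLError says the hostname '10.79.218.2' does not match 'etcd', so a hedged way to see which names and IPs the certificate was actually issued for (the path is taken from the listing above):

$ openssl x509 -in /etc/ssl/etcd/etcd1.pem -noout -text | grep -A1 'Subject Alternative Name'
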
etcd coreos rkt
  • 2 Answers
  • 890 Views
