AskOverflow.Dev

Jonas's questions

Jonas
Asked: 2021-08-22 07:54:38 +0800 CST

systemd terminates an etcd service started with podman: notification reception only permitted for main PID

  • 1

I am trying to start etcd as a systemd service running in a podman container.

After starting it, I get this error in the systemd log:

systemd[1]: etcd.service: Got notification message from PID 4696, but reception only permitted for main PID 4868

However, etcd does appear to start, and it tries to notify the init daemon:

21T15:31:08.817Z","caller":"etcdserver/server.go:2500","msg":"cluster version>
Aug 21 15:31:08 ip-10-0-0-71 podman[4696]: {"level":"info","ts":"2021-08-21T15:31:08.817Z","caller":"etcdmain/main.go:47","msg":"notifying init daemon>
Aug 21 15:31:08 ip-10-0-0-71 podman[4696]: {"level":"info","ts":"2021-08-21T15:31:08.818Z","caller":"etcdmain/main.go:53","msg":"successfully notified>

But systemd does not seem to register the notification, and it terminates the etcd service:

Aug 21 15:32:34 ip-10-0-0-71 systemd[1]: etcd.service: start operation timed out. Terminating.
Aug 21 15:32:35 ip-10-0-0-71 podman[4696]: {"level":"info","ts":"2021-08-21T15:32:35.000Z","caller":"osutil/interrupt_unix.go:64","msg":"received sign>
Aug 21 15:32:35 ip-10-0-0-71 podman[4696]: {"level":"info","ts":"2021-08-21T15:32:35.000Z","caller":"embed/etcd.go:367","msg":"closing etcd server","n>

This is the systemd service status:

$ sudo systemctl status etcd.service
● etcd.service - etcd
     Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: enabled)
     Active: failed (Result: timeout) since Sat 2021-08-21 15:32:35 UTC; 8min ago
    Process: 4868 ExecStart=/usr/bin/podman run -p 2380:2380 -p 2379:2379 --volume=/var/lib/etcd:/etcd-data:z --name etcd 842445240665.dkr.ecr.eu-nort>
   Main PID: 4868 (code=exited, status=0/SUCCESS)
        CPU: 3.729s

This is the systemd unit file I use to start etcd with podman:

cat <<EOF | sudo tee /etc/systemd/system/etcd.service
[Unit]
Description=etcd
After=podman_ecr_login.service mk_etcd_data_dir.service

[Service]
Type=notify
ExecStart=/usr/bin/podman run -p 2380:2380 -p 2379:2379 --volume=/var/lib/etcd:/etcd-data:z \
 --name etcd <my-aws-account>.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0 \
 /usr/local/bin/etcd --data-dir=/etcd-data \
 --name etcd0 \
 --advertise-client-urls http://127.0.0.1:2379 \
 --listen-client-urls http://0.0.0.0:2379 \
 --initial-advertise-peer-urls http://127.0.0.1:2380 \
 --listen-peer-urls http://0.0.0.0:2380 \
 --initial-cluster etcd0=http://127.0.0.1:2380

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl enable etcd
sudo systemctl start etcd

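I suspect this is related to Type=notify, or to the way I am using podman or etcd. I start etcd in a way similar to what the etcd documentation describes: Run etcd clusters inside containers - Running a single node etcd. I am running this on Debian 11 with Podman 3.0.1.

Any suggestions on how to start etcd as a systemd service using podman?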
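
One thing that may be worth trying, offered as an assumption rather than a confirmed fix: the log line says systemd accepts readiness notifications only from the main PID, and systemd's NotifyAccess= setting controls which processes of a unit may send them. A minimal sketch that writes a drop-in setting NotifyAccess=all; the function name write_notify_dropin and the file name notify-access.conf are made up for illustration:

```shell
# Sketch (assumption, not a confirmed fix): widen NotifyAccess so systemd
# accepts READY=1 from processes other than the unit's main PID.
# write_notify_dropin and notify-access.conf are hypothetical names.
write_notify_dropin() {
    dropin_dir="$1"   # e.g. /etc/systemd/system/etcd.service.d
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/notify-access.conf" <<'EOF'
[Service]
NotifyAccess=all
EOF
}

# Usage (as root), followed by a reload and a restart:
#   write_notify_dropin /etc/systemd/system/etcd.service.d
#   systemctl daemon-reload
#   systemctl restart etcd
```

After a reload, systemctl show -p NotifyAccess etcd.service should report the new value. Whether this alone is enough with Podman 3.0.1 is not something the logs above can confirm.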

linux debian systemd podman etcd
  • 1 answer
  • 221 Views
Jonas
Asked: 2021-08-19 13:07:01 +0800 CST

How do I start etcd in docker from systemd?

  • 1

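I want to start etcd (single node) in docker from systemd, but something seems to go wrong: it is terminated about 30 seconds after starting.

The service appears to start in the "activating" state but is terminated after about 30 seconds without ever reaching "active". Perhaps some signal is missing between the docker container and systemd?

Update (see the bottom of the post): the systemd service status becomes failed (Result: timeout) when I remove the Restart=on-failure directive.

When I check the status of the etcd service after boot, I get this result: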

$ sudo systemctl status etcd
● etcd.service - etcd
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: activating (auto-restart) (Result: exit-code) since Wed 2021-08-18 20:13:30 UTC; 4s ago
  Process: 2971 ExecStart=/usr/bin/docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data --name etcd my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0 /usr/local/bin/etcd --data-dir=/etcd-data --name etcd0 --advertise-client-urls http://10.0.0.11:2379 --listen-client-urls http://0.0.0.0:2379 --initial-advertise-peer-urls http://10.0.0.11:2380 --listen-peer-urls http://0.0.0.0:2380 --initial-cluster etcd0=http://10.0.0.11:2380 (code=exited, status=125)
 Main PID: 2971 (code=exited, status=125)

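I am running this on an Amazon Linux 2 machine, with a user-data script that runs at boot. I have confirmed that docker.service and docker_ecr_login.service ran successfully.

Shortly after the machine boots, I can see that etcd is running:

$ sudo systemctl status etcd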
● etcd.service - etcd
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: activating (start) since Wed 2021-08-18 20:30:07 UTC; 1min 20s ago
 Main PID: 1573 (docker)
    Tasks: 9
   Memory: 24.3M
   CGroup: /system.slice/etcd.service
           └─1573 /usr/bin/docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data --name etcd my-aws-account.dkr.ecr.eu-north-1.amazonaws.com...

Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.690Z","logger":"raft","caller":"...rm 2"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.691Z","caller":"etcdserver/serve..."3.5"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.693Z","caller":"membership/clust..."3.5"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.693Z","caller":"etcdserver/server.go:2...
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.693Z","caller":"api/capability.g..."3.5"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.693Z","caller":"etcdserver/serve..."3.5"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.693Z","caller":"embed/serve.go:9...ests"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.695Z","caller":"etcdmain/main.go...emon"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.695Z","caller":"etcdmain/main.go...emon"}
Aug 18 20:30:17 ip-10-0-0-11.eu-north-1.compute.internal docker[1573]: {"level":"info","ts":"2021-08-18T20:30:17.702Z","caller":"embed/serve.go:1...2379"}
Hint: Some lines were ellipsized, use -l to show in full.

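I get the same behavior whether etcd listens on the node IP (10.0.0.11) or on 127.0.0.1.

I can run etcd locally, started from the command line (where it is not terminated after 30 seconds), using: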

sudo docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data --name etcd-local \
my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0 \
/usr/local/bin/etcd --data-dir=/etcd-data \
--name etcd0 \
--advertise-client-urls http://127.0.0.1:2379 \
--listen-client-urls http://0.0.0.0:2379 \
--initial-advertise-peer-urls http://127.0.0.1:2380 \
--listen-peer-urls http://0.0.0.0:2380 \
--initial-cluster etcd0=http://127.0.0.1:2380

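The arguments to etcd are similar to those in Running a single node etcd in the etcd 3.5 documentation.

This is the relevant part of the startup script that starts etcd: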

sudo docker volume create --name etcd-data

cat <<EOF | sudo tee /etc/systemd/system/etcd.service
[Unit]
Description=etcd
After=docker_ecr_login.service

[Service]
Type=notify
ExecStart=/usr/bin/docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data \
 --name etcd my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0 \
 /usr/local/bin/etcd --data-dir=/etcd-data \
 --name etcd0 \
 --advertise-client-urls http://10.0.0.11:2379 \
 --listen-client-urls http://0.0.0.0:2379 \
 --initial-advertise-peer-urls http://10.0.0.11:2380 \
 --listen-peer-urls http://0.0.0.0:2380 \
 --initial-cluster etcd0=http://10.0.0.11:2380
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl enable etcd
sudo systemctl start etcd

When I list all containers on the machine, I can see that it has been running:

sudo docker ps -a
CONTAINER ID   IMAGE                                                       COMMAND                  CREATED          STATUS                      PORTS                          NAMES
a744aed0beb1   my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0   "/usr/local/bin/etcd…"   25 minutes ago   Exited (0) 24 minutes ago                          etcd

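But I suspect that it cannot be restarted because the container name already exists.

Why is the etcd container terminated after about 30 seconds when started from systemd? It looks like it starts successfully, but systemd only ever shows it as "activating", never "active", and it appears to be terminated after about 30 seconds. Is some signal missing from the etcd docker container to systemd? If so, how can I get that signal right?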
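
Two hedged observations on the unit above, offered as a sketch rather than a diagnosis. First, with Type=notify systemd waits for a readiness message on its notification socket; the assumption here is that the docker CLI does not relay that message from the container, in which case Type=simple avoids the wait. Second, a stopped container named etcd blocks the next docker run --name etcd, which would be consistent with status=125; removing the leftover container before each start sidesteps that. The function name print_etcd_unit is hypothetical, and the image and etcd arguments are copied from the question:

```shell
# Sketch under stated assumptions; print_etcd_unit is a hypothetical helper.
# It prints a variant of the unit from the question with two changes:
#   1. Type=simple, so systemd does not wait for a readiness notification
#      (assumption: the docker CLI does not relay one from the container).
#   2. ExecStartPre removes a leftover container so --name etcd is free again;
#      the leading "-" tells systemd to ignore a failure of that command.
print_etcd_unit() {
    node_ip="$1"
    # Unquoted here-document: "\\" is emitted as a single "\" so the unit
    # file keeps its line continuations, and ${node_ip} is expanded.
    cat <<EOF
[Unit]
Description=etcd
After=docker_ecr_login.service

[Service]
Type=simple
ExecStartPre=-/usr/bin/docker rm -f etcd
ExecStart=/usr/bin/docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data \\
 --name etcd my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0 \\
 /usr/local/bin/etcd --data-dir=/etcd-data \\
 --name etcd0 \\
 --advertise-client-urls http://${node_ip}:2379 \\
 --listen-client-urls http://0.0.0.0:2379 \\
 --initial-advertise-peer-urls http://${node_ip}:2380 \\
 --listen-peer-urls http://0.0.0.0:2380 \\
 --initial-cluster etcd0=http://${node_ip}:2380
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}

# Usage:
#   print_etcd_unit 10.0.0.11 | sudo tee /etc/systemd/system/etcd.service
#   sudo systemctl daemon-reload
#   sudo systemctl restart etcd
```

With Type=simple systemd treats the service as started as soon as the docker process is running, so it no longer tracks whether etcd itself has become ready.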


Update:

After removing the Restart=on-failure directive from the service unit file, I now get the status failed (Result: timeout):

$ sudo systemctl status etcd
● etcd.service - etcd
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: failed (Result: timeout) since Wed 2021-08-18 21:35:54 UTC; 5min ago
  Process: 1567 ExecStart=/usr/bin/docker run -p 2380:2380 -p 2379:2379 --volume=etcd-data:/etcd-data --name etcd my-aws-account.dkr.ecr.eu-north-1.amazonaws.com/etcd:v3.5.0 /usr/local/bin/etcd --data-dir=/etcd-data --name etcd0 --advertise-client-urls http://127.0.0.1:2379 --listen-client-urls http://0.0.0.0:2379 --initial-advertise-peer-urls http://127.0.0.1:2380 --listen-peer-urls http://0.0.0.0:2380 --initial-cluster etcd0=http://127.0.0.1:2380 (code=exited, status=0/SUCCESS)
 Main PID: 1567 (code=exited, status=0/SUCCESS)

Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.332Z","caller":"osutil/interrupt...ated"}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.333Z","caller":"embed/etcd.go:36...379"]}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: WARNING: 2021/08/18 21:35:54 [core] grpc: addrConn.createTransport failed ...ing...
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.335Z","caller":"etcdserver/serve...6a6c"}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.337Z","caller":"embed/etcd.go:56...2380"}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.338Z","caller":"embed/etcd.go:56...2380"}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal docker[1567]: {"level":"info","ts":"2021-08-18T21:35:54.339Z","caller":"embed/etcd.go:36...379"]}
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal systemd[1]: Failed to start etcd.
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal systemd[1]: Unit etcd.service entered failed state.
Aug 18 21:35:54 ip-10-0-0-11.eu-north-1.compute.internal systemd[1]: etcd.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
linux docker systemd amazon-linux-2 etcd
  • 1 answer
  • 694 Views
Jonas
Asked: 2021-08-08 15:38:22 +0800 CST

How do I start containerd as a service after yum install?

  • 2

I have installed containerd on Amazon Linux 2 using the suggested commands:

sudo amazon-linux-extras enable docker
sudo yum install -y containerd

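I added this to the EC2 user-data script so that it runs when the instance starts.

But how should I start containerd (a container runtime, similar to docker) as a service? Installing it through yum does not seem to include a systemd service file. The binary is located at /usr/bin/containerd. Should I use echo in the boot script to generate a systemd service file, or what is good practice?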
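
If the package really ships no unit (worth checking first, for example with rpm -ql containerd | grep service), generating one from the boot script with a here-document is a common pattern. A minimal sketch, assuming containerd signals readiness to systemd so that Type=notify applies; the function name write_containerd_unit is hypothetical, and the directives are modeled loosely on the unit file in the upstream containerd repository rather than copied from this package:

```shell
# Sketch: write a minimal containerd.service into the given directory.
# write_containerd_unit is a hypothetical helper; the unit contents are an
# assumption modeled on upstream, with the binary path from the question.
write_containerd_unit() {
    unit_dir="$1"   # normally /etc/systemd/system
    # Quoted delimiter: the unit text is written verbatim, with no expansion.
    cat > "$unit_dir/containerd.service" <<'EOF'
[Unit]
Description=containerd container runtime
After=network.target

[Service]
Type=notify
ExecStart=/usr/bin/containerd
Restart=always
RestartSec=5
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
#   write_containerd_unit /etc/systemd/system
#   systemctl daemon-reload
#   systemctl enable containerd
#   systemctl start containerd
```

A here-document keeps the unit readable in the script and avoids the quoting problems of a long echo.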

linux amazon-ec2 systemd containerd amazon-linux-2
  • 1 answer
  • 1548 Views
Jonas
Asked: 2010-04-10 06:37:50 +0800 CST

Which databases are easy to maintain and administer in a cluster?

  • 2

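I am looking for a database (DBMS) that is easy to scale. I want high availability, so I need a multi-master cluster where data is replicated to two or more physical machines. I also want to be able to start with one node (no replication) and then scale out to more nodes as needed, without reinstallation or downtime.

I want a DBMS that is easy to maintain and administer. Adding nodes, removing nodes, taking live backups, and monitoring resource usage should all be easy.

It does not have to be a relational database system, so NoSQL is fine. I want a free version so that I can test it on a small scale and compare it with the alternatives.

What are my options?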

database cluster high-availability maintenance scalability
  • 1 answer
  • 441 Views
Jonas
Asked: 2010-03-16 03:19:26 +0800 CST

How can I avoid the master browser error, MRxSmb Event ID 8003?

  • 2

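I have a domain with Windows SBS 2003 as the domain controller. Master browser errors, MRxSmb Event ID 8003, appear in the log very often. How can I avoid this? What am I doing wrong?

I know how to fix it: stop the Computer Browser service on the client. But I don't know how to prevent it, because the problem reappears every time I add a new client and forget to stop its Computer Browser service.

Error message: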

The master browser has received a server announcement from the computer
[computer] that believes that it is the master browser for the domain on
transport NetBT_Tcpip_{#######-####-####-#. The master browser is stopping 
or an election is being forced.

Is there a server configuration that avoids this problem?

active-directory windows-sbs-2003 windows-event-log
  • 1 answer
  • 16771 Views
Jonas
Asked: 2010-02-04 14:18:11 +0800 CST

Broken RAID1 mirror

  • 2

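I have a small server running Windows Small Business Server 2003. I use RAID1 with two hard drives on a HighPoint Rocket RAID 1640 RAID card.

This week the server raised an alarm, and on reboot I got the error message Broken Mirroring (page 30 of the user manual). I had several options (see the manual). First I tried Continue, but the server rebooted during boot. Next I chose Power Off and replaced the oldest hard drive with a new one. When I booted again I chose Rebuild and selected the new hard drive as the drive to rebuild onto. The rebuild started and the progress bar showed 0%, but after a few seconds I got a message that the copy had failed, and then the server booted and started Windows Server. It now works fine.

But I think I am now running on a single hard drive with no mirroring. I have not touched the server since then (two days ago). What should I do now? I have no experience with this situation.

Can anyone offer some guidance?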

hard-drive mirroring raid1
  • 2 answers
  • 410 Views

© 2023 AskOverflow.DEV All Rights Reserved