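I am running an Ubuntu 16.04 container under Proxmox 5.2-11. After applying the latest round of patches [1], I can no longer log in at the console or via ssh.
I mounted the container's root FS on the hypervisor and added pts/0 to /etc/security/access.conf (we run pam_access), which allowed root to log in at the console again. We already have root : lxc/tty0 lxc/tty1 lxc/tty2 in access.conf, which I thought was sufficient, so it is puzzling why pts/0 is needed now.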
I noticed that ssh was not running, so I tried to start it manually (/usr/sbin/sshd -DDD -f /etc/ssh/sshd_config) and got this error:
Missing privilege separation directory: /var/run/sshd
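I created the directory manually, started ssh, and was finally able to log in, but after a reboot the problem is back: the directory is not created. The only useful part of the journalctl output is below [2], and the only interesting thing in it is the "Operation not permitted" messages, which give no further detail.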
I am not very familiar with 16.04, so I would like to know how to find out more about the problem. I have no /var/log/syslog or /var/log/messages, only kern.log, so I am somewhat lost.
[1]
systemd-sysv 229-4ubuntu21.9
libpam-systemd 229-4ubuntu21.9
libsystemd0 229-4ubuntu21.9
systemd 229-4ubuntu21.9
udev 229-4ubuntu21.9
libudev1 229-4ubuntu21.9
iproute2 4.3.0-1ubuntu3.16.04.4
libsasl2-modules-db 2.1.26.dfsg1-14ubuntu0.1
libsasl2-2 2.1.26.dfsg1-14ubuntu0.1
ldap-utils 2.4.42dfsg-2ubuntu3.4
libldap-2.4-2 2.4.42dfsg-2ubuntu3.4
libsasl2-modules 2.1.26.dfsg1-14ubuntu0.1
libgs9-common 9.25dfsg1-0ubuntu0.16.04.3
ghostscript 9.25dfsg1-0ubuntu0.16.04.3
libgs9 9.25dfsg1-0ubuntu0.16.04.3
[2]
Nov 27 10:13:48 host16 systemd[1]: Starting OpenBSD Secure Shell server...
Nov 27 10:13:48 host16 sshd[474]: Missing privilege separation directory: /var/run/sshd
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Control process exited, code=exited status=255
Nov 27 10:13:48 host16 systemd[1]: Failed to start OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Unit entered failed state.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Failed with result 'exit-code'.
Nov 27 10:13:48 host16 mysqld_safe[495]: Starting mysqld daemon with databases from /var/lib/mysql/mysql
Nov 27 10:13:48 host16 mysqld[500]: 181127 10:13:48 [Note] /usr/sbin/mysqld (mysqld 10.0.36-MariaDB-0ubuntu0.16.04.1) starting as process 499 ...
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Service hold-off time over, scheduling restart.
Nov 27 10:13:48 host16 systemd[1]: Stopped OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: Failed to reset devices.list on /system.slice/ssh.service: Operation not permitted
Nov 27 10:13:48 host16 systemd[1]: Starting OpenBSD Secure Shell server...
Nov 27 10:13:48 host16 sshd[502]: Missing privilege separation directory: /var/run/sshd
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Control process exited, code=exited status=255
Nov 27 10:13:48 host16 systemd[1]: Failed to start OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Unit entered failed state.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Failed with result 'exit-code'.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Service hold-off time over, scheduling restart.
Nov 27 10:13:48 host16 systemd[1]: Stopped OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: Failed to reset devices.list on /system.slice/ssh.service: Operation not permitted
Nov 27 10:13:48 host16 systemd[1]: Starting OpenBSD Secure Shell server...
Nov 27 10:13:48 host16 sshd[503]: Missing privilege separation directory: /var/run/sshd
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Control process exited, code=exited status=255
Nov 27 10:13:48 host16 systemd[1]: Failed to start OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Unit entered failed state.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Failed with result 'exit-code'.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Service hold-off time over, scheduling restart.
Nov 27 10:13:48 host16 systemd[1]: Stopped OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: Failed to reset devices.list on /system.slice/ssh.service: Operation not permitted
Nov 27 10:13:48 host16 systemd[1]: Starting OpenBSD Secure Shell server...
Nov 27 10:13:48 host16 sshd[504]: Missing privilege separation directory: /var/run/sshd
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Control process exited, code=exited status=255
Nov 27 10:13:48 host16 systemd[1]: Failed to start OpenBSD Secure Shell server.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Unit entered failed state.
Nov 27 10:13:48 host16 systemd[1]: ssh.service: Failed with result 'exit-code'.
Nov 27 10:13:49 host16 systemd[1]: ssh.service: Service hold-off time over, scheduling restart.
Nov 27 10:13:49 host16 systemd[1]: Stopped OpenBSD Secure Shell server.
Nov 27 10:13:49 host16 systemd[1]: ssh.service: Start request repeated too quickly.
Nov 27 10:13:49 host16 systemd[1]: Failed to start OpenBSD Secure Shell server.
Nov 27 10:13:49 host16 systemd[1]: ssh.service: Unit entered failed state.
Nov 27 10:13:49 host16 systemd[1]: ssh.service: Failed with result 'start-limit-hit'.
Nov 27 10:13:49 host16 systemd[1]: Started /etc/rc.local Compatibility.
Nov 27 10:13:49 host16 systemd[1]: Failed to reset devices.list on /system.slice/plymouth-quit.service: Operation not permitted
Nov 27 10:13:49 host16 systemd[1]: Starting Terminate Plymouth Boot Screen...
Nov 27 10:13:49 host16 systemd[1]: Failed to reset devices.list on /system.slice/plymouth-quit-wait.service: Operation not permitted
Nov 27 10:13:49 host16 systemd[1]: Starting Hold until boot process finishes up...
Nov 27 10:13:49 host16 systemd[1]: Failed to reset devices.list on /system.slice/rc-local.service: Operation not permitted
Nov 27 10:13:49 host16 systemd[1]: Started Hold until boot process finishes up.
Nov 27 10:13:49 host16 systemd[1]: Started Container Getty on /dev/pts/1.
Nov 27 10:13:49 host16 systemd[1]: Started Container Getty on /dev/pts/0.
Nov 27 10:13:49 host16 systemd[1]: Failed to reset devices.list on /system.slice/console-getty.service: Operation not permitted
Nov 27 10:13:49 host16 systemd[1]: Started Console Getty.
Nov 27 10:13:49 host16 systemd[1]: Reached target Login Prompts.
Nov 27 10:13:49 host16 systemd[1]: Started Terminate Plymouth Boot Screen.
Nov 27 10:13:52 host16 nslcd[338]: accepting connections
Nov 27 10:13:52 host16 nslcd[275]: ...done.
Nov 27 10:13:52 host16 systemd[1]: Started LSB: LDAP connection daemon.
Nov 27 10:13:52 host16 systemd[1]: Failed to reset devices.list on /system.slice/cron.service: Operation not permitted
Nov 27 10:13:52 host16 systemd[1]: Started Regular background program processing daemon.
Nov 27 10:13:52 host16 systemd[1]: Failed to reset devices.list on /system.slice/atd.service: Operation not permitted
Added systemd-tmpfiles --create output:
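One mistake you made was trying to start sshd manually. If you start sshd the official way instead, it should work. The service command knows the correct way to start a service on your distribution, so starting ssh through it should work. With sysv init scripts that is all you need to do.
The reason the directory is missing is that /var/run is a symlink to /run, and /run is a tmpfs mount point. This means /var/run starts out empty on every boot. When you use the service command, the /etc/init.d/ssh script is used to start sshd, but before doing so the script creates /var/run/sshd if it does not exist.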
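The "create the directory if it is missing" step described above can be sketched as follows. This is only an illustration of the idea, not the actual contents of /etc/init.d/ssh; the function name ensure_privsep_dir is made up for this example, and the mode 0755 is an assumption taken from the tmpfiles line quoted elsewhere in this thread.

```shell
# Hypothetical helper: create the sshd privilege separation directory
# only when it does not already exist. Not copied from any init script.
ensure_privsep_dir() {
    dir="${1:-/var/run/sshd}"      # default is the path from the error message
    if [ ! -d "$dir" ]; then
        mkdir -p "$dir" || return 1
        chmod 0755 "$dir"          # assumed mode, matching the tmpfiles entry
    fi
}

# Usage on a real system (as root):
#   ensure_privsep_dir && /usr/sbin/sshd -D
```

Passing a scratch path as the first argument lets you try the function without touching /var/run.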
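With systemd things work a bit differently. There will be a file called /usr/lib/tmpfiles.d/sshd.conf whose content declares that directory. During boot this should cause the /var/run/sshd directory to be created. You need to verify that the file exists and has the correct content. If the /var/run/sshd directory is still missing, you can check whether it gets created when you run systemd-tmpfiles --create manually.
So /run (and /var/run, which is symlinked to it) is recreated on every reboot. Except that systemd-tmpfiles fails to do so for some entries, including (/var)/run/sshd.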
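Apparently this is fixed by an OpenVZ kernel upgrade. But to actually fix it now, you need to edit /usr/lib/tmpfiles.d/sshd.conf and remove /var from the line d /var/run/sshd 0755 root root so that it reads: d /run/sshd 0755 root root
That's it..!
When openssh-server is upgraded, let's hope they fix this bug (or is it really a bug in systemd? or in OpenVZ??), otherwise you may run into the same problem again.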
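The one-line edit above could be scripted roughly like this. It is a sketch under assumptions: the function name fix_sshd_tmpfiles is invented here, and the pattern only matches a line that starts exactly with "d /var/run/sshd ".

```shell
# Hypothetical helper: rewrite "d /var/run/sshd ..." to "d /run/sshd ..."
# in the tmpfiles.d config file given as the first argument.
fix_sshd_tmpfiles() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    # Only touch a line beginning with the exact entry quoted in the answer.
    sed 's#^d /var/run/sshd #d /run/sshd #' "$conf" > "$tmp" || return 1
    cat "$tmp" > "$conf"           # write back in place, keeping the file's mode
    rm -f "$tmp"
}

# Usage on a real system (as root):
#   fix_sshd_tmpfiles /usr/lib/tmpfiles.d/sshd.conf
#   systemd-tmpfiles --create
```

Since a package upgrade may replace the file under /usr/lib, one option is to copy it to /etc/tmpfiles.d/sshd.conf and edit the copy instead; according to tmpfiles.d(5), a file of the same name in /etc/tmpfiles.d takes precedence.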
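Apparently this is resolved when running OpenVZ kernel 2.6.32-042stab134.7 or newer. I find it odd that it cannot be fixed somehow in the systemd startup scripts. An ugly hack such as automatically creating /run/sshd/ after boot and then starting sshd would probably work.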
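One way such a workaround might look is a systemd drop-in for ssh.service. Note that this swaps the literal "mkdir after boot" hack for systemd's RuntimeDirectory= setting, which asks systemd to create the named directory under /run before the service's commands run. The drop-in file name rundir.conf and the function name write_sshd_dropin are made up for this sketch, and it has not been verified on the affected OpenVZ or Proxmox setups.

```shell
# Hypothetical helper: write a drop-in that makes systemd create /run/sshd
# for ssh.service. The target directory is passed in so it can be tried
# against a scratch directory first.
write_sshd_dropin() {
    dropin_dir="$1"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/rundir.conf" <<'EOF'
[Service]
RuntimeDirectory=sshd
RuntimeDirectoryMode=0755
EOF
}

# Usage on a real system (as root):
#   write_sshd_dropin /etc/systemd/system/ssh.service.d
#   systemctl daemon-reload
#   systemctl restart ssh.service
```

Be aware that systemd removes a RuntimeDirectory= directory again when the service stops, so this only helps sshd instances started through the unit.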
My output of systemd-tmpfiles --create:
The changelog for OpenVZ 2.6.32-042stab134.7 says this:
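As much trouble as I have had with systemd over the years, I have to admit that this problem stemmed from an Ansible synchronize directive.
For some reason, after this host was provisioned with our Ansible scripts, the / directory (as well as /etc, /opt and others) was left owned by the admin user rather than root. After running chown to correct that, /var/run/sshd is once again created at boot. I really appreciate all the input, but there is no bug here, at least in the sense that applying inappropriate ownership to the root directory leads to undefined system behaviour.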
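A quick way to spot this kind of ownership problem might look like the following. It is a minimal sketch: the function name report_non_root_owned is invented for the example, and it assumes GNU stat (stat -c), which is what Ubuntu ships.

```shell
# Hypothetical helper: print every given path whose owner is not uid 0.
report_non_root_owned() {
    for path in "$@"; do
        uid="$(stat -c %u "$path")" || continue   # skip paths stat cannot read
        if [ "$uid" -ne 0 ]; then
            echo "$path: owned by uid $uid"
        fi
    done
}

# Usage (the directory list is only an example):
#   report_non_root_owned / /etc /opt /var /run
```

Any path it reports could then be corrected with chown, for example chown root:root / for the root directory.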
I had this behaviour as well. My problem was that ssh.socket had somehow been enabled. With ssh.socket disabled, ssh.service does start normally at boot.
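Switching from socket activation back to the plain service could be done along these lines. This is a sketch: the function name switch_to_ssh_service and the SYSTEMCTL override are made up here, the latter only so the commands can be dry-run with echo before touching a real system.

```shell
# SYSTEMCTL can be overridden (e.g. SYSTEMCTL=echo) to preview the commands.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Hypothetical helper: stop using ssh.socket and run ssh.service directly.
switch_to_ssh_service() {
    $SYSTEMCTL disable --now ssh.socket || return 1   # disable and stop the socket unit
    $SYSTEMCTL enable ssh.service || return 1         # start ssh.service at boot
    $SYSTEMCTL start ssh.service
}

# Usage on a real system (as root):
#   systemctl is-enabled ssh.socket    # check whether the socket is enabled at all
#   switch_to_ssh_service
```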
One approach I have seen is to simply create the directory in your Dockerfile.