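I have a GCE instance that has been running for several years. During the night, the instance was restarted, with the following log entries: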
2022-02-13 04:46:36.370 CET compute.instances.hostError Instance terminated by Compute Engine.
2022-02-13 04:47:08.279 CET compute.instances.automaticRestart Instance automatically restarted by Compute Engine.
However, the instance did not come back up.
I can connect to the serial console, which shows this:
serialport: Connected to ***.europe-west1-b.*** port 1 (
[ TIME ] Timed out waiting for device ***
[DEPEND] Dependency failed for File… ***.
[DEPEND] Dependency failed for /data.
[DEPEND] Dependency failed for Local File Systems.
[ OK ] Stopped Dispatch Password …ts to Console Directory Watch.
[ OK ] Stopped Forward Password R…uests to Wall Directory Watch.
[ OK ] Reached target Timers.
Starting Raise network interfaces...
[ OK ] Closed Syslog Socket.
[ OK ] Reached target Login Prompts.
[ OK ] Reached target Paths.
[ OK ] Reached target Sockets.
[ OK ] Started Emergency Shell.
[ OK ] Reached target Emergency Mode.
Starting Create Volatile Files and Directories...
[ OK ] Finished Create Volatile Files and Directories.
Starting Network Time Synchronization...
Starting Update UTMP about System Boot/Shutdown...
[ OK ] Finished Update UTMP about System Boot/Shutdown.
Starting Update UTMP about System Runlevel Changes...
[ OK ] Finished Update UTMP about System Runlevel Changes.
[ OK ] Started Network Time Synchronization.
[ OK ] Reached target System Time Set.
[ OK ] Reached target System Time Synchronized.
Stopping Network Time Synchronization...
[ OK ] Stopped Network Time Synchronization.
Starting Network Time Synchronization...
[ OK ] Started Network Time Synchronization.
[ OK ] Finished Raise network interfaces.
[ OK ] Reached target Network.
[ OK ] Reached target Network is Online.
You are in emergency mode. After logging in, type "journalctl -xb" to view
system logs, "systemctl reboot" to r
Cannot open access to console, the root account is locked.
See sulogin(8) man page for more details.
Press Enter to continue.
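It looks like one of the disks could not be attached, but what should I do now? The disk itself appears to be available as usual in Compute Engine.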
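I am afraid there is nothing you can do with the affected VM itself.
In the Host Events documentation or the FAQ, you can find the following information:
Although a VM instance lives in the "cloud", there is still a physical machine running your workload. Unfortunately, that machine suffered a hardware or software failure, and there is nothing you can do about it.
GCP has a feature called live migration that is meant to prevent this kind of situation.
Possible workaround
As you mentioned, the disks are persistent and still visible in GCP, so you can try re-attaching them to another VM. A how-to guide is available in the Creating and attaching a disk documentation.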
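As a rough sketch of that workaround, the snippet below only builds and prints the two gcloud commands that would move a disk from one VM to another; it does not run them. The disk and instance names (data-disk, broken-vm, rescue-vm) are hypothetical placeholders, and the zone is taken from the serial console output in the question. It assumes a zonal persistent disk, so both VMs have to be in the same zone as the disk.

```python
import shlex


def move_disk_commands(disk, zone, old_vm, rescue_vm):
    """Build the gcloud commands to detach a disk from one VM and attach it to another.

    All names passed in are placeholders chosen by the caller; nothing is executed here.
    """
    detach = [
        "gcloud", "compute", "instances", "detach-disk", old_vm,
        f"--disk={disk}", f"--zone={zone}",
    ]
    attach = [
        "gcloud", "compute", "instances", "attach-disk", rescue_vm,
        f"--disk={disk}", f"--zone={zone}",
    ]
    return [detach, attach]


# Hypothetical names; replace them with your own disk and instances.
commands = move_disk_commands("data-disk", "europe-west1-b", "broken-vm", "rescue-vm")
for cmd in commands:
    # Print the commands so they can be reviewed before running them by hand.
    print(shlex.join(cmd))
```

Printing the commands first, rather than running them from the script, leaves room to double-check the names before touching a disk that holds data.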
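I finally found the odd cause of this error: /etc/fstab referenced a device path, but no such device existed at that path. I solved the problem by attaching /dev/sdb, but I guess that is not the best solution. I would still like to know how a device can suddenly disappear completely and end up killing the machine.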
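For anyone hitting the same thing, here is a minimal sketch of how one might check whether every device path listed in an fstab file actually exists. The function name missing_fstab_devices is made up for this example, and it only looks at entries written as plain paths; entries given as UUID= or LABEL= and pseudo file systems such as tmpfs are skipped.

```python
import os


def missing_fstab_devices(fstab_text, exists=os.path.exists):
    """Return (device, mount_point) pairs whose device path does not exist.

    `exists` can be replaced in tests; by default it checks the real file system.
    """
    missing = []
    for raw_line in fstab_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        # Only plain paths can be checked this way; UUID=/LABEL= specs are skipped.
        if device.startswith("/") and not exists(device):
            missing.append((device, mount_point))
    return missing
```

To use it, read /etc/fstab from a rescue VM or from the emergency shell, pass the text to the function, and look at what comes back. Device names such as /dev/sdb are not guaranteed to stay the same across reboots, so it may be safer to reference the disk by UUID in /etc/fstab. Adding the nofail mount option should also let the system finish booting when a data disk is missing, instead of dropping into emergency mode.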