AskOverflow.Dev

Questions [mcelog] (server)

ItsJustMe
Asked: 2017-02-20 16:00:51 +0800 CST

Server kernel panics after boot, not sure how to interpret the logs

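We just received a brand-new dual-CPU server, and it keeps crashing with a kernel panic shortly after boot. This happens even during setup, while the OS is idle. I was able to install an OS and enable mcelog to try to understand what is happening, but I'm not sure how to read the output. From what I've read online, this could be a defective DIMM on one of the sockets (socket 1), but I ran memtest several times and it found no errors. Could this be a software problem? I have tried 2 operating systems and the same thing happens on both, although it is more frequent on Debian/Proxmox than on CentOS.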

Server specs:

Dual Intel 8-core Xeon E5-2620v4

2 x DIMM 32GB DDR4 2400MHz RECC

MB Supermicro X10DRL-i

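It is not the CPU temperature: during memtest and OS installation the temperature never exceeded 35ºC. I was also able to run a few short benchmarks on the CPUs before a crash, and the temperatures were normal.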

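How can I find out what is going on here? I have access to the server for a few minutes before each crash, and I have downloaded the vmcore dump, but I'm not sure what to do with it.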
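One low-effort starting point, before opening the vmcore, might be to tally the machine check lines in the kernel log by socket, bank, and status value, to see whether the errors cluster on one package. The sketch below is illustrative only: `summarize_mce` and its regular expressions are hypothetical helpers written for this log format, and the sample lines are copied from the log in this question.

```python
import re
from collections import Counter

# Four lines copied from the kernel log in this question (two MCE records).
SAMPLE = """\
[   71.886894] mce: [Hardware Error]: CPU 24: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.887184] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 11 microcode b00001d
[   71.889392] mce: [Hardware Error]: CPU 30: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.889657] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 1d microcode b00001d
"""

# Hypothetical patterns matching the two line shapes seen in this log.
EXCEPTION_RE = re.compile(r"CPU (\d+): Machine Check Exception: \S+ Bank (\d+): ([0-9a-f]+)")
SOCKET_RE = re.compile(r"SOCKET (\d+) APIC ([0-9a-f]+)")


def summarize_mce(log_text):
    """Count MCE records per (socket, bank, status) tuple."""
    counts = Counter()
    pending = None  # (bank, status) from the most recent exception line
    for line in log_text.splitlines():
        match = EXCEPTION_RE.search(line)
        if match:
            pending = (int(match.group(2)), match.group(3))
            continue
        match = SOCKET_RE.search(line)
        if match and pending is not None:
            bank, status = pending
            counts[(int(match.group(1)), bank, status)] += 1
            pending = None
    return counts


summary = summarize_mce(SAMPLE)
for (socket, bank, status), n in sorted(summary.items()):
    print(f"socket {socket} bank {bank} status {status}: {n} event(s)")
```

If every event lands in the same (socket, bank, status) bucket, that points at a single component rather than random corruption, though it does not by itself say which part is at fault.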

Here is the mce log from a boot that crashed after about 50 seconds:

[   56.367615] mce: [Hardware Error]: Machine check events logged
[   70.420914] mce: [Hardware Error]: Machine check events logged
[   71.886789] Disabling lock debugging due to kernel taint
[   71.886894] mce: [Hardware Error]: CPU 24: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.887009] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.887122] mce: [Hardware Error]: TSC 206cc7cd362 
[   71.887184] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 11 microcode b00001d
[   71.887289] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   71.889392] mce: [Hardware Error]: CPU 30: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.889489] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.889595] mce: [Hardware Error]: TSC 206cc7cd11d 
[   71.889657] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 1d microcode b00001d
[   71.889760] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   71.891804] mce: [Hardware Error]: CPU 14: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.891901] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.892007] mce: [Hardware Error]: TSC 206cc7cd10e 
[   71.892068] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 1c microcode b00001d
[   71.892171] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   71.894217] mce: [Hardware Error]: CPU 13: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.894314] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.894420] mce: [Hardware Error]: TSC 206cc7cd23c 
[   71.894480] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 1a microcode b00001d
[   71.894585] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   71.896634] mce: [Hardware Error]: CPU 29: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.896730] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.896835] mce: [Hardware Error]: TSC 206cc7cd194 
[   71.896896] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 1b microcode b00001d
[   71.897000] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   71.899053] mce: [Hardware Error]: CPU 28: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.899150] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.899256] mce: [Hardware Error]: TSC 206cc7cd719 
[   71.899335] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 19 microcode b00001d
[   71.899438] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   71.901485] mce: [Hardware Error]: CPU 12: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.901582] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.901687] mce: [Hardware Error]: TSC 206cc7cd720 
[   71.901748] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 18 microcode b00001d
[   71.901851] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   71.903934] mce: [Hardware Error]: CPU 10: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.904031] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.904136] mce: [Hardware Error]: TSC 206cc7cd851 
[   71.904197] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 14 microcode b00001d
[   71.904300] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   71.906306] mce: [Hardware Error]: CPU 26: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.906403] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.906508] mce: [Hardware Error]: TSC 206cc7cd863 
[   71.906569] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 15 microcode b00001d
[   71.909482] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   71.914367] mce: [Hardware Error]: CPU 11: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.917304] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.920287] mce: [Hardware Error]: TSC 206cc7cd515 
[   71.923159] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 16 microcode b00001d
[   71.926031] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[   71.930820] mce: [Hardware Error]: CPU 27: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.933685] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.936557] mce: [Hardware Error]: TSC 206cc7cd449 
[   71.939384] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 17 microcode b00001d
[   71.944180] mce: [Hardware Error]: CPU 9: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.947059] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.949956] mce: [Hardware Error]: TSC 206cc7cd766 
[   71.952786] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 12 microcode b00001d
[   71.957580] mce: [Hardware Error]: CPU 25: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.960480] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.963366] mce: [Hardware Error]: TSC 206cc7cd751 
[   71.966210] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 13 microcode b00001d
[   71.971031] mce: [Hardware Error]: CPU 31: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.973919] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.976817] mce: [Hardware Error]: TSC 206cc7cd7f7 
[   71.979690] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 1f microcode b00001d
[   71.984474] mce: [Hardware Error]: CPU 15: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   71.987371] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   71.990290] mce: [Hardware Error]: TSC 206cc7cd803 
[   71.993151] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 1e microcode b00001d
[   71.997992] mce: [Hardware Error]: CPU 8: Machine Check Exception: 5 Bank 20: fa00004000020e0f
[   72.000918] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff8138fb97> {intel_idle+0xd7/0x160}
[   72.003828] mce: [Hardware Error]: TSC 206cc7cd374 
[   72.006692] mce: [Hardware Error]: PROCESSOR 0:406f1 TIME 1487438906 SOCKET 1 APIC 10 microcode b00001d
[   72.011533] mce: [Hardware Error]: Machine check: Processor context corrupt
[   72.014436] Kernel panic - not syncing: Fatal machine check
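The status value repeated in every record can be split into its flag bits by hand. The following is a minimal sketch, assuming the architectural IA32_MCi_STATUS layout from the Intel SDM (bit 63 VAL down to bit 57 PCC, model-specific code in bits 31:16, MCA error code in bits 15:0); `decode_mci_status` is a hypothetical helper, and the layout should be checked against the SDM or `mcelog --ascii` output for this CPU before drawing conclusions.

```python
# Assumed architectural flag positions in IA32_MCi_STATUS (verify against the Intel SDM).
FLAG_BITS = {
    "VAL": 63,    # register contents are valid
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error
    "EN": 60,     # error reporting was enabled
    "MISCV": 59,  # MCi_MISC register is valid
    "ADDRV": 58,  # MCi_ADDR register is valid
    "PCC": 57,    # processor context corrupt
}


def decode_mci_status(status):
    """Split a 64-bit status value into flags and the two 16-bit code fields."""
    flags = {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


decoded = decode_mci_status(0xFA00004000020E0F)
set_flags = [name for name, is_set in decoded["flags"].items() if is_set]
print("flags set:", ", ".join(set_flags))
print("model-specific code:", hex(decoded["model_specific_code"]))
print("MCA error code:", hex(decoded["mca_error_code"]))
```

For this value the PCC flag is set, which appears consistent with the "Processor context corrupt" line above, and ADDRV is clear, so the bank did not record a valid address.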
central-processing-unit memory kernel-panic mcelog
  • 1 answer
  • 1668 Views
