I'm trying to install the latest CentOS 7.5 x64 on an ASUS Eee Box EB1037 small form factor PC. It is an Intel Celeron J1900 (Bay Trail) with an onboard NVIDIA GeForce GT 820M. The install media will lock up unless Nouveau is disabled first. This is fine. But after installation and subsequent reboots, the EFI partition appears to be corrupted.
This question is NOT about how to troubleshoot boot issues, but about understanding why exactly this boot failure is corrupting the EFI partition and causing GRUB to fail.
Here is the installation procedure:
- CentOS 7.5 burned to USB
- Boot to USB installer (GRUB bootloader)
- Edit the GRUB entry to add "nouveau.modeset=0"
- Set time zone
- Software selection: Minimal Install (no changes)
- Network and host name: set host name
- Set manual partitioning to "Standard Partition" (no LVM) with automatic partition layout
- Installation proceeds
- Set root password and user account (as administrator)
- Installation completes
- Reboot
- Hard disk GRUB appears
I did not change any of the GRUB settings (such as disabling Nouveau). See the default settings here:
I tried to boot CentOS with these defaults and it locked up as expected (since I had not disabled Nouveau). All I could see was a black screen. The monitor was on, but the keyboard indicators and backlight, as well as the optical mouse LED, were all off. The keyboard did not respond to ctrl-alt-del.
Performed a hard reset by holding the power button. System booted up to the hard disk GRUB menu a second time with no problems. Tried to boot using defaults again and it locked the same as before (as expected, as I still haven't disabled Nouveau).
Note that I still have the CentOS USB installer inserted. Upon this THIRD reboot (after the previous two post-install reboots), the system takes me to the USB GRUB instead of the hard disk one. Odd. Popped out the CentOS USB and rebooted with ctrl-alt-del.
Now I see a message from GRUB flash briefly on the screen, indicating that it cannot read the EFI partition:
After a moment it disappears and I see this:
The system is now no longer bootable to the EFI partition.
Why is this happening? How is the EFI partition corrupting?
Additional Information
Secure Boot is Enabled in the BIOS and cannot be disabled but is set to "Other OS".
There is only ONE SATA port inside the unit and it is populated by a Samsung 850 Pro 500GB SSD. Despite being set to AHCI and visible as SATA1 and the only disk connected to the system, CentOS identifies it as sdb instead of sda, possibly because it thinks that the USB install media is sda. It does not present the USB drive as a second disk during installation, however, and displays the Samsung SSD as the only visible drive.
GRUB sees the attached CentOS install USB media as (hd0) and the onboard SATA as (hd1) when both are inserted. The onboard SATA is seen as (hd0) when the USB media is removed. Interestingly, the onboard SATA is seen as sd by the CentOS installer but hd by GRUB.
Highlights
- System has an Nvidia graphics processor (Optimus?)
- Secure Boot is ENABLED (cannot be disabled)
- BIOS presents USB disks as attached SATA disks? (sda during installation, hd0 in GRUB)
PLEASE NOTE
I can already get the system to boot by removing the USB stick after installation, setting nouveau.modeset=0 and updating GRUB afterwards at /boot/efi/EFI/centos/grub.cfg.
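For reference, the workaround amounts to roughly the following. This is only a minimal sketch: the helper name add_kernel_arg is made up for illustration, and editing /etc/default/grub followed by grub2-mkconfig is an assumed route (the usual CentOS 7 one), not necessarily the exact steps taken on this machine.

```shell
#!/bin/sh
# Hypothetical helper: append a kernel argument to the GRUB_CMDLINE_LINUX
# line of a GRUB defaults file, unless that argument is already present.
# Assumes GNU sed (sed -i) and an argument without "/" or "&" characters.
add_kernel_arg() {
    file=$1
    arg=$2
    # Do nothing if the argument is already on the command line.
    if grep '^GRUB_CMDLINE_LINUX=' "$file" | grep -qF -- "$arg"; then
        return 0
    fi
    sed -i "s/^GRUB_CMDLINE_LINUX=\"\(.*\)\"\$/GRUB_CMDLINE_LINUX=\"\1 $arg\"/" "$file"
}

# Assumed usage on the installed system, as root:
#   add_kernel_arg /etc/default/grub nouveau.modeset=0
#   grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
```

The helper is idempotent, so running it twice does not duplicate the argument.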
The question is to understand what is corrupting the EFI partition!
Photo of the system booted:
The name \EFI\BOOT\grubx64.efi tells me the system is not using the CentOS default UEFI boot path, but the fallback one. But the fallback boot path is \EFI\BOOT\bootx64.efi, which would be occupied by the SecureBoot shim. So it would seem the shim is loaded, but it is failing to perform the next step: the loading of the actual GRUB bootloader from the fallback directory.
My theory:
- \EFI\CentOS\shimx64.efi is the SecureBoot shim bootloader, and \EFI\CentOS\grubx64.efi is the actual GRUB bootloader. The path \EFI\CentOS\shimx64.efi was registered into UEFI NVRAM boot variables. The installer also (attempted to) set up a second copy with shim in the default fallback/removable media boot path \EFI\BOOT\bootx64.efi and GRUB as \EFI\BOOT\grubx64.efi.
- On the first boot after installation, the NVRAM boot entry was used, loading \EFI\CentOS\shimx64.efi and \EFI\CentOS\grubx64.efi. This boot attempt then resulted in a hang because Nouveau was not disabled.
- By the third boot, the firmware was loading \EFI\BOOT\bootx64.efi instead. That happens when you tell UEFI to boot from a specific disk but don't specify a bootloader path. For some reason or another, this allows the fallback copy of the SecureBoot shim to be loaded, but then fails in loading \EFI\BOOT\grubx64.efi. Note that it doesn't say the file is corrupted: it is saying that the file just does not exist.
Now, you should probably use efibootmgr -v to view your UEFI boot variables as they exist now, and write down the current set-up, or at least the CentOS boot entry, so that you will be able to reproduce it if it is ever lost again. In that situation, you might either boot into rescue mode from CentOS installation media and use the efibootmgr command to fix the NVRAM variables, or perhaps just type in the correct settings using the UEFI "boot settings" menu, if it allows that. (Sadly, most UEFI implementations I've seen won't.)
You should also verify that the fallback GRUB bootloader is intact. The file should be accessible as /boot/efi/EFI/BOOT/grubx64.efi in Linux. Verify that the file exists and is identical to /boot/efi/EFI/CentOS/grubx64.efi.
I don't really know what caused the UEFI NVRAM boot variables to be lost between the first reboot and the third one. There are various buggy UEFI implementations out there. Or did you perhaps reset the "BIOS settings" as part of troubleshooting the hang that turned out to be caused by Nouveau? Resetting the UEFI "BIOS settings" may or may not reset the NVRAM boot variables too, depending on the UEFI implementation.
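The two checks suggested above (recording the boot entries and comparing the fallback GRUB copy against the CentOS one) could be sketched roughly like this. The function names save_boot_entries and same_file and the output file name are made up for illustration; only efibootmgr -v and the two paths come from the discussion above.

```shell
#!/bin/sh
# Hypothetical helper: save the current UEFI boot entries to a file so they
# can be re-created later. Needs root and a system booted in UEFI mode.
save_boot_entries() {
    out=${1:-efi-boot-entries.txt}
    if command -v efibootmgr >/dev/null 2>&1 && [ -d /sys/firmware/efi ]; then
        efibootmgr -v > "$out"
    else
        echo "efibootmgr unavailable or system not booted via UEFI" >&2
        return 1
    fi
}

# Hypothetical helper: succeed only if both files exist and are
# byte-for-byte identical.
same_file() {
    [ -f "$1" ] && [ -f "$2" ] && cmp -s "$1" "$2"
}

# Assumed usage on the installed system, as root:
#   save_boot_entries /root/efi-boot-entries.txt
#   same_file /boot/efi/EFI/BOOT/grubx64.efi /boot/efi/EFI/CentOS/grubx64.efi \
#       && echo "fallback GRUB matches" || echo "fallback GRUB missing or different"
```

Keeping the saved listing somewhere outside the EFI system partition means it survives even if that partition has to be rebuilt.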
If it turns out the occasional loss of UEFI NVRAM boot variables is a firmware bug, you might check for a BIOS upgrade: run dmidecode -s bios-version to see the current version. According to ASUS support pages, the most up-to-date UEFI BIOS for your system is version 1301. ASUS typically includes an update feature in the UEFI BIOS itself; if that's true on your system, you just need to save the update file onto the EFI system partition (= anywhere under /boot/efi in CentOS), go to BIOS settings, activate the update tool from there, and tell it where the update file is.
One possible reason for NVRAM corruption is the efi-pstore kernel module. If it is enabled (or built into the CentOS standard kernel) and the feature to store the kernel log into pstore on a kernel panic is active, this may have filled the NVRAM to 100% with a series of variables containing the kernel log. This might have caused the firmware to detect the variable storage as corrupt and reinitialize the NVRAM boot variables automatically.
If the fallback /boot/efi/EFI/BOOT/grubx64.efi turns out to be undamaged, the failure to boot from the fallback path may have been caused by a bug in the SecureBoot shim, or by overzealous Secure Boot enforcement on the HDD fallback boot path (technically, a bug in the UEFI firmware that makes it incompatible with the SecureBoot shim). An updated SecureBoot shim might help in that case.
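To test the efi-pstore theory, one rough check is to count how many crash-dump variables are sitting in the EFI variable store. The sketch below assumes that efi-pstore names its variables with a "dump-type" prefix and that efivarfs is mounted at /sys/firmware/efi/efivars; both are my understanding of the kernel driver rather than something confirmed on this system, and count_pstore_vars is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: count EFI variables whose names start with
# "dump-type" (assumed naming used by the efi-pstore driver).
count_pstore_vars() {
    dir=${1:-/sys/firmware/efi/efivars}
    n=0
    for f in "$dir"/dump-type*; do
        # If nothing matches, the glob stays literal and -e fails.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Assumed usage on the installed system, as root:
#   count_pstore_vars               # many entries would support the theory
#   lsmod | grep efi_pstore         # empty if built in or not loaded
#   dmidecode -s bios-version       # compare against ASUS's latest version
```

A large count after a few hung boots would be consistent with the variable store filling up; a count of zero does not rule the theory out, since the firmware may already have reinitialized the store.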