AskOverflow.Dev

AskOverflow.Dev Logo AskOverflow.Dev Logo

AskOverflow.Dev Navigation

  • Início
  • system&network
  • Ubuntu
  • Unix
  • DBA
  • Computer
  • Coding
  • LangChain

Mobile menu

Close
  • Início
  • system&network
    • Recentes
    • Highest score
    • tags
  • Ubuntu
    • Recentes
    • Highest score
    • tags
  • Unix
    • Recentes
    • tags
  • DBA
    • Recentes
    • tags
  • Computer
    • Recentes
    • tags
  • Coding
    • Recentes
    • tags
Início / user-568403

jameszp's questions

Martin Hope
jameszp
Asked: 2023-07-09 11:04:00 +0800 CST

Erro SMART (CurrentPendingSector) e (OfflineUncorrictableSector)

  • 6

Tenho recebido as seguintes mensagens de erro todos os dias há vários meses e não sei como parar de receber essas mensagens.

CurrentPendingSector

This message was generated by the smartd daemon running on:

   host name:  myhost
   DNS domain: [Empty]

The following warning/error was logged by the smartd daemon:

Device: /dev/sda [SAT], 6 Currently unreadable (pending) sectors

Device info:
KingFast, S/N:03112222C0002, FW:U0803A0, 256 GB

For details see host's SYSLOG.

You can also use the smartctl utility for further investigation.
The original message about this issue was sent at Fri Feb  3 19:41:29 2023 PST
Another message will be sent in 24 hours if the problem persists.

OfflineUncorrectableSector

This message was generated by the smartd daemon running on:

   host name:  myhost
   DNS domain: [Empty]

The following warning/error was logged by the smartd daemon:

Device: /dev/sda [SAT], 3 Offline uncorrectable sectors

Device info:
KingFast, S/N:03112222C0002, FW:U0803A0, 256 GB

For details see host's SYSLOG.

You can also use the smartctl utility for further investigation.
The original message about this issue was sent at Fri Feb  3 19:41:29 2023 PST
Another message will be sent in 24 hours if the problem persists.

smartctl -a /dev/sda

smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.19.0-46-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     KingFast
Serial Number:    03112222C0002
Firmware Version: U0803A0
User Capacity:    256,060,514,304 bytes [256 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jul  8 15:44:59 2023 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x02) Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (  120) seconds.
Offline data collection
capabilities:            (0x11) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                    entering power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    (  10) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   050    Old_age   Always       -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       6
  9 Power_On_Hours          0x0032   100   100   050    Old_age   Always       -       3335
 12 Power_Cycle_Count       0x0032   100   100   050    Old_age   Always       -       440
160 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       3
161 Unknown_Attribute       0x0033   100   100   050    Pre-fail  Always       -       86
163 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       26
164 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       79004
165 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       481
166 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       6
167 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       114
168 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       5050
169 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       98
175 Program_Fail_Count_Chip 0x0032   100   100   050    Old_age   Always       -       0
176 Erase_Fail_Count_Chip   0x0032   100   100   050    Old_age   Always       -       0
177 Wear_Leveling_Count     0x0032   100   100   050    Old_age   Always       -       0
178 Used_Rsvd_Blk_Cnt_Chip  0x0032   100   100   050    Old_age   Always       -       6
181 Program_Fail_Cnt_Total  0x0032   100   100   050    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   050    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       88
194 Temperature_Celsius     0x0022   100   100   050    Old_age   Always       -       35
195 Hardware_ECC_Recovered  0x0032   100   100   050    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   050    Old_age   Always       -       3
197 Current_Pending_Sector  0x0032   100   100   050    Old_age   Always       -       6
198 Offline_Uncorrectable   0x0032   100   100   050    Old_age   Always       -       3
199 UDMA_CRC_Error_Count    0x0032   100   100   050    Old_age   Always       -       0
232 Available_Reservd_Space 0x0032   100   100   050    Old_age   Always       -       86
241 Total_LBAs_Written      0x0030   100   100   050    Old_age   Offline      -       168900
242 Total_LBAs_Read         0x0030   100   100   050    Old_age   Offline      -       815543
245 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       191939

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      3329         -
# 2  Short offline       Completed without error       00%      3325         -
# 3  Short offline       Completed without error       00%      3321         -
# 4  Short offline       Completed without error       00%      3313         -
# 5  Short offline       Completed without error       00%      3309         -
# 6  Short offline       Completed without error       00%      3306         -
# 7  Extended offline    Completed without error       00%      3250         -
# 8  Extended offline    Completed without error       00%      3232         -
# 9  Extended offline    Completed without error       00%      3229         -
#10  Extended offline    Completed without error       00%       976         -
#11  Extended offline    Completed without error       00%       968         -

Selective Self-tests/Logging not supported

Eu tentei ignorar os erros 197e com198/etc/smartd.conf

/dev/sda -d removable -n standby -H -l error -l selftest -f -t -I 197 -I 198 -s (S/../.././(01|09|17)|L/../../3/11) -m root -M exec /usr/share/smartmontools/smartd-runner

para nenhum proveito.

Também não vejo nenhum LBA_of_first_errorna seção de autoteste.

Para mim, parece que SMART overall-health self-assessment test result: PASSED está saudável e os autotestes não retornam erros. Meu entendimento atual é que o disco parece estar íntegro, mas ainda está enviando essas mensagens erroneamente.

Há algo que estou perdendo?

A /dev/sdaunidade é um SSD KingFast de 256 GB e não tenho certeza se isso seria relevante, pois não consegui encontrar nada online para esta unidade ou fabricante específico.

Como eu poderia parar de receber essas mensagens, mas ainda ter o monitoramento SMART para outros problemas genuínos na unidade e como corrigir o problema se essa mensagem de erro realmente indicar algum problema com a unidade?

Obrigado!

Editar:

Depois de correr smartctl -t long /dev/sda, eu tenho

smartctl -a /dev/sda
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.19.0-46-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     KingFast
Serial Number:    03112222C0002
Firmware Version: U0803A0
User Capacity:    256,060,514,304 bytes [256 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sun Jul  9 10:05:33 2023 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x03) Offline data collection activity
                    is in progress.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 241) Self-test routine in progress...
                    10% of test remaining.
Total time to complete Offline 
data collection:        (  600) seconds.
Offline data collection
capabilities:            (0x11) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                    entering power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    (  10) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   050    Old_age   Always       -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   050    Old_age   Always       -       6
  9 Power_On_Hours          0x0032   100   100   050    Old_age   Always       -       3341
 12 Power_Cycle_Count       0x0032   100   100   050    Old_age   Always       -       441
160 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       3
161 Unknown_Attribute       0x0033   100   100   050    Pre-fail  Always       -       86
163 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       26
164 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       79553
165 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       482
166 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       6
167 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       115
168 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       5050
169 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       98
175 Program_Fail_Count_Chip 0x0032   100   100   050    Old_age   Always       -       0
176 Erase_Fail_Count_Chip   0x0032   100   100   050    Old_age   Always       -       0
177 Wear_Leveling_Count     0x0032   100   100   050    Old_age   Always       -       0
178 Used_Rsvd_Blk_Cnt_Chip  0x0032   100   100   050    Old_age   Always       -       6
181 Program_Fail_Cnt_Total  0x0032   100   100   050    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   050    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   050    Old_age   Always       -       88
194 Temperature_Celsius     0x0022   100   100   050    Old_age   Always       -       46
195 Hardware_ECC_Recovered  0x0032   100   100   050    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   100   100   050    Old_age   Always       -       3
197 Current_Pending_Sector  0x0032   100   100   050    Old_age   Always       -       6
198 Offline_Uncorrectable   0x0032   100   100   050    Old_age   Always       -       3
199 UDMA_CRC_Error_Count    0x0032   100   100   050    Old_age   Always       -       0
232 Available_Reservd_Space 0x0032   100   100   050    Old_age   Always       -       86
241 Total_LBAs_Written      0x0030   100   100   050    Old_age   Offline      -       170468
242 Total_LBAs_Read         0x0030   100   100   050    Old_age   Offline      -       815560
245 Unknown_Attribute       0x0032   100   100   050    Old_age   Always       -       193199

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      3337         -
# 2  Short offline       Completed without error       00%      3329         -
# 3  Short offline       Completed without error       00%      3325         -
# 4  Short offline       Completed without error       00%      3321         -
# 5  Short offline       Completed without error       00%      3313         -
# 6  Short offline       Completed without error       00%      3309         -
# 7  Short offline       Completed without error       00%      3306         -
# 8  Extended offline    Completed without error       00%      3250         -
# 9  Extended offline    Completed without error       00%      3232         -
#10  Extended offline    Completed without error       00%      3229         -
#11  Extended offline    Completed without error       00%       976         -
#12  Extended offline    Completed without error       00%       968         -

Selective Self-tests/Logging not supported

O teste off-line estendido nº 12 Completed without error, então não tenho certeza do que devo fazer a partir daqui.

Edição nº 2:

Também executei o seguinte, que acredito indicar que não há erros na unidade:

badblocks -sv /dev/sda
Checking blocks 0 to 250059095
Checking for bad blocks (read-only test): done                                                 
Pass completed, 0 bad blocks found. (0/0/0 errors)
dd if=/dev/sda of=/dev/null bs=64K conv=noerror
3907173+1 records in
3907173+1 records out
256060514304 bytes (256 GB, 238 GiB) copied, 485.648 s, 527 MB/s
hard-disk
  • 1 respostas
  • 43 Views
Martin Hope
jameszp
Asked: 2023-04-27 12:40:18 +0800 CST

xrandr --listproviders não mostrando nenhuma placa gráfica

  • 5

Gostaria de poder xrandrreconhecer minhas duas placas gráficas RTX 3070 Nvidia.

No entanto, xrandr não retorna nada.

xrandr --listproviders
Providers: number : 0

Estou usando nvidia-primee nvidia-driver-530.

Acredito que isso seja causado pela Xwaylandexecução de uma camada de composição X em cima waylande não pela execução xorgdireta.

Embora eu realmente não saiba se waylandé realmente a causa raiz, acredito que gostaria de desabilitar tudo waylandna minha máquina servidor (Ubuntu Server 22.04) e executar xorgapenas. Estou acessando o servidor por meio ssh -Xde um cliente de desktop Ubuntu.

xrandr --listmonitors
Monitors: 1
 0: +*XWAYLAND15 3840/620x2160/330+0+0  XWAYLAND15
xinput
WARNING: running xinput against an Xwayland server. See the xinput man page for details.
⎡ Virtual core pointer                      id=2    [master pointer  (3)]
⎜   ↳ Virtual core XTEST pointer                id=4    [slave  pointer  (2)]
⎜   ↳ xwayland-pointer:16                       id=6    [slave  pointer  (2)]
⎜   ↳ xwayland-relative-pointer:16              id=7    [slave  pointer  (2)]
⎜   ↳ xwayland-pointer-gestures:16              id=8    [slave  pointer  (2)]
⎣ Virtual core keyboard                     id=3    [master keyboard (2)]
    ↳ Virtual core XTEST keyboard               id=5    [slave  keyboard (3)]
    ↳ xwayland-keyboard:16                      id=9    [slave  keyboard (3)]

echo $XDG_SESSION_TYPE
tty
loginctl show-session 1 -p Type
Type=tty

Quais configurações ou configuração eu precisaria obter xrandr --listproviderspara exibir as placas gráficas Nvidia?

Eu ficaria feliz em fornecer detalhes adicionais, conforme necessário.

x11
  • 1 respostas
  • 22 Views
Martin Hope
jameszp
Asked: 2023-04-25 03:05:54 +0800 CST

Forçar a execução de gráficos em uma GPU específica

  • 5

Meu computador possui uma placa gráfica integrada e 2 GPUs Nvidia RTX 3070. Estou usando o Ubuntu 20.04 e nvidia-driver-530.

lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation AlderLake-S GT1 (rev 0c)
01:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070 Lite Hash Rate] (rev a1)
05:00.0 VGA compatible controller: NVIDIA Corporation GA104 [GeForce RTX 3070 Lite Hash Rate] (rev a1)

No momento, estou tentando testar minhas placas gráficas 3070 com o Phoronix Test Suite.

Estou usando nvidia-primee prime-select: on-demandrodando o terminal nos testes intel iGPU e phoronix na Nvidia 3070: prime-run phoronix-test-suite run unigine-heaven.

Houve alguns problemas para começar nvidia-primea trabalhar, então segui as sugestões deste artigo: https://askubuntu.com/questions/1364762/prime-run-command-not-found

cat /usr/bin/prime-run
#!/bin/bash
export __NV_PRIME_RENDER_OFFLOAD=1
export __GLX_VENDOR_LIBRARY_NAME=nvidia
export __VK_LAYER_NV_optimus=NVIDIA_only
export VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/nvidia_icd.json
exec "$@"

Ao usar, prime-runconsigo executar com êxito o conjunto de testes phoronix na GPU 0, que possui id de barramento 01:00.0/ PCI:1:0:0.

No entanto, parece que não consigo executar nenhum teste com a GPU 1 que possui bus id 05:00.0/ PCI:5:0:0.

Modificar /etc/X11/xorg.confalterando o número do barramento e reiniciando conforme sugerido pelos links a seguir não pareceu fazer nada e ainda funcionou na GPU 0.

  • https://stackoverflow.com/questions/18382271/how-can-i-modify-xorg-conf-file-to-force-x-server-to-run-on-a-specific-gpu-ia
  • https://askubuntu.com/questions/787030/setting-the-default-gpu
cat /etc/X11/xorg.conf
# nvidia-xconfig: X configuration file generated by nvidia-xconfig
# nvidia-xconfig:  version 530.41.03

Section "ServerLayout"
    Identifier     "Layout0"
    Screen      0  "Screen0"
    InputDevice    "Keyboard0" "CoreKeyboard"
    InputDevice    "Mouse0" "CorePointer"
EndSection

Section "Files"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Mouse0"
    Driver         "mouse"
    Option         "Protocol" "auto"
    Option         "Device" "/dev/psaux"
    Option         "Emulate3Buttons" "no"
    Option         "ZAxisMapping" "4 5"
EndSection

Section "InputDevice"
    # generated from default
    Identifier     "Keyboard0"
    Driver         "kbd"
EndSection

Section "Monitor"
    Identifier     "Monitor0"
    VendorName     "Unknown"
    ModelName      "Unknown"
    Option         "DPMS"
EndSection

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
#    BusID          "PCI:1:0:0"
    BusID          "PCI:5:0:0"
EndSection

Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    Monitor        "Monitor0"
    DefaultDepth    24
    SubSection     "Display"
        Depth       24
    EndSubSection
EndSection

Na verdade, apaguei etc/X11/xorg.confe consegui executar os testes do phoronix na GPU 0 sem o arquivo conf. Eu acho que um dos drivers ou programas que executo seleciona automaticamente a placa nvidia com o menor ID de barramento.

Gostaria de saber onde devo procurar para alterar as configurações ou quaisquer arquivos de configuração para selecionar a segunda gpu RTX 3070 com o bus id 05:00.0. Eu ficaria mais do que feliz em fornecer qualquer informação adicional.

xorg
  • 1 respostas
  • 33 Views

Sidebar

Stats

  • Perguntas 205573
  • respostas 270741
  • best respostas 135370
  • utilizador 68524
  • Highest score
  • respostas
  • Marko Smith

    Possível firmware ausente /lib/firmware/i915/* para o módulo i915

    • 3 respostas
  • Marko Smith

    Falha ao buscar o repositório de backports jessie

    • 4 respostas
  • Marko Smith

    Como exportar uma chave privada GPG e uma chave pública para um arquivo

    • 4 respostas
  • Marko Smith

    Como podemos executar um comando armazenado em uma variável?

    • 5 respostas
  • Marko Smith

    Como configurar o systemd-resolved e o systemd-networkd para usar o servidor DNS local para resolver domínios locais e o servidor DNS remoto para domínios remotos?

    • 3 respostas
  • Marko Smith

    apt-get update error no Kali Linux após a atualização do dist [duplicado]

    • 2 respostas
  • Marko Smith

    Como ver as últimas linhas x do log de serviço systemctl

    • 5 respostas
  • Marko Smith

    Nano - pule para o final do arquivo

    • 8 respostas
  • Marko Smith

    erro grub: você precisa carregar o kernel primeiro

    • 4 respostas
  • Marko Smith

    Como baixar o pacote não instalá-lo com o comando apt-get?

    • 7 respostas
  • Martin Hope
    user12345 Falha ao buscar o repositório de backports jessie 2019-03-27 04:39:28 +0800 CST
  • Martin Hope
    Carl Por que a maioria dos exemplos do systemd contém WantedBy=multi-user.target? 2019-03-15 11:49:25 +0800 CST
  • Martin Hope
    rocky Como exportar uma chave privada GPG e uma chave pública para um arquivo 2018-11-16 05:36:15 +0800 CST
  • Martin Hope
    Evan Carroll status systemctl mostra: "Estado: degradado" 2018-06-03 18:48:17 +0800 CST
  • Martin Hope
    Tim Como podemos executar um comando armazenado em uma variável? 2018-05-21 04:46:29 +0800 CST
  • Martin Hope
    Ankur S Por que /dev/null é um arquivo? Por que sua função não é implementada como um programa simples? 2018-04-17 07:28:04 +0800 CST
  • Martin Hope
    user3191334 Como ver as últimas linhas x do log de serviço systemctl 2018-02-07 00:14:16 +0800 CST
  • Martin Hope
    Marko Pacak Nano - pule para o final do arquivo 2018-02-01 01:53:03 +0800 CST
  • Martin Hope
    Kidburla Por que verdadeiro e falso são tão grandes? 2018-01-26 12:14:47 +0800 CST
  • Martin Hope
    Christos Baziotis Substitua a string em um arquivo de texto enorme (70 GB), uma linha 2017-12-30 06:58:33 +0800 CST

Hot tag

linux bash debian shell-script text-processing ubuntu centos shell awk ssh

Explore

  • Início
  • Perguntas
    • Recentes
    • Highest score
  • tag
  • help

Footer

AskOverflow.Dev

About Us

  • About Us
  • Contact Us

Legal Stuff

  • Privacy Policy

Language

  • Pt
  • Server
  • Unix

© 2023 AskOverflow.DEV All Rights Reserve