AskOverflow.Dev

Questions tagged [smartctl] (unix)

Mathias Sven
Asked: 2024-08-29 21:19:44 +0800 CST

smartctl and device type mismatch

  • 6

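Long story short, I am trying to better understand the different standards for storage interfaces, but the output of smartctl confuses me a bit. Is this an actual problem on my system (such as outdated firmware, as seen in another post), or am I misreading the output of smartctl?

Observations: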

> sudo smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device

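I have one HDD and one NVMe drive, but as far as I know the HDD is not SCSI, unless this is the situation described in "Why do my SATA devices show up under /proc/scsi/scsi?". But if so, why can I use both -d ata and -d scsi to get information about it: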

> sudo smartctl -d ata --info /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.10.5] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Scorpio Black (AF)
Device Model:     WDC WD5000BPKT-75PK4T0
Serial Number:    WD-WX11EC114329
LU WWN Device Id: 5 0014ee 6ad29b3f3
Firmware Version: 01.01A01
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database 7.3/5387
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Thu Aug 29 14:09:19 2024 WEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
> sudo smartctl -d scsi --info /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.10.5] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

User Capacity:        500,107,862,016 bytes [500 GB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Logical Unit id:      0x50014ee6ad29b3f3
Serial number:        WD-WX11EC114329
Device type:          disk
Local Time is:        Thu Aug 29 14:09:35 2024 WEST
SMART support is:     Unavailable - device lacks SMART capability.

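Judging from the two outputs, ata is clearly the "correct" type, yet sudo smartctl -d ata --scan returns nothing (unlike sudo smartctl -d scsi --scan).

Why does it seem that I can use both ata and scsi to access the information, and why is the device detected as scsi by --scan?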
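
To compare what --scan reports against the type passed with -d, the scan output quoted above can be split into its fields. This is only a minimal Python sketch: the names ScanEntry and parse_scan are made up for illustration, and it assumes every line has the `<device> -d <type> # <comment>` shape shown in the question. It does not call smartctl itself.

```python
from typing import NamedTuple


class ScanEntry(NamedTuple):
    device: str    # e.g. /dev/sda
    dev_type: str  # the value that followed -d
    comment: str   # free text after '#'


def parse_scan(text: str) -> list[ScanEntry]:
    """Parse lines of the form '<device> -d <type> # <comment>'."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        body, _, comment = line.partition("#")
        fields = body.split()
        # Expect at least: device, '-d', type
        if len(fields) >= 3 and fields[1] == "-d":
            entries.append(ScanEntry(fields[0], fields[2], comment.strip()))
    return entries


# Scan output copied from the question.
SAMPLE = """\
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/nvme0 -d nvme # /dev/nvme0, NVMe device
"""

for entry in parse_scan(SAMPLE):
    print(f"{entry.device}: reported type {entry.dev_type!r}")
```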

smartctl
  • 1 answer
  • 16 Views
Jack G
Asked: 2024-07-18 09:33:42 +0800 CST

Is smartctl lying that my NVMe has a lifespan of about 2800 TBW? What is the actual lifespan of my NVMe?

  • 9

smartctl -x on my Samsung SSD 860 EVO M.2 2TB shows:

Device Statistics (GP Log 0x04)
Page  Offset Size        Value Flags Description
0x01  =====  =               =  ===  == General Statistics (rev 1) ==
0x01  0x008  4            1132  ---  Lifetime Power-On Resets
0x01  0x010  4            6584  ---  Power-on Hours
0x01  0x018  6     59675855461  ---  Logical Sectors Written
0x01  0x020  6      1711777462  ---  Number of Write Commands
0x01  0x028  6     51882440157  ---  Logical Sectors Read
0x01  0x030  6      1869976194  ---  Number of Read Commands
0x01  0x038  6          293000  ---  Date and Time TimeStamp
0x04  =====  =               =  ===  == General Errors Statistics (rev 1) ==
0x04  0x008  4               0  ---  Number of Reported Uncorrectable Errors
0x04  0x010  4              97  ---  Resets Between Cmd Acceptance and Completion
0x05  =====  =               =  ===  == Temperature Statistics (rev 1) ==
0x05  0x008  1              40  ---  Current Temperature
0x05  0x020  1              64  ---  Highest Temperature
0x05  0x028  1              18  ---  Lowest Temperature
0x05  0x058  1              70  ---  Specified Maximum Operating Temperature
0x06  =====  =               =  ===  == Transport Statistics (rev 1) ==
0x06  0x008  4           20530  ---  Number of Hardware Resets
0x06  0x010  4               0  ---  Number of ASR Events
0x06  0x018  4               0  ---  Number of Interface CRC Errors
0x07  =====  =               =  ===  == Solid State Device Statistics (rev 1) ==
0x07  0x008  1               1  N--  Percentage Used Endurance Indicator
                                |||_ C monitored condition met
                                ||__ D supports DSN
                                |___ N normalized value

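28TB written sounds a bit low, since I have owned this NVMe for a year, but it is acceptable. However, the Percentage Used Endurance Indicator is only at 1%. This implies there is still about 100 times as much, roughly 2800 TBW, left in this device, which is more than double its 1200 TBW rating, so it cannot be a rounding error.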
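
The arithmetic behind those figures can be reproduced from the Device Statistics table. The following Python sketch is only illustrative: it assumes 512-byte logical sectors (the quoted output does not show the sector size) and uses the same naive linear extrapolation from the 1% indicator as the paragraph above.

```python
# Values copied from the Device Statistics output in the question.
LOGICAL_SECTORS_WRITTEN = 59_675_855_461
PERCENTAGE_USED = 1   # Percentage Used Endurance Indicator
RATED_TBW = 1200      # rating quoted in the question

# Assumption: 512-byte logical sectors (not shown in the quoted output).
SECTOR_BYTES = 512

bytes_written = LOGICAL_SECTORS_WRITTEN * SECTOR_BYTES
tb_written = bytes_written / 10**12   # decimal terabytes
tib_written = bytes_written / 2**40   # binary tebibytes

# Naive linear extrapolation: total endurance if the writes so far equal 1%.
implied_total_tib = tib_written * 100 / PERCENTAGE_USED

print(f"written: {tb_written:.2f} TB = {tib_written:.2f} TiB")
print(f"implied total at {PERCENTAGE_USED}% used: {implied_total_tib:.0f} TiB")
print(f"ratio to rated {RATED_TBW} TBW: {implied_total_tib / RATED_TBW:.2f}x")
```

With these inputs the total written comes to about 30.55 TB, or about 27.79 TiB, so the "28TB" figure corresponds to binary units.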

Is smartctl lying? (Not that it would lie; I mean, is my NVMe lying to smartctl, is smartctl misinterpreting my NVMe, and so on?) How can I determine the actual TBW life remaining in my NVMe?

smartctl
  • 2 answers
  • 859 Views
kiler129
Asked: 2022-09-17 17:51:26 +0800 CST

SMART test repaired itself after a few runs

  • 1

I found something strange:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%      8003         -
# 2  Extended offline    Completed: read failure       90%      8001         5907400
# 3  Extended offline    Completed: read failure       90%      8001         5907400
# 4  Extended offline    Completed: read failure       90%      8001         5907400
# 5  Extended offline    Completed: read failure       90%      8001         5907400
# 6  Short offline       Completed: read failure       80%      8001         5907400
# 7  Short offline       Completed: read failure       80%      8000         5907400
# 8  Extended offline    Completed without error       00%         1         -

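I had a drive throwing a large number of ATA errors, with data unreadable. I decided to RMA it, so I ran an hdparm secure erase and then threw a shred at it. Since this is a small SSD (a 500GB Samsung EVO), that ran relatively quickly. I ran another smartctl -t short... and it "fixed" itself.

The drive still reports ATA Error Count: 207 and the following attributes: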

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   075   075   010    Pre-fail  Always       -       123
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       8004
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       27
177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       4
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   075   075   010    Pre-fail  Always       -       123
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   075   075   010    Pre-fail  Always       -       123
187 Reported_Uncorrect      0x0032   099   099   000    Old_age   Always       -       207
190 Airflow_Temperature_Cel 0x0032   060   051   000    Old_age   Always       -       40
195 Hardware_ECC_Recovered  0x001a   199   199   000    Old_age   Always       -       207
199 UDMA_CRC_Error_Count    0x003e   099   099   000    Old_age   Always       -       1
235 Unknown_Attribute       0x0012   099   099   000    Old_age   Always       -       12
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       3737223587

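What causes a SMART test to suddenly "fix" itself? I don't think the drive can be trusted anymore? However, I doubt Samsung will RMA it now, since it no longer fails the test...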
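
To see the pattern in the self-test log at a glance, the rows can be parsed and summarized. This is a rough Python sketch working on the text quoted in the question; SelfTest and parse_selftest_log are hypothetical names, and the regular expression assumes the column layout shown above, where columns are separated by at least two spaces.

```python
import re
from typing import NamedTuple, Optional


class SelfTest(NamedTuple):
    num: int
    description: str
    status: str
    remaining_pct: int
    lifetime_hours: int
    first_error_lba: Optional[int]  # None when the log shows '-'


# Matches e.g. '# 1  Short offline   Completed without error   00%   8003   -'
ROW = re.compile(r"^#\s*(\d+)\s+(.+?)\s{2,}(.+?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$")


def parse_selftest_log(text: str) -> list[SelfTest]:
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # skip headers and anything that is not a log row
        num, desc, status, rem, hours, lba = m.groups()
        rows.append(SelfTest(int(num), desc, status, int(rem), int(hours),
                             None if lba == "-" else int(lba)))
    return rows


# Log rows copied from the question.
SAMPLE = """\
# 1  Short offline       Completed without error       00%      8003         -
# 2  Extended offline    Completed: read failure       90%      8001         5907400
# 3  Extended offline    Completed: read failure       90%      8001         5907400
# 4  Extended offline    Completed: read failure       90%      8001         5907400
# 5  Extended offline    Completed: read failure       90%      8001         5907400
# 6  Short offline       Completed: read failure       80%      8001         5907400
# 7  Short offline       Completed: read failure       80%      8000         5907400
# 8  Extended offline    Completed without error       00%         1         -
"""

log = parse_selftest_log(SAMPLE)
failures = [t for t in log if t.first_error_lba is not None]
print("entry #1:", log[0].status, "at", log[0].lifetime_hours, "hours")
print("failed runs:", len(failures))
print("distinct failing LBAs:", sorted({t.first_error_lba for t in failures}))
```

In this log every failed run stopped at the same LBA, and entry #1, the one with the highest LifeTime value, is the only run after the erase.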

ssd smartctl
  • 1 answer
  • 30 Views
Stephen Boston
Asked: 2022-05-30 05:25:37 +0800 CST

smartctl Power_Cycle_Count on a new SSD

  • 0

After one power cycle and one smartctl long test on my system, my Samsung EVO 1T SSD shows:

 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

...

 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       6

...
 

What is that VALUE: 099 about? Given automated QA cycles, I would certainly expect fewer than 10. Is this a used disk? Or ...
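
The quoted row carries two different numbers: the VALUE column (099) is a vendor-normalized figure, while RAW_VALUE (6) holds the raw counter. A small Python sketch that splits the row into named columns makes the distinction visible; Attribute and parse_attribute are illustrative names, and the parser assumes numeric VALUE, WORST and THRESH columns as in this row.

```python
from typing import NamedTuple


class Attribute(NamedTuple):
    attr_id: int
    name: str
    flag: str
    value: int       # normalized, vendor-scaled
    worst: int
    thresh: int
    attr_type: str
    updated: str
    when_failed: str
    raw_value: str   # kept as text; raw values may contain spaces or slashes


def parse_attribute(line: str) -> Attribute:
    # The first nine columns never contain spaces; RAW_VALUE may.
    parts = line.split(None, 9)
    return Attribute(int(parts[0]), parts[1], parts[2], int(parts[3]),
                     int(parts[4]), int(parts[5]), parts[6], parts[7],
                     parts[8], parts[9].strip())


# Row copied from the question.
row = parse_attribute(
    " 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       6"
)
print(f"{row.name}: normalized VALUE={row.value}, THRESH={row.thresh}, "
      f"RAW_VALUE={row.raw_value}")
```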

ssd smartctl
  • 1 answer
  • 47 Views
Stephen Boston
Asked: 2022-05-25 14:47:05 +0800 CST

SSD won't mount: bad superblock but no bad blocks: write errors

  • 0

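Just noticed I had been writing SDD for SSD. Corrected.

I need help interpreting this situation. /dev/sda is a data disk that is backed up and holds reproducible data, so this is not system-critical, but I would like to avoid the work of restoring or rebuilding the data, some of which would be very time-consuming.

Is it possible to recover or repair it?

If so, how? And if I wipe the disk to reuse it, how reliable will it be?

Summary (detailed reports below):

  • won't mount: bad superblock
  • badblocks finds no bad blocks
  • smartctl reports no errors
  • fsck cannot set superblock flags
  • fdisk shows a clean partition
  • dmesg shows write errors
  • parted shows 792 GB available on a 1 TB drive

Mounting the SSD fails as follows: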

 [stephen@meer ~]$ sudo mount /dev/sda1 /mnt/sda
 mount: /mnt/sda: can't read superblock on /dev/sda1.
        dmesg(1) may have more information after failed mount system call.
 [stephen@meer ~]$ 
 

But badblocks finds no bad blocks:

 [root@meer stephen]# badblocks -v /dev/sda1              
 Checking blocks 0 to 976760831
 Checking for bad blocks (read-only test): done                                                 
 Pass completed, 0 bad blocks found. (0/0/0 errors)

And smartctl finds no errors:

 [root@meer stephen]# smartctl -a /dev/sda 
 smartctl 7.3 2022-02-28 r5338 [x86_64-linux-5.17.9-arch1-1] (local build)
 Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
 
 === START OF INFORMATION SECTION ===
 Model Family:     WD Blue / Red / Green SSDs
 Device Model:     WDC  WDS100T2B0A-00SM50
 Serial Number:    213159800516
 LU WWN Device Id: 5 001b44 8bc4fdc6e
 Firmware Version: 415020WD
 User Capacity:    1,000,204,886,016 bytes [1.00 TB]
 Sector Size:      512 bytes logical/physical
 Rotation Rate:    Solid State Device
 Form Factor:      2.5 inches
 TRIM Command:     Available, deterministic, zeroed
 Device is:        In smartctl database 7.3/5319
 ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
 SATA Version is:  SATA 3.3, 6.0 Gb/s (current: 1.5 Gb/s)
 Local Time is:    Tue May 24 16:06:23 2022 PDT
 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled
 
 === START OF READ SMART DATA SECTION ===
 SMART overall-health self-assessment test result: PASSED
 
 General SMART Values:
 Offline data collection status:  (0x00)    Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
 Self-test execution status:      (   0)    The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
 Total time to complete Offline 
 data collection:       (    0) seconds.
 Offline data collection
 capabilities:           (0x11) SMART execute Offline immediate.
                    No Auto Offline data collection support.
                    Suspend Offline collection upon new
                    command.
                    No Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    No Selective Self-test supported.
 SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
 Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
 Short self-test routine 
 recommended polling time:   (   2) minutes.
 Extended self-test routine
 recommended polling time:   (  10) minutes.
 
 SMART Attributes Data Structure revision number: 4
 Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
   5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always       -       124
   9 Power_On_Hours          0x0032   100   100   ---    Old_age   Always       -       1470
  12 Power_Cycle_Count       0x0032   100   100   ---    Old_age   Always       -       134
 165 Block_Erase_Count       0x0032   100   100   ---    Old_age   Always       -       4312400063
 166 Minimum_PE_Cycles_TLC   0x0032   100   100   ---    Old_age   Always       -       1
 167 Max_Bad_Blocks_per_Die  0x0032   100   100   ---    Old_age   Always       -       65
 168 Maximum_PE_Cycles_TLC   0x0032   100   100   ---    Old_age   Always       -       14
 169 Total_Bad_Blocks        0x0032   100   100   ---    Old_age   Always       -       630
 170 Grown_Bad_Blocks        0x0032   100   100   ---    Old_age   Always       -       124
 171 Program_Fail_Count      0x0032   100   100   ---    Old_age   Always       -       128
 172 Erase_Fail_Count        0x0032   100   100   ---    Old_age   Always       -       0
 173 Average_PE_Cycles_TLC   0x0032   100   100   ---    Old_age   Always       -       2
 174 Unexpected_Power_Loss   0x0032   100   100   ---    Old_age   Always       -       90
 184 End-to-End_Error        0x0032   100   100   ---    Old_age   Always       -       0
 187 Reported_Uncorrect      0x0032   100   100   ---    Old_age   Always       -       0
 188 Command_Timeout         0x0032   100   100   ---    Old_age   Always       -       64
 194 Temperature_Celsius     0x0022   070   053   ---    Old_age   Always       -       30 (Min/Max 18/53)
 199 UDMA_CRC_Error_Count    0x0032   100   100   ---    Old_age   Always       -       0
 230 Media_Wearout_Indicator 0x0032   001   001   ---    Old_age   Always       -       0x002600140026
 232 Available_Reservd_Space 0x0033   097   097   004    Pre-fail  Always       -       97
 233 NAND_GB_Written_TLC     0x0032   100   100   ---    Old_age   Always       -       2703
 234 NAND_GB_Written_SLC     0x0032   100   100   ---    Old_age   Always       -       2842
 241 Host_Writes_GiB         0x0030   253   253   ---    Old_age   Offline      -       466
 242 Host_Reads_GiB          0x0030   253   253   ---    Old_age   Offline      -       622
 244 Temp_Throttle_Status    0x0032   000   100   ---    Old_age   Always       -       0
 
 SMART Error Log Version: 1
 No Errors Logged
 
 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
 # 1  Extended offline    Completed without error       00%      1470         -
 
 Selective Self-tests/Logging not supported
 
 

And fsck fails:

 [root@meer ~]# e2fsck -cfpv /dev/sda1
 /dev/sda1: recovering journal
 e2fsck: Input/output error while recovering journal of /dev/sda1
 e2fsck: unable to set superblock flags on /dev/sda1
 
 
 /dev/sda1: ********** WARNING: Filesystem still has errors **********
 
 
 
 
 
 May 24 15:38:29 meer kernel: I/O error, dev sda, sector 121899008 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
 May 24 15:38:29 meer kernel: sd 2:0:0:0: [sda] tag#31 CDB: Write(10) 2a 00 07 44 08 00 00 00 08 00
 May 24 15:38:29 meer kernel: sd 2:0:0:0: [sda] tag#31 Add. Sense: Unaligned write command
 May 24 15:38:29 meer kernel: sd 2:0:0:0: [sda] tag#31 Sense Key : Illegal Request [current] 
 May 24 15:38:29 meer kernel: sd 2:0:0:0: [sda] tag#31 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
 May 24 15:38:29 meer kernel: ata3.00: configured for UDMA/33
 May 24 15:38:29 meer kernel: ata3.00: error: { ABRT }
 May 24 15:38:29 meer kernel: ata3.00: status: { DRDY ERR }
 May 24 15:38:29 meer kernel: ata3.00: cmd ca/00:08:00:08:44/00:00:00:00:00/e7 tag 31 dma 4096 out
                                       res 51/04:08:00:08:44/00:00:07:00:00/e7 Emask 0x1 (device error)
 May 24 15:38:29 meer kernel: ata3.00: failed command: WRITE DMA
 May 24 15:38:29 meer kernel: ata3.00: irq_stat 0x40000001
 May 24 15:38:29 meer kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
 May 24 15:38:29 meer kernel: ata3: EH complete
 May 24 15:38:29 meer kernel: ata3.00: configured for UDMA/33
 May 24 15:38:29 meer kernel: ata3.00: error: { ABRT }
 May 24 15:38:29 meer kernel: ata3.00: status: { DRDY ERR }
 May 24 15:38:29 meer kernel: ata3.00: cmd ca/00:08:00:08:44/00:00:00:00:00/e7 tag 6 dma 4096 out
                                       res 51/04:08:00:08:44/00:00:07:00:00/e7 Emask 0x1 (device error)
 May 24 15:38:29 meer kernel: ata3.00: failed command: WRITE DMA
 May 24 15:38:29 meer kernel: ata3.00: irq_stat 0x40000001
 May 24 15:38:29 meer kernel: ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
 

The partition as seen by fdisk:

 Disk /dev/sda: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
 Disk model: WDC  WDS100T2B0A
 Units: sectors of 1 * 512 = 512 bytes
 Sector size (logical/physical): 512 bytes / 512 bytes
 I/O size (minimum/optimal): 512 bytes / 512 bytes
 Disklabel type: gpt
 Disk identifier: 3F701164-2CF8-6D48-A94E-478634C140BE
 
 Device     Start        End    Sectors   Size Type
 /dev/sda1   2048 1953523711 1953521664 931.5G Linux filesystem

From dmesg:

 [ 5292.895300] ata3.00: configured for UDMA/33
 [ 5292.895315] ata3: EH complete
 [ 5293.021851] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
 [ 5293.021859] ata3.00: irq_stat 0x40000001
 [ 5293.021864] ata3.00: failed command: WRITE DMA
 [ 5293.021866] ata3.00: cmd ca/00:08:00:08:44/00:00:00:00:00/e7 tag 18 dma 4096 out
                         res 51/04:08:00:08:44/00:00:07:00:00/e7 Emask 0x1 (device error)
 [ 5293.021874] ata3.00: status: { DRDY ERR }
 [ 5293.021877] ata3.00: error: { ABRT }
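
The failing command in the kernel log can be decoded to check that it matches the reported sector. The sketch below, in Python, assumes the standard WRITE(10) CDB layout (opcode 0x2A, big-endian LBA in bytes 2-5, transfer length in bytes 7-8); decode_write10 is an illustrative helper name, not part of any tool.

```python
def decode_write10(cdb_hex: str) -> tuple[int, int]:
    """Return (LBA, transfer length in blocks) from a WRITE(10) CDB."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5, big-endian
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8, big-endian
    return lba, blocks


CDB = "2a 00 07 44 08 00 00 00 08 00"  # from the kernel log above
LOGICAL_BLOCK = 512                     # logical sector size, from fdisk
PART_START = 2048                       # /dev/sda1 start sector, from fdisk

lba, blocks = decode_write10(CDB)
print("LBA:", lba)
print("bytes:", blocks * LOGICAL_BLOCK)
print("sector offset inside /dev/sda1:", lba - PART_START)
```

The decoded LBA is 121899008, the same sector named in the "I/O error, dev sda" line, and the length of 8 blocks equals the 4096 bytes shown in the "dma 4096 out" entries.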

parted:

 [root@meer stephen]# parted /dev/sda
 GNU Parted 3.5
 Using /dev/sda
 Welcome to GNU Parted! Type 'help' to view a list of commands.
 (parted) print free                                                       
 Model: ATA WDC WDS100T2B0A (scsi)
 Disk /dev/sda: 1000GB
 Sector size (logical/physical): 512B/512B
 Partition Table: gpt
 Disk Flags: 
 
 Number  Start   End     Size    File system  Name  Flags
         17.4kB  1049kB  1031kB  Free Space
  1      1049kB  1000GB  1000GB  ext4
         1000GB  1000GB  729kB   Free Space
 
disk smartctl
  • 1 answer
  • 366 Views
Hills of Eternity
Asked: 2022-01-28 08:14:10 +0800 CST

Linux on Marvell 88SE9230. How do I get statistics?

  • 1

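I use a Marvell 88SE9230 controller in my home Linux server. HP does have utilities for setting up the RAID and getting some statistics, but I would like to know how to get any status from within the Linux system. A quick Google search only turns up Linux drivers for accessing the array itself on older kernel versions, whereas I want to know the SMART status of the drives.

smartctl does not work: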

root@iris:~# smartctl -a -d marvell -T verypermissive /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-96-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: Unknown error

=== START OF INFORMATION SECTION ===
Device Model:     [No Information Found]
Serial Number:    [No Information Found]
Firmware Version: [No Information Found]
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   [No Information Found]
Local Time is:    Thu Jan 27 19:11:54 2022 MSK
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87 don't show if SMART is enabled.
                  Checking to be sure by trying SMART RETURN STATUS command.
SMART support is: Unknown - Try option -s with argument 'on' to enable it.
Read SMART Data failed: Success

=== START OF READ SMART DATA SECTION ===
SMART Status command failed: Success
SMART overall-health self-assessment test result: UNKNOWN!
SMART Status, Attributes and Thresholds cannot be read.

Read SMART Error Log failed: Success

Read SMART Self-test Log failed: Success

Selective Self-tests/Logging not supported

How can I get at least some statistics out of the controller?

linux smartctl
  • 1 answer
  • 468 Views
Martin Vegter
Asked: 2019-07-10 09:20:26 +0800 CST

smartctl -a /dev/sda shows errors on a brand-new SSD

  • 2

I have just installed a brand-new SSD in my laptop, and I see that smartctl -a /dev/sda already shows errors:

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   095   095   050    Old_age   Always       -       2/4698640
  5 Retired_Block_Count     0x0033   100   100   003    Pre-fail  Always       -       0
  9 Power_On_Hours_and_Msec 0x0032   100   100   000    Old_age   Always       -       0h+16m+22.260s
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       3
171 Program_Fail_Count      0x000a   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
174 Unexpect_Power_Loss_Ct  0x0030   000   000   000    Old_age   Offline      -       2
177 Wear_Range_Delta        0x0000   000   000   000    Old_age   Offline      -       0
181 Program_Fail_Count      0x000a   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0012   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   037   043   000    Old_age   Always       -       37 (Min/Max 24/43)
195 ECC_Uncorr_Error_Count  0x001c   099   099   000    Old_age   Offline      -       2/4698640
196 Reallocated_Event_Count 0x0033   100   100   003    Pre-fail  Always       -       0
201 Unc_Soft_Read_Err_Rate  0x001c   099   099   000    Old_age   Offline      -       2/4698640
204 Soft_ECC_Correct_Rate   0x001c   099   099   000    Old_age   Offline      -       2/4698640
230 Life_Curve_Status       0x0013   100   100   000    Pre-fail  Always       -       100
231 SSD_Life_Left           0x0013   100   100   010    Pre-fail  Always       -       25769803776
233 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       0
234 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       0
241 Lifetime_Writes_GiB     0x0032   000   000   000    Old_age   Always       -       0
242 Lifetime_Reads_GiB      0x0032   000   000   000    Old_age   Always       -       2

Specifically, these attributes show non-zero values:

Raw_Read_Error_Rate
ECC_Uncorr_Error_Count
Unc_Soft_Read_Err_Rate
Soft_ECC_Correct_Rate

Does this mean my SSD is already failing?

ssd smartctl
  • 1 answer
  • 407 Views
Chris Stryczynski
Asked: 2019-02-26 10:49:42 +0800 CST

SMART test completes without failure after previous tests failed, with no sectors reallocated?

  • 0

I have a drive that failed its SMART tests, as follows:

smartctl -a /dev/sdc:

...
# 1  Short offline       Completed: read failure       50%      6354         4377408
# 2  Extended offline    Completed: read failure       90%      6354         4377408

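I then wanted this "sector" to be marked as bad, so I figured I just needed to write a lot of data over it. So I used dd to write a bunch of zeros. That filled the drive, after which I ran another SMART test.

It completed successfully, but looking at the SMART attributes I see no change: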

196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0

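Apart from the general knowledge that I am always at risk of drive failure, does the above information indicate anything about the drive failing?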
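
The self-test logs in this question report two slightly different failing LBAs, 4377408 and 4377409. Because the drive reports 512-byte logical and 4096-byte physical sectors, it can help to map both to a physical sector. The Python sketch below does that arithmetic; it assumes physical sectors are aligned to LBA 0, which the smartctl output does not state, and physical_sector is an illustrative name.

```python
LOGICAL = 512    # bytes per logical sector (from "Sector Sizes")
PHYSICAL = 4096  # bytes per physical sector
RATIO = PHYSICAL // LOGICAL


def physical_sector(lba: int) -> int:
    """Index of the 4096-byte physical sector holding a 512-byte LBA.

    Assumes physical sectors are aligned to LBA 0 (no alignment offset).
    """
    return lba // RATIO


for lba in (4377408, 4377409):
    print(lba, "-> byte offset", lba * LOGICAL,
          "-> physical sector", physical_sector(lba))
```

Under that alignment assumption both LBAs land in the same physical sector, so the two log values would point at a single 4 KiB region rather than at two separate defects.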

Here is the before/after diff of the smartctl attributes:

diff --git a/x.txt b/x.txt
index 4cfe1b7..1bcace5 100644
--- a/x.txt
+++ b/x.txt
@@ -12,7 +12,7 @@ Sector Sizes:     512 bytes logical, 4096 bytes physical
 Device is:        In smartctl database [for details use: -P show]
 ATA Version is:   ACS-2 (minor revision not indicated)
 SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
-Local Time is:    Sun Feb 24 16:50:01 2019 GMT
+Local Time is:    Mon Feb 25 18:33:35 2019 GMT
 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled

@@ -55,31 +55,38 @@ SCT capabilities:          (0x70b5) SCT Status supported.
 SMART Attributes Data Structure revision number: 16
 Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
-  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
-  3 Spin_Up_Time            0x0027   180   179   021    Pre-fail  Always       -       5991
-  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       114
+  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
+  3 Spin_Up_Time            0x0027   177   177   021    Pre-fail  Always       -       6116
+  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       116
   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
   7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
-  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       6356
+  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       6372
  10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
  11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
- 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       57
+ 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       59
 192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       46
-193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       67
-194 Temperature_Celsius     0x0022   122   114   000    Old_age   Always       -       28
+193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       69
+194 Temperature_Celsius     0x0022   116   114   000    Old_age   Always       -       34
 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
 198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
-200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1
+200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

 SMART Error Log Version: 1
 No Errors Logged

 SMART Self-test log structure revision number 1
 Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
-# 1  Short offline       Completed: read failure       50%      6354         4377408
-# 2  Extended offline    Completed: read failure       90%      6354         4377408
+# 1  Extended offline    Completed without error       00%      6367         -
+# 2  Short offline       Completed: read failure       60%      6361         4377409
+# 3  Short offline       Completed: read failure       50%      6361         4377409
+# 4  Extended offline    Completed: read failure       90%      6359         4377409
+# 5  Short offline       Completed without error       00%      6359         -
+# 6  Short offline       Completed: read failure       60%      6356         4377409
+# 7  Short offline       Completed: read failure       50%      6354         4377408
+# 8  Extended offline    Completed: read failure       90%      6354         4377408
+6 of 6 failed self-tests are outdated by newer successful extended offline self-test # 1

 SMART Selective self-test log data structure revision number 1
  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

And the current output of smartctl -a:

smartctl 6.6 2018-12-05 r4851 [x86_64-linux-4.14.98] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital AV-GP (AF)
Device Model:     WDC WD20EURS-63SPKY0
Serial Number:    WD-WMC1T2763021
LU WWN Device Id: 5 0014ee 6addb4b7c
Firmware Version: 80.00A80
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Mon Feb 25 18:49:12 2019 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (27240) seconds.
Offline data collection
capabilities:            (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    ( 275) minutes.
Conveyance self-test routine
recommended polling time:    (   5) minutes.
SCT capabilities:          (0x70b5) SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       4
  3 Spin_Up_Time            0x0027   177   177   021    Pre-fail  Always       -       6116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       116
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       6373
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       59
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       46
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       69
194 Temperature_Celsius     0x0022   116   114   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      6367         -
# 2  Short offline       Completed: read failure       60%      6361         4377409
# 3  Short offline       Completed: read failure       50%      6361         4377409
# 4  Extended offline    Completed: read failure       90%      6359         4377409
# 5  Short offline       Completed without error       00%      6359         -
# 6  Short offline       Completed: read failure       60%      6356         4377409
# 7  Short offline       Completed: read failure       50%      6354         4377408
# 8  Extended offline    Completed: read failure       90%      6354         4377408
6 of 6 failed self-tests are outdated by newer successful extended offline self-test # 1

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
disk smartctl
  • 1 answer
  • 778 Views
Chris Stryczynski
Asked: 2019-02-25 05:48:20 +0800 CST

smartctl reports that the overall health test passed, yet the self-tests failed?

  • 6

Why is "SMART overall-health self-assessment test result: PASSED" shown when both self-tests failed?

sudo smartctl -a /dev/sdc 
smartctl 6.6 2018-12-05 r4851 [x86_64-linux-4.14.98] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital AV-GP (AF)
Device Model:     WDC WD20EURS-63SPKY0
Serial Number:    WD-WMC1T2763021
LU WWN Device Id: 5 0014ee 6addb4b7c
Firmware Version: 80.00A80
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Feb 24 13:43:30 2019 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 117) The previous self-test completed having
                    the read element of the test failed.
Total time to complete Offline 
data collection:        (27240) seconds.
Offline data collection
capabilities:            (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    ( 275) minutes.
Conveyance self-test routine
recommended polling time:    (   5) minutes.
SCT capabilities:          (0x70b5) SCT Status supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   180   179   021    Pre-fail  Always       -       5991
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       113
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       6354
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       56
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       46
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       66
194 Temperature_Celsius     0x0022   122   114   000    Old_age   Always       -       28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       50%      6354         4377408
# 2  Extended offline    Completed: read failure       90%      6354         4377408

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
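
As far as I understand it, the overall-health line is the drive's own summary status, while the self-test log records the outcome of each individual test, so the two can disagree as they do above. A minimal sketch of reading both from the text that smartctl -a prints follows. The names overall_health and failed_self_tests and the regular expression are illustrative assumptions, not part of smartmontools, and the parser only assumes the column layout visible in the output above.

```python
import re

# Matches self-test log rows in the layout shown above, for example:
# "# 1  Short offline       Completed: read failure       50%      6354         4377408"
ROW = re.compile(
    r"^#\s*(?P<num>\d+)\s+"
    r"(?P<test>.+?)\s{2,}"
    r"(?P<status>.+?)\s{2,}"
    r"(?P<remaining>\d+)%\s+"
    r"(?P<hours>\d+)\s+"
    r"(?P<lba>\S+)\s*$"
)

SAMPLE = """\
SMART overall-health self-assessment test result: PASSED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       50%      6354         4377408
# 2  Extended offline    Completed: read failure       90%      6354         4377408
"""


def overall_health(text):
    """Return the word after the overall-health label, or None if it is absent."""
    m = re.search(r"overall-health self-assessment test result:\s*(\S+)", text)
    return m.group(1) if m else None


def failed_self_tests(text):
    """Return the self-test log rows whose status reports a failure.

    Assumption: failure statuses start with "Completed:" (as in
    "Completed: read failure"), while a clean run reads
    "Completed without error".
    """
    failures = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m and m.group("status").startswith("Completed:"):
            failures.append({
                "num": int(m.group("num")),
                "test": m.group("test"),
                "status": m.group("status"),
                "lifetime_hours": int(m.group("hours")),
                "lba": m.group("lba"),
            })
    return failures


if __name__ == "__main__":
    print("overall health:", overall_health(SAMPLE))
    for entry in failed_self_tests(SAMPLE):
        print("failed:", entry["test"], "at LBA", entry["lba"])
```

A monitoring check built this way would treat any entry returned by failed_self_tests as a warning on its own instead of relying on the PASSED line alone.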
hard-disk smartctl
  • 2 answers
  • 7320 Views
Steve Brown
Asked: 2019-01-28 08:39:53 +0800 CST

Is it possible to send SMART monitoring alerts from Debian 9 to Slack?

  • 3

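I recently set up Debian 9 on a home server, and I would like to run some SMART checks on my HDDs and be alerted if anything is wrong. Ideally the alerts would go to my Slack instance, since I have it connected to my smartphone and find it very useful (I already receive alerts there for my UPS and for failed pings).

I have been looking into smartd/smartctl, but I can't seem to find a way to get the notifications into Slack.

I was hoping there is somewhere I can hook in a bash script that handles the notification and calls another (Python) script to send it to Slack.

(Edit: just to clarify, I already have a Python script for Slack notifications, as I use it elsewhere, so we are fine there.)

Edit: Both solutions below were tested and work well for me. I went with the mail + script solution because it still covers me if Slack is not working for whatever reason, but both are solid, and my thanks to RalfFriedl for including the environment variables.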
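
For illustration, here is a minimal sketch of the kind of hook the question asks about. It assumes the script is run by smartd through the -M exec directive, which passes details of the alert in SMARTD_* environment variables such as SMARTD_DEVICE, SMARTD_FAILTYPE and SMARTD_MESSAGE, and that Slack is reached through an incoming webhook that accepts a JSON body with a "text" field. The SLACK_WEBHOOK_URL variable name and the functions build_payload and post_to_slack are hypothetical and are not taken from the answers.

```python
#!/usr/bin/env python3
import json
import os
import sys
import urllib.request


def build_payload(env):
    """Build a Slack webhook payload from smartd's SMARTD_* variables."""
    device = env.get("SMARTD_DEVICE", "unknown device")
    failtype = env.get("SMARTD_FAILTYPE", "unknown")
    message = env.get("SMARTD_MESSAGE", "(no message)")
    return {"text": f"SMART alert on {device} [{failtype}]: {message}"}


def post_to_slack(webhook_url, payload):
    """POST the payload as JSON to the webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def main():
    # Hypothetical variable name: the webhook URL has to come from somewhere,
    # and smartd's own environment may not contain it.
    webhook_url = os.environ.get("SLACK_WEBHOOK_URL")
    if not webhook_url:
        print("SLACK_WEBHOOK_URL is not set; nothing sent", file=sys.stderr)
        return
    post_to_slack(webhook_url, build_payload(os.environ))


if __name__ == "__main__":
    main()
```

In /etc/smartd.conf such a script could be referenced with a line along the lines of "DEVICESCAN -a -m root -M exec /usr/local/bin/smartd-slack.py", where the path is an assumption. Adding -M test makes smartd send a test notification at startup, which is a convenient way to check the hook end to end.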

debian smartctl
  • 2 answers
  • 767 Views



© 2023 AskOverflow.DEV All Rights Reserve