LINUX.ORG.RU

It fell apart.


Well, yes. It fell apart.

At boot it reliably spews an endless pile of errors, but once the system is up the disk works and the mirror rebuilds with it. Until the next reboot. SMART looks fine.

Is this something interesting, or just a routine trip to buy a new disk?

[    0.973630] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    0.974737] ata5.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) rejected by device (Stat=0x61 Err=0x04)
[    0.974738] ata5.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[    0.974739] ata5.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[    0.975149] ata5.00: failed to read native max address (err_mask=0x1)
[    0.975150] ata5.00: HPA support seems broken, skipping HPA handling
[    0.975605] ata5.00: failed to enable AA (error_mask=0x1)
[    0.975607] ata5.00: ATA-10: WDC WD10EZEX-00BBHA0, 01.01A01, max UDMA/133
[    0.975608] ata5.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 32)
[    0.976865] ata5.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) rejected by device (Stat=0x61 Err=0x04)
[    0.976867] ata5.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[    0.976867] ata5.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[    0.977029] ata5.00: failed to enable AA (error_mask=0x1)
[    0.977032] ata5.00: configured for UDMA/133 (device error ignored)

[    1.039425] ata5.00: exception Emask 0x0 SAct 0x20000000 SErr 0x0 action 0x0
[    1.039426] ata5.00: irq_stat 0x40000008
[    1.039428] ata5.00: failed command: READ FPDMA QUEUED
[    1.039430] ata5.00: cmd 60/08:e8:00:00:00/00:00:00:00:00/40 tag 29 ncq dma 4096 in
                        res 61/04:00:00:00:00/00:00:00:00:00/00 Emask 0x401 (device error) <F>
[    1.039431] ata5.00: status: { DRDY DF ERR }
[    1.039432] ata5.00: error: { ABRT }
[    1.039930] ata5.00: failed to enable AA (error_mask=0x1)
[    1.041448] ata5.00: failed to enable AA (error_mask=0x1)
[    1.041451] ata5.00: configured for UDMA/133 (device error ignored)
[    1.073431] ata5: EH complete
[    1.087221] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    1.087222] ata5.00: irq_stat 0x40000001
[    1.087224] ata5.00: failed command: READ DMA
[    1.087226] ata5.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 1 dma 4096 in
                        res 61/04:08:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
[    1.087226] ata5.00: status: { DRDY DF ERR }
[    1.087227] ata5.00: error: { ABRT }
[    1.088297] ata5.00: failed to enable AA (error_mask=0x1)
[    1.089810] ata5.00: failed to enable AA (error_mask=0x1)
[    1.089813] ata5.00: configured for UDMA/133 (device error ignored)
[    1.089819] ata5: EH complete
[    1.103221] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    1.103222] ata5.00: irq_stat 0x40000001
[    1.103223] ata5.00: failed command: READ DMA
[    1.103225] ata5.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 2 dma 4096 in
                        res 61/04:08:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
[    1.103226] ata5.00: status: { DRDY DF ERR }
[    1.103227] ata5.00: error: { ABRT }
[    1.104290] ata5.00: failed to enable AA (error_mask=0x1)
[    1.105802] ata5.00: failed to enable AA (error_mask=0x1)
[    1.105805] ata5.00: configured for UDMA/133 (device error ignored)
[    1.105812] ata5: EH complete
[    1.119222] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    1.119224] ata5.00: irq_stat 0x40000001
[    1.119225] ata5.00: failed command: READ DMA
[    1.119227] ata5.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 3 dma 4096 in
                        res 61/04:08:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
[    1.119228] ata5.00: status: { DRDY DF ERR }
[    1.119228] ata5.00: error: { ABRT }
[    1.120009] ata5.00: failed to enable AA (error_mask=0x1)
[    1.121235] ata5.00: failed to enable AA (error_mask=0x1)
[    1.121238] ata5.00: configured for UDMA/133 (device error ignored)
[    1.121244] ata5: EH complete
[    1.135221] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    1.135222] ata5.00: irq_stat 0x40000001
[    1.135224] ata5.00: failed command: READ DMA
[    1.135226] ata5.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 4 dma 4096 in
                        res 61/04:08:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
[    1.135227] ata5.00: status: { DRDY DF ERR }
[    1.135227] ata5.00: error: { ABRT }
[    1.136021] ata5.00: failed to enable AA (error_mask=0x1)
[    1.137545] ata5.00: failed to enable AA (error_mask=0x1)
[    1.137548] ata5.00: configured for UDMA/133 (device error ignored)
[    1.137555] sd 4:0:0:0: [sdc] tag#4 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[    1.137557] sd 4:0:0:0: [sdc] tag#4 Sense Key : Illegal Request [current] 
[    1.137559] sd 4:0:0:0: [sdc] tag#4 Add. Sense: Unaligned write command
[    1.137560] sd 4:0:0:0: [sdc] tag#4 CDB: Read(10) 28 00 00 00 00 00 00 00 08 00
[    1.137562] blk_update_request: I/O error, dev sdc, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[    1.137563] Buffer I/O error on dev sdc, logical block 0, async page read
[    1.137566] ata5: EH complete
[    1.163234] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[    1.163236] ata5.00: irq_stat 0x40000001
[    1.163237] ata5.00: failed command: READ DMA
[    1.163239] ata5.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 30 dma 4096 in
                        res 61/04:08:00:00:00/00:00:00:00:00/e0 Emask 0x1 (device error)
[    1.163240] ata5.00: status: { DRDY DF ERR }
[    1.163240] ata5.00: error: { ABRT }
[    1.164044] ata5.00: failed to enable AA (error_mask=0x1)
[    1.165561] ata5.00: failed to enable AA (error_mask=0x1)
[    1.165564] ata5.00: configured for UDMA/133 (device error ignored)
[    1.165570] ata5: EH complete

...

[   14.481952] ata5: EH complete
[   14.495270] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[   14.495274] ata5.00: irq_stat 0x40000001
[   14.495276] ata5.00: failed command: READ DMA EXT
[   14.495279] ata5.00: cmd 25/00:08:00:6d:70/00:00:74:00:00/e0 tag 18 dma 4096 in
                        res 61/04:08:00:6d:70/00:00:74:00:00/e0 Emask 0x1 (device error)
[   14.495280] ata5.00: status: { DRDY DF ERR }
[   14.495280] ata5.00: error: { ABRT }
[   14.496106] ata5.00: failed to enable AA (error_mask=0x1)
[   14.497616] ata5.00: failed to enable AA (error_mask=0x1)
[   14.497623] ata5.00: configured for UDMA/133 (device error ignored)
[   14.497645] sd 4:0:0:0: [sdc] tag#18 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[   14.497648] sd 4:0:0:0: [sdc] tag#18 Sense Key : Illegal Request [current] 
[   14.497653] sd 4:0:0:0: [sdc] tag#18 Add. Sense: Unaligned write command
[   14.497656] sd 4:0:0:0: [sdc] tag#18 CDB: Read(10) 28 00 74 70 6d 00 00 00 08 00
[   14.497660] blk_update_request: I/O error, dev sdc, sector 1953524992 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[   14.497663] Buffer I/O error on dev sdc, logical block 244190624, async page read
[   14.497678] ata5: EH complete
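
A reading aid for the log above: the "res 61/04:..." field carries the ATA status and error registers, which libata then spells out as { DRDY DF ERR } and { ABRT }. The Python sketch below decodes those two bytes and cross-checks the Read(10) CDB against the sector reported by the block layer. The function names are made up for illustration, and only a subset of the register bits is listed.

```python
import re

# Subset of ATA status register bits (assumption: other bits are ignored here).
ATA_STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
# Subset of ATA error register bits.
ATA_ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT", 0x01: "AMNF"}


def decode_bits(value, names):
    """Return the names of all bits set in value, highest bit first."""
    return [name for mask, name in names.items() if value & mask]


def parse_res(line):
    """Extract (status, error) from a libata 'res ST/ER:...' line."""
    match = re.search(r"res ([0-9a-f]{2})/([0-9a-f]{2}):", line)
    if match is None:
        raise ValueError("no result taskfile in line")
    return int(match.group(1), 16), int(match.group(2), 16)


def read10_lba_and_blocks(cdb_hex):
    """Decode the LBA and transfer length of a SCSI Read(10) CDB."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7..8: transfer length
    return lba, blocks


if __name__ == "__main__":
    status, error = parse_res(
        "res 61/04:08:00:6d:70/00:00:74:00:00/e0 Emask 0x1 (device error)"
    )
    print(decode_bits(status, ATA_STATUS_BITS), decode_bits(error, ATA_ERROR_BITS))
    print(read10_lba_and_blocks("28 00 74 70 6d 00 00 00 08 00"))
```

For the values in this thread, 0x61 decodes to DRDY, DF and ERR, 0x04 decodes to ABRT, and the CDB gives LBA 1953524992 with 8 blocks, which matches sector 1953524992 and logical block 244190624 (4 KiB units) in the log.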


In reply to: comment by vvn_black

The self-test hasn't finished yet, but my guess is that it won't find any problems.

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Blue
Device Model:     WDC WD10EZEX-00BBHA0
Serial Number:    WD-WCC6Y7EXN7RF
LU WWN Device Id: 5 0014ee 214de31b9
Firmware Version: 01.01A01
User Capacity:    1 000 204 886 016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Aug 29 15:18:53 2023 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 244)	Self-test routine in progress...
					40% of test remaining.
Total time to complete Offline 
data collection: 		(10740) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 112) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   174   172   021    Pre-fail  Always       -       2291
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       183
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       64
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       176
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       6
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       182
194 Temperature_Celsius     0x0022   104   104   000    Old_age   Always       -       39
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

bo4ok
(thread author)
In reply to: comment by bo4ok

WD10EZEX

I have two of these in btrfs raid1 (6000+ power-on hours); the store warranty runs out at the end of the year.

I recently had a problem with the array too: after a self-test, one of the disks showed Current_Pending_Sector and Offline_Uncorrectable. Once the sectors were recovered, the Current_Pending_Sector counter dropped back to zero, and Reallocated_Event_Count stayed at zero. I decided to wait for now.

vvn_black ★★★★★
Last edited by: vvn_black (2 edits in total)
In reply to: comment by vvn_black

Hmm, the quality these days… 8k hours and it's already scary.

Device Model:     WDC WD4000F9YZ-09N20L0
  9 Power_On_Hours          0x0032   049   049   000    Old_age   Always       -       37365

Device Model:     WDC WD4000F9YZ-09N20L1
  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       26988

And these two couldn't care less…

pekmop1024 ★★★★★

I'm not sure what happened, but the disk decided to reset itself to zero :)

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Blue
Device Model:     WDC WD10EZEX-00BBHA0
Serial Number:    WD-WCC6Y7EXN7RF
LU WWN Device Id: 5 0014ee 214de31b9
Firmware Version: 01.01A01
User Capacity:    1 000 204 886 016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Wed Aug 30 00:41:52 2023 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(10740) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 112) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   253   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   100   253   021    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       1
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       0
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       1
194 Temperature_Celsius     0x0022   110   110   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   253   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%         0         -
# 2  Short offline       Completed without error       00%         0         -
# 3  Short offline       Interrupted (host reset)      90%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
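
To make the reset visible without comparing the two tables by eye, here is a minimal Python sketch that parses the attribute lines of two smartctl outputs and flags lifetime counters whose raw value went down. The list of counters treated as monotonic and the function names are assumptions for illustration; the sample lines are copied from the two outputs in this thread.

```python
import re

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw value.
ATTR_LINE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-f]{4}\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)

# Assumption: these raw counters should never decrease on a healthy drive.
LIFETIME_COUNTERS = (
    "Start_Stop_Count",
    "Power_On_Hours",
    "Power_Cycle_Count",
    "Power-Off_Retract_Count",
    "Load_Cycle_Count",
)


def parse_attributes(text):
    """Map attribute name to raw value for every attribute line in text."""
    attrs = {}
    for line in text.splitlines():
        match = ATTR_LINE.match(line)
        if match:
            attrs[match.group(2)] = int(match.group(3))
    return attrs


def find_counter_resets(before, after):
    """Return {name: (old, new)} for lifetime counters that went backwards."""
    return {
        name: (before[name], after[name])
        for name in LIFETIME_COUNTERS
        if name in before and name in after and after[name] < before[name]
    }


BEFORE = """
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       64
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       176
"""
AFTER = """
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1
"""

if __name__ == "__main__":
    print(find_counter_resets(parse_attributes(BEFORE), parse_attributes(AFTER)))
```

On the two snippets above it reports Power_On_Hours going from 64 to 0 and Power_Cycle_Count going from 176 to 1.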

bo4ok
(thread author)
In reply to: comment by bo4ok

How interesting. And the 64 hours shown in the previous post, are they real? Did the drive come like that from the factory?

And what does "smartctl -l scttemp" show: did the drive forget its temperature log, or is it at zero hours but still holding a pile of data from the past?

In general, it would have been better to look at and save "smartctl -x"; it has more useless letters in it :)
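
A minimal Python sketch of saving that output for later comparison. The device path /dev/sdc is taken from the kernel log earlier in the thread; the helper names and the output file naming are assumptions, and running smartctl normally requires root.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def smartctl_command(device, *options):
    """Build a smartctl command line; defaults to the extended report (-x)."""
    return ["smartctl", *(options or ("-x",)), device]


def save_smart_report(device, out_dir="."):
    """Run 'smartctl -x' on device and write its output to a timestamped file."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    out_path = Path(out_dir) / f"smart-{Path(device).name}-{stamp}.txt"
    # smartctl's exit status is a bitmask and can be nonzero even when the
    # report was produced, so the return code is deliberately not checked.
    result = subprocess.run(
        smartctl_command(device), capture_output=True, text=True, check=False
    )
    out_path.write_text(result.stdout)
    return out_path


if __name__ == "__main__":
    print(smartctl_command("/dev/sdc"))
    print(smartctl_command("/dev/sdc", "-l", "scttemp"))
```

Calling save_smart_report("/dev/sdc") before and after a reboot would leave two files that can be diffed, which is the kind of record that was missing when the counters reset.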

mky ★★★★★
In reply to: comment by fresa

I don't know anymore; the hardware there isn't the newest. Something like 4th-generation Intel.

Right now it's sitting there, slowly crapping errors into the log along with "ata5: limiting SATA link speed to 1.5 Gbps". The cable has been replaced, to no effect.

bo4ok
(thread author)
In reply to: comment by bo4ok

1.5 is already not normal if the port can do 6 or 3 (the board's documentation says which ports support which speeds). If the cable has been replaced and a different disk doesn't come up at full speed either, then the port may be dying.
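
One way to see how a port's link speed has changed over time is to scan the kernel log for the two message types quoted in this thread. A rough Python sketch follows; the function name is hypothetical, and the patterns only cover the exact wording of the "SATA link up" and "limiting SATA link speed" lines shown above.

```python
import re
from collections import defaultdict

LINK_UP = re.compile(r"(ata\d+): SATA link up ([\d.]+) Gbps")
LIMITING = re.compile(r"(ata\d+): limiting SATA link speed to ([\d.]+) Gbps")


def link_speed_history(dmesg_text):
    """Return {port: [(event, gbps), ...]} in the order the lines appear."""
    history = defaultdict(list)
    for line in dmesg_text.splitlines():
        for event, pattern in (("up", LINK_UP), ("limit", LIMITING)):
            match = pattern.search(line)
            if match:
                history[match.group(1)].append((event, float(match.group(2))))
    return dict(history)


SAMPLE = """\
[    0.973630] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata5: limiting SATA link speed to 1.5 Gbps
"""

if __name__ == "__main__":
    print(link_speed_history(SAMPLE))
```

Feeding it the full dmesg output with another disk on the same port would show whether the downshift follows the port or the drive.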

fresa
Last edited by: fresa (1 edit in total)
In reply to: comment by Pinkbyte

One possibility: the service area degraded right where SMART is stored. A sector goes bad, the service-area module can't be read, and so as not to scare the user (you can't send the disk back to the factory over such a trifle!) a clean SMART is initialized.

NiTr0 ★★★★★

The long-awaited update! \o/

The new disk worked for a week and then did this:

[  314.969459] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[  314.969466] ata5.00: irq_stat 0x40000001
[  314.969470] ata5.00: failed command: FLUSH CACHE EXT
[  314.969481] ata5.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 11
                        res 61/04:00:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)
[  314.969485] ata5.00: status: { DRDY DF ERR }
[  314.969489] ata5.00: error: { ABRT }
[  314.970665] ata5.00: failed to enable AA (error_mask=0x1)
[  314.972294] ata5.00: failed to enable AA (error_mask=0x1)
[  314.972305] ata5.00: configured for UDMA/133 (device error ignored)
[  314.972310] ata5.00: device reported invalid CHS sector 0
[  314.972330] sd 4:0:0:0: [sdc] tag#11 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[  314.972336] sd 4:0:0:0: [sdc] tag#11 Sense Key : Illegal Request [current] 
[  314.972341] sd 4:0:0:0: [sdc] tag#11 Add. Sense: Unaligned write command
[  314.972347] sd 4:0:0:0: [sdc] tag#11 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[  314.972355] blk_update_request: I/O error, dev sdc, sector 8 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[  314.972360] md: super_written gets error=10
[  314.972365] md/raid1:md127: Disk failure on sdc, disabling device.
               md/raid1:md127: Operation continuing on 1 devices.
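
After a message like this, the array state can be confirmed from /proc/mdstat, where a degraded array shows fewer active members than expected, for example [2/1]. Below is a small Python sketch of that check. The sample text is an illustrative mock-up rather than output from this machine (the second member name and the block count are invented), and the function name is hypothetical.

```python
import re

ARRAY_LINE = re.compile(r"^(md\w+)\s*:")
MEMBER_COUNT = re.compile(r"\[(\d+)/(\d+)\]")


def degraded_arrays(mdstat_text):
    """Return {array: (expected, active)} for arrays with missing members."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        match = ARRAY_LINE.match(line)
        if match:
            current = match.group(1)
            continue
        match = MEMBER_COUNT.search(line)
        if match and current:
            expected, active = int(match.group(1)), int(match.group(2))
            if active < expected:
                degraded[current] = (expected, active)
    return degraded


# Illustrative mock-up of /proc/mdstat for a two-disk mirror with one failed member.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md127 : active raid1 sda[0] sdc[1](F)
      976630464 blocks super 1.2 [2/1] [U_]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_MDSTAT))
```

On a live system the input would be the contents of /proc/mdstat, read for instance with pathlib.Path("/proc/mdstat").read_text().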

And yet SMART is, as usual, PASSED with no bad sectors, and the self-test, just like on the previous disk, completes without errors in 0 seconds. Full SMART output: https://hastebin.com/share/budavevadu.markdown

bo4ok
(thread author)