LINUX.ORG.RU

What was that?...


Suddenly everything froze: non-stop disk activity and
strange noises from the drive; after a few minutes it passed.
dmesg showed:

ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x80000 action 0x0
ata5.00: BMDMA stat 0x25
ata5: SError: { 10B8B }
ata5.00: cmd c8/00:18:bf:9a:e6/00:00:00:00:00/e0 tag 0 dma 12288 in
res 51/40:03:d4:9a:e6/00:00:00:00:00/e0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/133
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x80000 action 0x0
ata5.00: BMDMA stat 0x25
ata5: SError: { 10B8B }
ata5.00: cmd c8/00:18:bf:9a:e6/00:00:00:00:00/e0 tag 0 dma 12288 in
res 51/40:03:d4:9a:e6/00:00:00:00:00/e0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/133
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x80000 action 0x0
ata5.00: BMDMA stat 0x25
ata5: SError: { 10B8B }
ata5.00: cmd c8/00:18:bf:9a:e6/00:00:00:00:00/e0 tag 0 dma 12288 in
res 51/40:03:d4:9a:e6/00:00:00:00:00/e0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/133
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x80000 action 0x0
ata5.00: BMDMA stat 0x25
ata5: SError: { 10B8B }
ata5.00: cmd c8/00:18:bf:9a:e6/00:00:00:00:00/e0 tag 0 dma 12288 in
res 51/40:03:d4:9a:e6/00:00:00:00:00/e0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/133
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x80000 action 0x0
ata5.00: BMDMA stat 0x25
ata5: SError: { 10B8B }
ata5.00: cmd c8/00:18:bf:9a:e6/00:00:00:00:00/e0 tag 0 dma 12288 in
res 51/40:03:d4:9a:e6/00:00:00:00:00/e0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/133
ata5: EH complete
ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x80000 action 0x0
ata5.00: BMDMA stat 0x25
ata5: SError: { 10B8B }
ata5.00: cmd c8/00:18:bf:9a:e6/00:00:00:00:00/e0 tag 0 dma 12288 in
res 51/40:03:d4:9a:e6/00:00:00:00:00/e0 Emask 0x9 (media error)
ata5.00: status: { DRDY ERR }
ata5.00: error: { UNC }
ata5.00: configured for UDMA/133
sd 4:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE,SUGGEST_OK
sd 4:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
Descriptor sense data with sense descriptors (in hex):
72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
00 e6 9a d4
sd 4:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
end_request: I/O error, dev sda, sector 15112916
ata5: EH complete
sd 4:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
sd 4:0:0:0: [sda] Write Protect is off
sd 4:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 4:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
sd 4:0:0:0: [sda] Write Protect is off
sd 4:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA


and a lot more of the same.
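
The failing sector can be read straight out of the `res` line. A minimal sketch, assuming the fields are laid out as status/error:count:LBA-low:LBA-mid:LBA-high/(five high-order bytes)/device, and that the command is a 28-bit one (`c8`, READ DMA), so the top four LBA bits sit in the low nibble of the device byte. The name `decode_res` is made up for illustration.

```python
import re

# Matches the taskfile part of a libata "res" line, e.g.
# "res 51/40:03:d4:9a:e6/00:00:00:00:00/e0 Emask 0x9 (media error)"
RES_RE = re.compile(
    r"res ([0-9a-f]{2})/([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2}):"
    r"([0-9a-f]{2}):([0-9a-f]{2})/(?:[0-9a-f]{2}:){4}[0-9a-f]{2}/([0-9a-f]{2})"
)


def decode_res(line):
    """Decode status, error, sector count and 28-bit LBA from a "res" line.

    Assumed field order: status/error:count:lbal:lbam:lbah/<hob bytes>/device.
    """
    m = RES_RE.search(line)
    if m is None:
        raise ValueError("not a res line: %r" % line)
    status, error, count, lbal, lbam, lbah, device = (
        int(x, 16) for x in m.groups()
    )
    # LBA28: low nibble of the device byte holds bits 27..24.
    lba = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    return {
        "status": status,
        "error": error,
        "count": count,
        "lba": lba,
        "unc": bool(error & 0x40),  # bit 6 of the error register: UNC
    }


info = decode_res("res 51/40:03:d4:9a:e6/00:00:00:00:00/e0 Emask 0x9 (media error)")
print(hex(info["lba"]), info["lba"], info["unc"])
```

For the line from this log the result is LBA 0xe69ad4 = 15112916, the same sector the kernel names in the `end_request` message.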

Re: What was that?...

Check SMART, and scan the disk surface with something like MHDD.
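
If MHDD is not at hand, a crude read-only check can be sketched in a few lines. This is not how MHDD works (it also measures access time per block); it is just a stand-in that reads sectors one by one and records those the kernel refuses to return. The function name and the 512-byte default are assumptions for this sketch; reading a raw device needs root, and sectors already in the page cache may be served without touching the disk.

```python
import os


def find_unreadable_sectors(path, first_sector, count, sector_size=512):
    """Read `count` sectors one at a time and return those that fail.

    Read-only: nothing is written to the device. `path` would be a block
    device such as /dev/sda; any regular file works for a dry run.
    """
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for sector in range(first_sector, first_sector + count):
            try:
                data = os.pread(fd, sector_size, sector * sector_size)
            except OSError:
                # A failed read (typically EIO on a media error) marks the sector.
                bad.append(sector)
                continue
            if len(data) < sector_size:
                break  # ran past the end of the device or file
        return bad
    finally:
        os.close(fd)
```

A call such as `find_unreadable_sectors("/dev/sda", 15112900, 32)` would probe the neighbourhood of the sector named in the log above.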

devl547 ★★★★★ ()

Re: What was that?...

Yet another Seagate?

anonymous ()
In reply to: Re: What was that?... by wfrr

Re: What was that?...

>Some slocate or other suddenly kicked in.

Maybe wine quietly and unnoticeably updated itself. Well, almost unnoticeably.

sskirtochenko ★★ ()

Re: What was that?...

Looks like some kind of hardware/software incompatibility between the drive and the controller.
I think you should be looking at the controller/driver.
I had this glitch with a SATA-150 controller on a SiS chipset, and not with every drive. While it was happening, the UDMA_CRC_Error_Count value in SMART kept growing. Those very same drives now work just fine on a motherboard with an Intel controller.

isn ★★ ()

Re: What was that?...

Same trouble here.

ata1.01: qc timeout (cmd 0xa0)
ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata1.01: cmd a0/00:00:00:00:00/00:00:00:00:00/b0 tag 0
         cdb 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
         res 51/20:03:00:00:00/00:00:00:00:00/b0 Emask 0x5 (timeout)
ata1.01: status: { DRDY ERR }
ata1: link is slow to respond, please be patient (ready=0)
ata1: device not ready (errno=-16), forcing hardreset
ata1: soft resetting link
ata1.00: configured for UDMA/100
ata1.01: configured for UDMA/33
ata1: EH complete
sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors: (80.0 GB/74.5 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 156301488 512-byte hardware sectors: (80.0 GB/74.5 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA


Laptop,
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 02)
I tried another drive, and this one looks surprisingly fresh in mhdd (even though it is two years old and a Samsung).

What's more, with the old driver (the one that named the drives hda, hdb...) the problems used to go away, but in 2.6.28 it started hanging the kernel hard (uptime of 2-3 hours on average). With the driver that gives sda, sdb... the situation is stable: about once every half hour this thing shows up and applications hang when they touch the drive. It stays down for about 30 seconds, then comes back up as if nothing had happened.

ei-grad ★★★★★ ()

Re: What was that?...

Though the topic starter most likely does have bad blocks after all...

ei-grad ★★★★★ ()

Re: What was that?...

Oh, sorry, I forgot to post the system info...

The drive is a Hitachi HDS72161, the system is OpenSuSE 11.0, not updated.

Linux asu-mihas.medgorod.ru 2.6.25.5-1.1-pae #1 SMP 2008-06-07 01:55:22 +0200 i686 i686 i386 GNU/Linux

SMART shows:


asu-mihas:/home/mihas # smartctl -a /dev/sda
smartctl 5.39 2008-05-08 21:56 [i686-pc-linux-gnu] (local build)
Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Deskstar 7K160
Device Model:     Hitachi HDS721616PLA380
Serial Number:    PVF904Z5RUYZ6N
Firmware Version: P22OABEA
User Capacity:    160 041 885 696 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 1
Local Time is:    Mon Feb  2 07:53:24 2009 KRAT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (2865) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  48) minutes.
SCT capabilities:              (0x003f) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   089   089   016    Pre-fail  Always       -       4915202
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   131   131   024    Pre-fail  Always       -       163 (Average 141)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       32
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       13
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   020    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   099   099   000    Old_age   Always       -       11240
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       31
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       490
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       490
194 Temperature_Celsius     0x0002   153   153   000    Old_age   Always       -       39 (Lifetime Min/Max 20/47)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       14
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 414 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 414 occurred at disk power-on lifetime: 11221 hours (467 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

Ay49Mihas ★★★★ ()

Re: What was that?...

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 03 d4 9a e6 e0  Error: UNC 3 sectors at LBA = 0x00e69ad4 = 15112916

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 cf 9a e6 e0 08   4d+06:47:30.704  READ DMA
  27 00 00 00 00 00 e0 08   4d+06:47:30.704  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 08   4d+06:47:30.704  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08   4d+06:47:30.704  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 08   4d+06:47:30.704  READ NATIVE MAX ADDRESS EXT

Error 413 occurred at disk power-on lifetime: 11221 hours (467 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 03 d4 9a e6 e0  Error: UNC 3 sectors at LBA = 0x00e69ad4 = 15112916

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 cf 9a e6 e0 08   4d+06:47:26.004  READ DMA
  27 00 00 00 00 00 e0 08   4d+06:47:26.004  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 08   4d+06:47:26.004  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08   4d+06:47:26.004  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 08   4d+06:47:26.004  READ NATIVE MAX ADDRESS EXT

Error 412 occurred at disk power-on lifetime: 11221 hours (467 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 03 d4 9a e6 e0  Error: UNC 3 sectors at LBA = 0x00e69ad4 = 15112916

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 cf 9a e6 e0 08   4d+06:47:21.304  READ DMA
  27 00 00 00 00 00 e0 08   4d+06:47:21.304  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 08   4d+06:47:21.304  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08   4d+06:47:21.304  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 08   4d+06:47:21.304  READ NATIVE MAX ADDRESS EXT

Error 411 occurred at disk power-on lifetime: 11221 hours (467 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 03 d4 9a e6 e0  Error: UNC 3 sectors at LBA = 0x00e69ad4 = 15112916

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 cf 9a e6 e0 08   4d+06:47:16.504  READ DMA
  27 00 00 00 00 00 e0 08   4d+06:47:16.504  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 08   4d+06:47:16.504  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08   4d+06:47:16.504  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 08   4d+06:47:16.504  READ NATIVE MAX ADDRESS EXT

Error 410 occurred at disk power-on lifetime: 11221 hours (467 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 03 d4 9a e6 e0  Error: UNC 3 sectors at LBA = 0x00e69ad4 = 15112916

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 cf 9a e6 e0 08   4d+06:47:11.804  READ DMA
  27 00 00 00 00 00 e0 08   4d+06:47:11.804  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 08   4d+06:47:11.804  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 08   4d+06:47:11.804  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 08   4d+06:47:11.804  READ NATIVE MAX ADDRESS EXT

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
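
The attribute table in the smartctl output above can be boiled down to the few counters that matter here. A minimal sketch, assuming the ten-column layout shown in that table; the choice of "watched" attributes is my own shortlist for illustration, not something smartctl defines.

```python
# Attributes whose non-zero raw value usually deserves a closer look.
# The selection is an assumption made for this sketch, not a smartctl rule.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Reallocated_Event_Count",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}


def nonzero_watched(smartctl_text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Rows: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


sample = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       13
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       14
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       0
"""
print(nonzero_watched(sample))
```

On these rows it picks out 13 reallocated sectors, 14 reallocation events and 1 pending sector, while UDMA_CRC_Error_Count stays at 0, which fits a media problem better than a cable or controller one.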

Ay49Mihas ★★★★ ()
In reply to: Re: What was that?... by anonymous

Re: What was that?...

There's your much-praised Hitachi. Should have gone with WD.

anonymous ()