LINUX.ORG.RU

Is my hard drive dying?



Just now, under heavy load (linuxdcpp was scanning a pile of files while I was doing something else as well), odd things started happening :)
Then the machine froze. After rebooting I found entries like these in the log:

Device: /dev/sda, is SMART capable. Adding to "monitor" list.
Monitoring 1 ATA and 0 SCSI devices
Device: /dev/sda, 1 Currently unreadable (pending) sectors

Then I started linuxdcpp again and got this in the kernel log:

ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.01: BMDMA stat 0x65
ata1.01: cmd 25/00:08:55:50:5a/00:00:11:00:00/f0 tag 0 cdb 0x0 data 4096 in
res 51/40:01:5c:50:5a/40:00:11:00:00/f0 Emask 0x9 (media error)
ata1.01: configured for UDMA/100
ata1: EH complete
ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata1.01: BMDMA stat 0x65
ata1.01: cmd 25/00:08:55:50:5a/00:00:11:00:00/f0 tag 0 cdb 0x0 data 4096 in
res 51/40:01:5c:50:5a/40:00:11:00:00/f0 Emask 0x9 (media error)
ata1.01: configured for UDMA/100
ata1: EH complete
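
For what it's worth, the cmd and res lines in the kernel log above can be decoded by hand. The sketch below assumes libata prints the taskfile as command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, which is how I read the libata error-handling code; the helper names (parse_taskfile, lba48) are made up for illustration. In a res line the first two bytes are the status and error registers rather than command and feature.

```python
# Assumed field order of a libata "cmd"/"res" taskfile dump (see lead-in).
FIELDS_LOW = ("feature", "nsect", "lbal", "lbam", "lbah")
FIELDS_HOB = ("hob_feature", "hob_nsect", "hob_lbal", "hob_lbam", "hob_lbah")


def parse_taskfile(text):
    """Split 'cc/ff:nn:l0:l1:l2/hf:hn:h0:h1:h2/dd' into named byte values."""
    command, low, hob, device = text.split("/")
    regs = {"command": int(command, 16), "device": int(device, 16)}
    for name, value in zip(FIELDS_LOW, low.split(":")):
        regs[name] = int(value, 16)
    for name, value in zip(FIELDS_HOB, hob.split(":")):
        regs[name] = int(value, 16)
    return regs


def lba48(regs):
    """Assemble the 48-bit LBA, most significant byte first."""
    lba = 0
    for name in ("hob_lbah", "hob_lbam", "hob_lbal", "lbah", "lbam", "lbal"):
        lba = (lba << 8) | regs[name]
    return lba


# Values copied from the kernel log in this thread.
cmd = parse_taskfile("25/00:08:55:50:5a/00:00:11:00:00/f0")
res = parse_taskfile("51/40:01:5c:50:5a/40:00:11:00:00/f0")

first = lba48(cmd)
count = (cmd["hob_nsect"] << 8) | cmd["nsect"]
failed = lba48(res)

print(f"read of {count} sectors starting at LBA {first:#x}")
print(f"failed at LBA {failed:#x} = {failed}")
```

If that layout is right, the kernel asked for 8 sectors starting at LBA 0x115a5055 and the drive gave up on the last one, 0x115a505c. The same sector fails on every retry, which points at one bad spot rather than a cable or controller problem.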


I'd like to check that every file in a directory I point it at, and everything below it, is readable. How do I do that?
Something like cp ./dir -Recurce /dev/null
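
One possible way to do this, as a minimal sketch rather than a definitive tool: walk the tree and read every regular file to the end, collecting the paths that raise an I/O error. The function name find_unreadable and the 1 MiB chunk size are arbitrary choices for illustration, and ./dir is just the placeholder path from the question.

```python
import os


def find_unreadable(root, chunk_size=1 << 20):
    """Read every regular file under root; return a list of (path, error)."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, FIFOs, sockets and device nodes: only real
            # files occupy sectors we want to exercise.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    # Read and discard the data chunk by chunk.
                    while handle.read(chunk_size):
                        pass
            except OSError as exc:
                failures.append((path, exc))
    return failures


for path, exc in find_unreadable("./dir"):
    print(f"UNREADABLE {path}: {exc}")
```

Keep in mind that files read recently may be served from the page cache instead of the disk, so a run right after a reboot is more telling. This also only touches sectors that belong to files; free space and filesystem metadata are not checked this way.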

OS: SUSE 10.3
Drive: WD 320 GB, bought about half a year ago; never carried around, never dropped.


Follow-up: this has only just shown up in the log:

Nov 20 21:25:35 smartd[3941]: Device: /dev/sda, 1 Currently unreadable (pending) sectors
Nov 20 21:25:35 smartd[3941]: Device: /dev/sda, ATA error count increased from 0 to 24
Nov 20 21:25:35 smartd[3941]: Sending warning via /usr/lib/smartmontools/smart-notify to root@localhost ...
Nov 20 21:25:35 smartd[3941]: Warning via /usr/lib/smartmontools/smart-notify to root@localhost produced unexpected output (53 bytes)
Nov 20 21:25:35 smartd[3941]: Warning via /usr/lib/smartmontools/smart-notify to root@localhost: successful

e5150
() topic author

Recent WD drives have actually been pretty good. I have the same one myself as one of my external drives for home use, and no problems at all.

You could also try playing around with libata. If that doesn't help, then it really is a problem with the disk.

MiracleMan ★★★★★
()

Run mhdd on it. Just read the manual first. I had the same SMART message recently; it turned out to be a logical bad block on a single sector. Symptoms: on reading that sector the heads click and the whole system hangs. It was cured by a scan with the bottom-most option enabled. I don't remember what it's called, Erase SOMETHING-OR-OTHER.

MadCAD ★★
()

What does smartctl --all /dev/sda show?

Then again, this does happen with hard drives :(

anonymous_incognito ★★★★★
()
In reply to: comment by MiracleMan

> Recent WD drives have actually been pretty good.

Quiet and fast, yes. As for reliability, though, right now there is a wave of forum posts along the lines of "my WD-*AAKS, bought six months to a year ago, has died".

anonymous
()
In reply to: comment by MadCAD

> It was cured by a scan with the bottom-most option enabled. I don't remember what it's called, Erase SOMETHING-OR-OTHER.

It's called Remap, you lousy adviser :)

Deleted
()
In reply to: comment by anonymous

> Quiet and fast, yes. As for reliability, though, right now there is a wave of forum posts along the lines of "my WD-*AAKS, bought six months to a year ago, has died".

Well, of course: it's Windows users writing that. And what Windows user is going to bother giving their drives proper operating conditions? So you end up with drives dying in droves inside a case at over 50 degrees, in air heated by two or three "3D accelerators".

Gharik
()
In reply to: comment by MadCAD

I've just come home and switched the machine on, and there are errors in the log again:
smartd[3941]: Device: /dev/sda, ATA error count increased from 24 to 60

A small question: my drive is IDE, so why does it show up as sda?


Output of smartctl --all /dev/sda:

smartctl version 5.37 [i686-suse-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family: Western Digital Caviar SE family
Device Model: WDC WD3200JB-00KFA0
Serial Number: WD-WCAMR3227228
Firmware Version: 08.05J08
User Capacity: 320 072 933 376 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 6
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Wed Nov 21 20:07:39 2007 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (9600) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
No General Purpose Logging support.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 116) minutes.
Conveyance self-test routine
recommended polling time: ( 6) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   218   183   021    Pre-fail  Always       -       4075
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       609
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   098   098   000    Old_age   Always       -       1770
 10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       596
194 Temperature_Celsius     0x0022   128   098   000    Old_age   Always       -       22
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 60 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
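
To pick the interesting rows out of an attribute table like the one above without reading it by eye, something along these lines might do. It is only a sketch: it assumes the ten-column layout that smartctl 5.37 prints here, and the set of IDs to watch (5, 196, 197, 198, the reallocation and pending-sector counters) is my own choice, not anything smartctl prescribes. The function names are invented for the example.

```python
# Attribute IDs treated as bad-sector indicators (an assumption, see lead-in).
WATCHED = {5, 196, 197, 198}


def parse_attributes(lines):
    """Turn smartctl attribute rows into dicts; skip headers and other text."""
    rows = []
    for line in lines:
        parts = line.split(None, 9)
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "worst": int(parts[4]),
            "thresh": int(parts[5]),
            "raw": parts[9].strip(),
        })
    return rows


def suspicious(rows):
    """Return (name, raw) for watched attributes with a non-zero raw value."""
    found = []
    for row in rows:
        raw = row["raw"].split()[0]
        if row["id"] in WATCHED and raw.isdigit() and int(raw) > 0:
            found.append((row["name"], int(raw)))
    return found


# A few rows copied from the output in this post.
SAMPLE = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
194 Temperature_Celsius 0x0022 128 098 000 Old_age Always - 22
197 Current_Pending_Sector 0x0012 200 200 000 Old_age Always - 1
198 Offline_Uncorrectable 0x0010 200 200 000 Old_age Offline - 0
"""

for name, raw in suspicious(parse_attributes(SAMPLE.splitlines())):
    print(f"{name}: raw value {raw}")
```

On the rows from this post it singles out Current_Pending_Sector with a raw value of 1, which matches what smartd has been logging: one sector waiting to be rewritten or remapped, nothing reallocated so far.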

e5150
() topic author
In reply to: comment by e5150

Part two:

------------------------------------------------------------------------------- ----------------
Error 60 occurred at disk power-on lifetime: 1769 hours (73 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 5c 50 5a f0 Error: UNC 1 sectors at LBA = 0x005a505c = 5918812

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 08 55 50 5a 11 59 01:41:20.635 READ DMA EXT
27 00 00 00 00 00 00 59 01:41:20.630 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 00 59 01:41:20.625 IDENTIFY DEVICE
ef 03 45 00 00 00 00 59 01:41:20.615 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 00 59 01:41:20.615 READ NATIVE MAX ADDRESS EXT

Error 59 occurred at disk power-on lifetime: 1769 hours (73 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 5c 50 5a f0 Error: UNC 1 sectors at LBA = 0x005a505c = 5918812

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 08 55 50 5a 11 59 01:41:18.115 READ DMA EXT
27 00 00 00 00 00 00 59 01:41:18.115 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 00 59 01:41:18.105 IDENTIFY DEVICE
ef 03 45 00 00 00 00 59 01:41:18.095 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 00 59 01:41:18.095 READ NATIVE MAX ADDRESS EXT

Error 58 occurred at disk power-on lifetime: 1769 hours (73 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 5c 50 5a f0 Error: UNC 1 sectors at LBA = 0x005a505c = 5918812

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 08 55 50 5a 11 59 01:41:16.070 READ DMA EXT
27 00 00 00 00 00 00 59 01:41:16.070 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 00 59 01:41:16.060 IDENTIFY DEVICE
ef 03 45 00 00 00 00 59 01:41:16.055 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 00 59 01:41:16.055 READ NATIVE MAX ADDRESS EXT

Error 57 occurred at disk power-on lifetime: 1769 hours (73 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 5c 50 5a f0 Error: UNC 1 sectors at LBA = 0x005a505c = 5918812

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 08 55 50 5a 11 59 01:41:14.025 READ DMA EXT
27 00 00 00 00 00 00 59 01:41:14.025 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 00 59 01:41:14.015 IDENTIFY DEVICE
ef 03 45 00 00 00 00 59 01:41:14.010 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 00 59 01:41:14.010 READ NATIVE MAX ADDRESS EXT

Error 56 occurred at disk power-on lifetime: 1769 hours (73 days + 17 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 01 5c 50 5a f0 Error: UNC 1 sectors at LBA = 0x005a505c = 5918812

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
25 00 08 55 50 5a 11 59 01:41:11.995 READ DMA EXT
27 00 00 00 00 00 00 59 01:41:11.995 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 00 59 01:41:11.985 IDENTIFY DEVICE
ef 03 45 00 00 00 00 59 01:41:11.985 SET FEATURES [Set transfer mode]
27 00 00 00 00 00 00 59 01:41:11.985 READ NATIVE MAX ADDRESS EXT

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

e5150
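() topic author
In reply to: comment by e5150

> A small question: my drive is IDE, so why does it show up as sda?

Because libata is more capable these days. Your drive is being handled by a libata PATA driver, and libata exposes disks through the SCSI layer, so it gets named sda rather than hda.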

Deleted
()