Hello!
I've been using Linux Mint for about 3 months now without any problems (except for fglrx...), but now I've run into a serious problem. However, I'm not sure whether it's Mint's fault or not.
I have a new 2 TB Western Digital Caviar Green HDD (WD20EARX); it's 3 months old, just like my Mint installation. It used to have an MBR partition table and a single 2 TB NTFS partition. Yesterday when I woke up I found nothing on it. When I looked at it with HD Sentinel (from Windows), it said the disk was half-formatted with ext4 (a full format, not a quick one!), and that this had happened the previous day at ~20:30. However, at that time the only things running were Linux Mint and a Chromium browser (without root rights, of course). I didn't notice anything: neither a window asking "is it okay if I kill your HDD?" nor any suspicious sounds of heavy disk usage due to formatting. The HDD's SMART looks 98% fine too. I shut down the system at 21:15:38 (Oct 29). The HDD was mounted the whole day, just as it had been for the previous 3 months, when nothing went wrong. I used it that day too and, as far as I remember, I last used it (both writing and reading) at ~20:00.
I also can't find any error message related to this, only one from ~17:45, from syslog:
Oct 29 17:44:17 MikiMint kernel: [21655.808037] ata5.00: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen
Oct 29 17:44:17 MikiMint kernel: [21655.808042] ata5: SError: { RecovComm PHYRdyChg CommWake DevExch }
Oct 29 17:44:17 MikiMint kernel: [21655.808044] ata5.00: failed command: SMART
Oct 29 17:44:17 MikiMint kernel: [21655.808048] ata5.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0
Oct 29 17:44:17 MikiMint kernel: [21655.808049] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Oct 29 17:44:17 MikiMint kernel: [21655.808051] ata5.00: status: { DRDY }
Oct 29 17:44:17 MikiMint kernel: [21655.808058] ata5: hard resetting link
Oct 29 17:44:18 MikiMint kernel: [21656.688120] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Oct 29 17:44:18 MikiMint kernel: [21656.745126] ata5.00: configured for UDMA/133
Oct 29 17:44:18 MikiMint kernel: [21656.745150] ata5: EH complete
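For reference, the SErr value in the excerpt above (0x4050002) is a bitmask, and the names in braces on the next line are its decoded flags. Here is a minimal Python sketch of that decoding. The bit-to-name table is an assumption based on the SATA SError register layout and the short names libata prints, so check it against your kernel source before relying on it; `decode_serror` is a hypothetical helper, not part of any tool.

```python
# Bit positions of the SATA SError register, with the short names the
# kernel prints. Assumed from libata; verify against your kernel source.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all known flags set in an SErr value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items()) if value & (1 << bit)]


# SErr value taken from the syslog excerpt above.
print(decode_serror(0x4050002))
# → ['RecovComm', 'PHYRdyChg', 'CommWake', 'DevExch']
```

The result matches the kernel's own "SError: { RecovComm PHYRdyChg CommWake DevExch }" line. All four flags describe events on the SATA link itself (link state change, wake-up, device exchange) rather than a bad sector.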
SMART says ~20 days online and ~500 read errors; the only other non-zero values are power on/off related.
Luckily 99% of the data was backed up, so I got away with it this time, but I'm afraid that it might happen again...
Is it possible that Mint noticed a serious HDD error and then tried to fix it without asking(!) first?
Any ideas why it happened?
Also, what other log files, data, etc. should I attach?
Thanks in advance!
Garmine
P.s.:
I've checked my other HDDs' SMART:
A 1 TB drive that has run for 1.2 years reports 1 read error,
while the other, an 80 GB drive that has run for 1 year, reports 0 read errors,
but the new one has 500 read errors in only 20 days!
Please tell me I'm wrong, but in that case it's an HDD fault, isn't it? :S
[Solved-hopefully]Hard Disk error?
Last edited by LockBot on Wed Dec 28, 2022 7:16 am, edited 2 times in total.
Reason: Topic automatically closed 6 months after creation. New replies are no longer allowed.
Re: Hard Disk error?
It's a hardware error: either in the HDD's SATA micro-controller, in the power-supply interface, or a bad/faulty SATA cable connection.
Re: Hard Disk error?
Thanks!
So I've replaced the SATA and power supply cables. I also plugged the SATA cable into another port. But I still don't get why HD Sentinel showed the destroyed partition as "ext4". That's why I suspect Mint too. I don't think my motherboard or power supply is faulty, since that would cause errors elsewhere (e.g. on the other drives), but there's no sign of that.
The inside of the machine was clean too; there was only a little dust (I clean it regularly).
But can a faulty cable kill a whole partition (especially like this)?
Re: Hard Disk error?
Oct 29 17:44:17 MikiMint kernel: [21655.808049] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x14 (ATA bus error)
Oct 29 17:44:17 MikiMint kernel: [21655.808051] ata5.00: status: { DRDY }
Oct 29 17:44:17 MikiMint kernel: [21655.808058] ata5: hard resetting link
What does "smartctl -A /dev/sdX" give?
Also, I don't really understand this part:
"It used to have MBR and a single 2 TB NTFS partition. Yesterday when I woke up I found nothing on it, after looking at it with HD Sentinel (from a Windows) it said that it's half-formatted (and with full format, not with quick!) with ext4 which happened the previous day at ~20:30."
What is the actual volume format now: NTFS or ext4?
Re: Hard Disk error?
Thank you!
eanfrid wrote: What is now the actual volume format: NTFS or EXT4 ?
It had a single NTFS partition before.
After the error it was unusable (with anything); however, the program said it looked like an unfinished ext4. But you're probably right: it didn't recognize the corrupted partition table, and that's why it showed it like that.
Yesterday I reformatted the whole disk to NTFS, after changing the cables. I also copied the data from the old drive again (~800 GB), and the number of read errors didn't increase. So it looks fine for now.
So it's NTFS right now.
smartctl output:
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   158   155   051    Pre-fail  Always       -       502
  3 Spin_Up_Time            0x0027   165   163   021    Pre-fail  Always       -       6741
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       181
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       484
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       159
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       47
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       8034
194 Temperature_Celsius     0x0022   124   106   000    Old_age   Always       -       26
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
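A table like the one above can also be checked by a script instead of by eye. The following Python sketch parses "smartctl -A" style attribute rows and reports two things: any normalized VALUE at or below its THRESH, and any non-zero raw count on a few attributes. The `WATCHED` list is only a common rule of thumb (an assumption, not something smartctl defines), and `parse_attributes` / `find_problems` are hypothetical helper names. The sample rows are copied from the output above.

```python
# A few rows copied from the smartctl output in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 158 155 051 Pre-fail Always - 502
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
"""

# Assumption: attributes whose raw counts are commonly expected to stay at 0.
# Raw_Read_Error_Rate is deliberately left out: its raw value is vendor-specific.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)


def parse_attributes(text):
    """Map attribute name -> dict with normalized value, worst, thresh and raw string."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split(None, 9)  # the raw value may itself contain spaces
        if len(fields) == 10 and fields[0].isdigit():
            attrs[fields[1]] = {
                "value": int(fields[3]),
                "worst": int(fields[4]),
                "thresh": int(fields[5]),
                "raw": fields[9].strip(),
            }
    return attrs


def find_problems(attrs):
    """Return human-readable warnings; an empty list means nothing was flagged."""
    problems = []
    for name, a in attrs.items():
        if a["thresh"] > 0 and a["value"] <= a["thresh"]:
            problems.append(f"{name}: value {a['value']} <= threshold {a['thresh']}")
        if name in WATCHED and a["raw"].split()[0] != "0":
            problems.append(f"{name}: raw count is {a['raw']}")
    return problems


print(find_problems(parse_attributes(SAMPLE)))
```

On the rows above nothing is flagged: all watched raw counts are 0 and every normalized value is well above its threshold. An empty result only means these particular checks passed; it does not prove the drive is healthy.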
Re: Hard Disk error?
Yep, seems to be fine again. But keep an eye on SMART and ATA-bus errors: hardware (even brand new) can fail at any time... Linux is quite good at monitoring these hardware errors and does its best to make faulty hardware work. You can also install smart-notifier and gsmartcontrol (smartctl GUI) if you wish
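One way to keep an eye on ATA-bus errors is to count them per port in the kernel log from time to time. Below is a minimal Python sketch under some assumptions: the regular expressions are modelled only on the log lines quoted in this thread, so other kernel versions may word things differently, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns modelled on the syslog lines quoted in this thread (assumption:
# your kernel uses the same wording).
EXCEPTION_RE = re.compile(r"\b(ata\d+)(?:\.\d+)?: exception Emask 0x[0-9a-f]+")
RESET_RE = re.compile(r"\b(ata\d+): (?:hard|soft) resetting link")

# Lines copied from the syslog excerpt in the first post.
SAMPLE_LINES = [
    "Oct 29 17:44:17 MikiMint kernel: [21655.808037] ata5.00: exception Emask 0x10 SAct 0x0 SErr 0x4050002 action 0xe frozen",
    "Oct 29 17:44:17 MikiMint kernel: [21655.808044] ata5.00: failed command: SMART",
    "Oct 29 17:44:17 MikiMint kernel: [21655.808058] ata5: hard resetting link",
    "Oct 29 17:44:18 MikiMint kernel: [21656.745150] ata5: EH complete",
]


def summarize(lines):
    """Count ATA exceptions and link resets per port (e.g. 'ata5')."""
    exceptions = Counter()
    resets = Counter()
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            exceptions[match.group(1)] += 1
            continue
        match = RESET_RE.search(line)
        if match:
            resets[match.group(1)] += 1
    return exceptions, resets


exceptions, resets = summarize(SAMPLE_LINES)
print("exceptions:", dict(exceptions))
print("link resets:", dict(resets))
```

To run it on a real log, pass an open file instead of `SAMPLE_LINES`, for example `summarize(open("/var/log/syslog", errors="replace"))`; the path is an assumption and depends on the distribution. If the counts for one port keep growing after the cables have been replaced, that points at the drive or the controller rather than the cabling.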
Re: Hard Disk error?
Thank you!
I've also updated the topic title to "[Solved-hopefully]".