Has anyone experienced Palimpsest discrepancies?
Palimpsest in Fedora 16 reports a disk (MAXTOR STM3320620AS) as failing: "DISK HAS MANY BAD SECTORS" etc. SMART attribute 197 (Current Pending Sector Count): value -4 sectors.
Palimpsest in Fedora 17 beta reports the same disk (MAXTOR STM3320620AS) as "OK". SMART attribute 197 (Current Pending Sector Count): 0 sectors, OK.
Obviously one is incorrect, which one?
Regards.
On 30/04/12 11:35 AM, ergodic wrote:
Has anyone experienced Palimpsest discrepancies?
Palimpsest in Fedora 16 reports a disk (MAXTOR STM3320620AS) as failing: "DISK HAS MANY BAD SECTORS" etc. SMART attribute 197 (Current Pending Sector Count): value -4 sectors.
Palimpsest in Fedora 17 beta reports the same disk (MAXTOR STM3320620AS) as "OK". SMART attribute 197 (Current Pending Sector Count): 0 sectors, OK.
Obviously one is incorrect, which one?
Try as root:
smartctl -a /dev/sdX
(where X is of course pointing to your disk)
You can also run a self-test of the disk. This is generally non-destructive, but if your disk has bad sectors, any further operations may cause additional damage, so if there is data on the disk you want to recover... you may want to create an image of the disk first:
smartctl -t short /dev/sdX
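As a minimal sketch of what to look for in the output (the attribute line below is sample data, not from a live disk), attribute 197's RAW_VALUE column is the pending-sector count:

```shell
# Sample line in the format smartctl -A prints for attribute 197;
# the last field (RAW_VALUE) is the pending-sector count.
sample='197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0'
echo "$sample" | awk '$1 == 197 { print $NF }'
```

A healthy disk should show 0 there; any nonzero value means sectors are waiting to be remapped.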
Dariusz
Palimpsest in Fedora 16 reports a disk (MAXTOR STM3320620AS) as failing: "DISK HAS MANY BAD SECTORS" etc. SMART attribute 197 (Current Pending Sector Count): value -4 sectors.
Palimpsest in Fedora 17 beta reports the same disk (MAXTOR STM3320620AS) as "OK". SMART attribute 197 (Current Pending Sector Count): 0 sectors, OK.
Obviously one is incorrect, which one?
They're both suspect. The Fedora 16 "Value -4" (that is, "-4" ==> 0xFFFFFFFC) is probably a bug [although _where_ is unknown: it could be palimpsest, the interface hardware, the drive microcode, ...]. The Fedora 17 "OK" is also suspect. Cross-check with "hdparm" and other tools, even on another OS.
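That "-4" ==> 0xFFFFFFFC reading can be reproduced in shell arithmetic; a sketch, assuming the bug is an unsigned raw value being squeezed through a signed 32-bit integer somewhere in the reporting chain:

```shell
# SMART raw values are unsigned; if a tool treats the low 32 bits as a
# signed int, 0xFFFFFFFC (4294967292) comes out as -4.
raw=$(( 0xFFFFFFFC ))
echo "$raw"                                               # unsigned reading
echo $(( raw >= 2147483648 ? raw - 4294967296 : raw ))    # signed reading
```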
How old is the drive? That model number says 3-platter, 320GB, SATA. If it's three or more years old, then just replace it. A new drive is $90 or less in the US, and 500GB is available at the same price with the same other specs (size, power, heat, performance, ...). The data is worth far more than that.
--
On Mon, 2012-04-30 at 12:04 -0700, John Reiser wrote:
How old is the drive? That model number says 3-platter, 320GB, SATA. If it's three or more years old, then just replace it. A new drive is $90 or less in the US, and 500GB is available at the same price with the same other specs (size, power, heat, performance, ...). The data is worth far more than that.
Dunno about that attitude being universal. On my workstations I use NFS homes, so there is no data, only code. I depend on SMART being right often enough that I can usually yank a failing drive before it gets so bad the worker can't log in and use the machine. But I couldn't care less about any data on it; it is only a clone, because NFS root is just too much of a performance hit compared to the cheapness of drives these days. I try to keep one spare workstation ready to drop in place of a failed unit, so downtime is minimal no matter what manages to go wrong with one.
Because a failure isn't a big deal and SMART is usually pretty good about giving advance warning, I run them until they die. In fact, if the failure is just a few bad blocks, I often zero the drive (or run the manufacturer's low-level tool), and if the SMART warning goes away I put it back in service and usually get a few more years out of it. My average drive age is easily over five years on workstations. I still have quite a few 40GB IDE drives in service. A 500GB drive might be dirt cheap now, but a fairly complete Linux install still fits on a 40GB drive, so why throw out perfectly serviceable equipment?
If your data is worth anything it shouldn't be subject to loss of a drive regardless of age.
I. THOU SHALT MAKE BACKUPS
II. THOU SHALT KEEP THY BACKUPS CURRENT
III. THOU SHALT VERIFY THY BACKUPS
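A minimal sketch of the third commandment (temp files stand in here for a real source and backup; a real run would compare your actual data against the backup copy):

```shell
# Make a file, "back it up", then verify the backup byte-for-byte.
src=$(mktemp); bak=$(mktemp)
echo "important data" > "$src"
cp "$src" "$bak"                  # the backup step
if cmp -s "$src" "$bak"; then     # cmp -s: silent byte-for-byte compare
    echo "backup verified"
else
    echo "backup CORRUPT"
fi
rm -f "$src" "$bak"
```

An unverified backup is just a hope; cmp (or a checksum comparison) turns it into a fact.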
On Tue, 01 May 2012 13:03:18 -0500 John Morris wrote:
I depend on SMART being right enough so I can usually yank a failing drive before it goes so bad the worker can't login and use the machine.
Of course, it was the SMART firmware on my Crucial SSD drive that made it break after 5184 hours of operation :-). (Fortunately new firmware fixed things).
On 05/01/2012 08:18 PM, Tom Horsley wrote:
On Tue, 01 May 2012 13:03:18 -0500 John Morris wrote:
I depend on SMART being right enough so I can usually yank a failing drive before it goes so bad the worker can't login and use the machine.
Of course, it was the SMART firmware on my Crucial SSD drive that made it break after 5184 hours of operation :-). (Fortunately new firmware fixed things).
I got hit by this too a few weeks ago. Quite nasty bug in that it will hit every Crucial M4 owner eventually and renders the drive completely useless but luckily also harmless because it doesn't damage anything and a simple firmware update fixes the problem completely with no data loss.
Regards, Dennis
On Mon, 2012-04-30 at 13:35 -0400, ergodic wrote:
Has anyone experienced Palimpsest discrepancies?
Palimpsest in Fedora 16 reports a disk (MAXTOR STM3320620AS) as failing: "DISK HAS MANY BAD SECTORS" etc. SMART attribute 197 (Current Pending Sector Count): value -4 sectors.
Palimpsest in Fedora 17 beta reports the same disk (MAXTOR STM3320620AS) as "OK". SMART attribute 197 (Current Pending Sector Count): 0 sectors, OK.
Obviously one is incorrect, which one?
Regards.
I've got both F16 and F17 and have never seen anything like that. Check and make sure you have the correct kernel modules installed.
On Mon, 2012-04-30 at 13:35 -0400, ergodic wrote:
Has anyone experienced Palimpsest discrepancies?
Palimpsest in Fedora 16 reports a disk (MAXTOR STM3320620AS) as failing: "DISK HAS MANY BAD SECTORS" etc. SMART attribute 197 (Current Pending Sector Count): value -4 sectors.
Palimpsest in Fedora 17 beta reports the same disk (MAXTOR STM3320620AS) as "OK". SMART attribute 197 (Current Pending Sector Count): 0 sectors, OK.
Obviously one is incorrect, which one?
Interpretation of the SMART data can be tricky. There's a longstanding bug where palimpsest considers numbers of 'bad' sectors that are well within manufacturers' tolerances to indicate a failing disk:
https://bugzilla.redhat.com/show_bug.cgi?id=498115
and it's possible you hit that in F16 and it's been corrected, at least for your particular disk, in F17. That '-4' does seem odd, though, as another poster mentioned. The bug report linked above has some useful diagnostic steps you can take.
Finally back in town again and back to Palimpsest. A little background first. My box is a Core 2 Quad Q6600 Kentsfield at 2.4GHz, Asus P5K mobo, 6GB RAM, XFX GeForce 8600 video card and two hard drive drawers. For OS, Grub1 multi-booting Fedora 15, 16, 17β, Win-XP and Win-7.
By January 2011, Fedora 14 reported for the Maxtor 320GB drive: “DISK HAS MANY BAD SECTORS Age: 136.0 days PwrCycles: 1285 Bad sectors: -4”, with a recommendation for immediate back-up and replacement. At that time a Seagate ST1000 1TB drive was installed internally in the box and partitioned to accept various data partitions and a Fedora OS to be used as a back-up OS. The system normally operates from one of the OSs in a drive slipped in one of the drive drawers. All data in the Maxtor 320GB was moved to the Seagate drive and is periodically imaged to another external drive via a drawer.
The Maxtor 320 GB drive was imaged to a new WD 320GB drive for back-up. The Maxtor was kept in service to see when it would fail. So far it has not!
What has changed in F-17β from F-14, F-15 and F-16 to change the report to OK with 0 bad sectors?
All three previous Fedoras still report for the disk as of today: “DISK HAS MANY BAD SECTORS Age: 248.3 days PwrCycles: 3370 Bad sectors: -4”. This amounts to 2695.2 hours and 2085 power cycles since the first warning.
A Samsung 160GB that was removed from service on 2010-11-07 at age 90.7 days, with 1115 power cycles and 12 bad sectors, shows under F-14, F-15 and F-16 “Disk has a few bad sectors”. With F-17β it shows “Disk is OK Age: 104.7 days Power Cycles: 1271 Bad Sectors: 12 Pending Sectors: 0”.
hdtune-224 in Windows XP reports the Maxtor as OK.
My question is one of trust, which one is accurate?
However, I am inclined to believe that this is simply the way Maxtor reports the parameter in question.
Thanks to all,
Cheers.