I decided, after the last discussion of smartd and S.M.A.R.T. disks, to take a look in my /var/log/messages, and I'm seeing a fair bit of this:

Sep 10 20:11:23 mhrichter smartd[3361]: Device: /dev/sda, 4294967295 Offline uncorrectable sectors
Sep 10 20:41:23 mhrichter smartd[3361]: Device: /dev/hdb, 21 Currently unreadable (pending) sectors
Sep 10 20:41:24 mhrichter smartd[3361]: Device: /dev/sda, 4294967295 Currently unreadable (pending) sectors
Sep 10 20:41:24 mhrichter smartd[3361]: Device: /dev/sda, 4294967295 Offline uncorrectable sectors
Sep 10 21:11:23 mhrichter smartd[3361]: Device: /dev/hdb, 21 Currently unreadable (pending) sectors
Sep 10 21:11:23 mhrichter smartd[3361]: Device: /dev/sda, 4294967295 Currently unreadable (pending) sectors
Sep 10 21:11:23 mhrichter smartd[3361]: Device: /dev/sda, 4294967295 Offline uncorrectable sectors

Clearly there is a minor problem on /dev/hdb, which doesn't really surprise me, nor is it particularly worrisome (because I don't use that drive much).

However, the other one I find more than a little curious. /dev/sda is a Seagate 300GB SATA drive that will be two years old next month, but the number of "Currently unreadable (pending) sectors" or "Offline uncorrectable sectors," depending on which one you believe, is interesting - 4294967295 is FFFFFFFF in hex, and I'm running a 64-bit machine.

Google is not particularly informative on this subject - does anyone know more than general suggestions about dd, badblocks, etc.? This is my boot and primary system disk (has been for some time), but the error message is essentially meaningless (to me, right now).

Thanks.

mhr

PS: In the last couple of months there was a discussion of how to make the disk less active, starting with someone reporting that their disk drive activity light blinked every 30 seconds or something like that. I tried to find it again, but I couldn't pin down what to search for - what was the solution?
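For anyone who wants to check the arithmetic behind the FFFFFFFF observation, here is a minimal Python sketch. It parses smartd lines of the form quoted above and tests whether a reported count could even fit on the drive. The helper names (`parse_smartd_line`, `is_plausible`) are made up for illustration, and the capacity figure assumes a decimal 300 GB drive with 512-byte sectors, neither of which is confirmed in the thread. Reading an implausible all-ones value as a placeholder rather than a real count is likewise an assumption.

```python
import re

# 4294967295 == 2**32 - 1: every bit set in a 32-bit field (-1 if read as signed).
SENTINEL_32 = 0xFFFFFFFF

# Matches smartd log lines of the form shown in the post.
LINE_RE = re.compile(
    r"smartd\[\d+\]: Device: (?P<dev>\S+), (?P<count>\d+) "
    r"(?P<kind>Currently unreadable \(pending\)|Offline uncorrectable) sectors"
)


def parse_smartd_line(line):
    """Return (device, count, kind), or None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return m.group("dev"), int(m.group("count")), m.group("kind")


def is_plausible(count, capacity_bytes, sector_bytes=512):
    """A bad-sector count cannot exceed the number of sectors on the drive."""
    return count <= capacity_bytes // sector_bytes


if __name__ == "__main__":
    # Assumed capacity: 300 GB (decimal) with 512-byte sectors.
    capacity = 300 * 10**9
    line = ("Sep 10 20:41:24 mhrichter smartd[3361]: Device: /dev/sda, "
            "4294967295 Currently unreadable (pending) sectors")
    dev, count, kind = parse_smartd_line(line)
    print(dev, kind, hex(count))
    print("total sectors on drive:", capacity // 512)
    print("count is plausible:", is_plausible(count, capacity))
    print("count is the 32-bit all-ones value:", count == SENTINEL_32)
```

Under those assumptions the drive has roughly 586 million sectors, so a count of about 4.29 billion cannot be a literal number of bad sectors, whereas the 21 reported for /dev/hdb easily can.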
MHR wrote:
> Google is not particularly informative on this subject - anyone know
> more than general suggestions about dd, badblocks, etc.? This is my
> boot and primary system disk (has been for some time), but the error
> message is essentially meaningless (to me, right now).

Download the manufacturer's tools and run diagnostics on the drive; they will tell you the truth about what's going on. I wouldn't trust any generic OS tools over the manufacturer's tools. There was a discussion on this topic on this list not too long ago, I think.

The biggest gotcha with the vendor tools, though, is that they are usually limited in the types of disk controllers they support.

nate
On Wed, Sep 10, 2008 at 9:41 PM, MHR <mhullrich at gmail.com> wrote:
> I decided, after the last discussion of smartd and S.M.A.R.T. disks,
> to take a look in my /var/log/messages, and I'm seeing fair bit of
> this:
>
> Sep 10 20:11:23 mhrichter smartd[3361]: Device: /dev/sda, 4294967295
> Offline uncorrectable sectors
> Sep 10 20:41:23 mhrichter smartd[3361]: Device: /dev/hdb, 21 Currently
> unreadable (pending) sectors

(snip)

> Google is not particularly informative on this subject - anyone know
> more than general suggestions about dd, badblocks, etc.? This is my
> boot and primary system disk (has been for some time), but the error
> message is essentially meaningless (to me, right now).

You should start thinking of replacing the disk. There is a discussion in the forum:

http://www.centos.org/modules/newbb/viewtopic.php?topic_id=15880&forum=39

I am one of the people there who were getting the same error and replaced the disk.

Akemi / toracat
On Wed, 2008-09-10 at 21:41 -0700, MHR wrote:
> I decided, after the last discussion of smartd and S.M.A.R.T. disks,
> to take a look in my /var/log/messages, and I'm seeing fair bit of
> this:
>
> Sep 10 20:11:23 mhrichter smartd[3361]: Device: /dev/sda, 4294967295
> Offline uncorrectable sectors
> Sep 10 20:41:23 mhrichter smartd[3361]: Device: /dev/hdb, 21 Currently
> unreadable (pending) sectors
> <snip>

> Google is not particularly informative on this subject - anyone know
> more than general suggestions about dd, badblocks, etc.? This is my
> boot and primary system disk (has been for some time), but the error
> message is essentially meaningless (to me, right now).

A Google search for "manufacturer smart site:centos.org" should lead to a couple of good threads on this list. I'd cite the recent related ones, but I'm short of time ATM.

As long as you only see one, or very few, errors and very limited growth in the number, no worry IMO. However, to confirm this, use smartctl to get a full check and logging done. Then use it to review the logs.

I've got one drive that has had 2 errors for more than 6 months now. I used the manufacturer's tools and it got repaired; there has been only one occurrence since.

>
> Thanks.
>
> mhr
> <snip>

-- 
Bill
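As a rough illustration of the "full check, then review the logs" advice, the Python sketch below builds the smartctl invocations one might use. The function names and the choice of an extended ("long") self-test are assumptions for illustration, not something spelled out in the thread. The options themselves (`-t long`, `-l selftest`, `-l error`, `-A`) are standard smartctl options. Running them normally requires root, and the long test runs in the background, so its result only appears in the self-test log once the test has finished.

```python
import shutil
import subprocess


def smartctl_commands(device):
    """Return smartctl invocations for a full check and a log review."""
    return [
        ["smartctl", "-t", "long", device],      # start an extended self-test
        ["smartctl", "-l", "selftest", device],  # show the self-test log
        ["smartctl", "-l", "error", device],     # show the drive's error log
        ["smartctl", "-A", device],              # show vendor attributes (pending/uncorrectable counts)
    ]


def review_logs(device):
    """Run only the read-only review commands and print their output."""
    if shutil.which("smartctl") is None:
        raise RuntimeError("smartctl not found on PATH")
    for cmd in smartctl_commands(device)[1:]:
        # smartctl uses its exit status as a bitmask, so a nonzero status is not treated as fatal.
        result = subprocess.run(cmd, capture_output=True, text=True)
        print("$", " ".join(cmd))
        print(result.stdout)


if __name__ == "__main__":
    # Only print the commands here; call review_logs("/dev/sda") as root to run them.
    for cmd in smartctl_commands("/dev/sda"):
        print(" ".join(cmd))
```

Comparing the attribute output from `-A` over a few days is one way to judge whether the count is actually growing, which is the condition the advice above hinges on.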