I have a large disk that has several failed sectors. The drive is the article storage for news, so it holds a lot of files. During the inn expire operation I get error messages saying the drive cannot read a couple of sectors. The LBA is given. The problem is finding out what those LBAs are used for. The drive's SMART status shows plenty of available spare sectors, but since it can't read those sectors it won't remap them to a spare sector until the next write to that sector. expire gives up when it reaches that error.

My first attempt was to run cksum over all the files on the disk. That caught one of the sectors and gave me the file name. I deleted the file and, since it was an overview file for one group, I just rebuilt it. There are still more to go though, and that process took many hours. I have not found anything in the archives, man pages or ports that addresses identifying the object/file that uses a given LBA.

So I have started looking into the ufs structures to see how that could be done. The fdisk source shows how to access the partition data. For this specific disk, fdisk reports a media sector size of 512 and the block count matches that. So I assume I would have to subtract the start of that partition from the LBA. However, that assumes the LBA is in the same 512-byte block numbering system. I am not convinced that would always be correct.

Next I have to address the bsdlabel. I am now presuming that an LBA value of 0 is the start of the drive, not the start of the partition. I am not sure if this is correct either. If so, then code like that in bsdlabel would be required to identify the partition, and the start of that partition would need to be subtracted from the LBA. At that point I think I have the values that would be found in the block tables in the inodes.

Before digging into the inode structures I thought it would be a good idea to check my understanding to this point.
Am I on the right path?
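
To make the arithmetic described above concrete, here is a minimal Python sketch of the translation. It assumes the reported LBA counts 512-byte sectors from the start of the drive, and that the absolute starting sector of the bsdlabel partition has already been worked out from the fdisk and bsdlabel output. It also assumes, as I understand UFS, that block pointers in inodes are expressed in fragment-sized units; the fragment size and fragments per block would come from dumpfs. The function name and all the numbers are hypothetical.

```python
SECTOR_SIZE = 512  # media sector size reported by fdisk (assumed for this disk)


def lba_to_fs_address(lba, partition_start, frag_size, frags_per_block,
                      sector_size=SECTOR_SIZE):
    """Translate an absolute LBA into partition-relative filesystem units.

    partition_start is the absolute first sector of the partition
    (slice start plus partition offset, worked out by hand beforehand).
    frag_size is in bytes; frags_per_block is block size / fragment size.
    """
    if lba < partition_start:
        raise ValueError("LBA lies before the start of the partition")

    rel_sector = lba - partition_start        # sector within the partition
    byte_offset = rel_sector * sector_size    # byte offset within the partition
    frag = byte_offset // frag_size           # fragment number holding the sector
    # First fragment of the full block that contains this fragment.
    block_start_frag = frag - (frag % frags_per_block)

    return {
        "rel_sector": rel_sector,
        "byte_offset": byte_offset,
        "frag": frag,
        "block_start_frag": block_start_frag,
    }


if __name__ == "__main__":
    # Hypothetical values: partition starts at sector 63, 2048-byte
    # fragments, 8 fragments per 16384-byte block.
    print(lba_to_fs_address(1000100, 63, 2048, 8))
```

The fragment number is the value one would then look for among the direct and indirect block pointers, subject to the checks on superblock locations that confirm the offsets are right.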
On Fri, 2006-Mar-03 17:23:33 -0800, Doug Hardie wrote:
>The drive SMART status show plenty of available spare sectors, but
>since it can't read those sectors it won't remap them to a spare
>sector till the next write of that sector.

It's probably a good idea to start thinking about replacing the disk anyway.

>I have not found anything in the archives or man pages or ports that
>addresses identifying the object/file that has that LBA.

Look at the thread starting
http://lists.FreeBSD.org/pipermail/freebsd-hackers/2006-February/015475.html

> So I assume I would have to subtract the start
>of that partition from the LBA.

Yes.

> However, that assumes that the LBA
>is in the same 512 byte block numbering system.

It is. You can always double check by verifying that the superblocks are where you calculate they should be, and that when you translate the LBA to an offset within the partition, you get an error when you attempt to read that offset within the partition.

>Next has to address the bsdlabel. I am now presuming that the LBA
>value of 0 is the start of the drive, not the start of the
>partition.

I'm almost certain the LBAs are relative to the start of the drive - the error message would include the slice/partition number if they were relative to the start of the slice/partition (otherwise you couldn't be certain which slice/partition was affected). Again, you can check by verifying that the superblocks are where you expect and the translated LBAs give errors.

>At that point I think I have the values that would be found in the
>block tables in the inodes.

It could also be in the cylinder group metadata - superblock copy, inodes or free block bitmaps. And remember to correctly handle indirect blocks.

>Before digging into the inode structures I though it would be a good
>idea to check my understanding to this point. Am I on the right path?

Yes you are.

--
Peter Jeremy
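
The read check suggested above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the device path in the example is hypothetical, the sector number is the partition-relative value obtained after subtracting the partition start, and the helper name is made up. A read that raises an error or comes back short is reported as unreadable; a failure to open the device is left to propagate so it is not mistaken for a bad sector.

```python
import os


def sector_is_readable(device_path, sector, sector_size=512):
    """Return True if one full sector can be read at the given
    partition-relative sector number, False on a read error or short read.

    Errors from opening the device are not caught here.
    """
    fd = os.open(device_path, os.O_RDONLY)
    try:
        # Read exactly one sector at the translated byte offset.
        data = os.pread(fd, sector_size, sector * sector_size)
        return len(data) == sector_size
    except OSError:
        # The kernel reported an I/O error for this offset.
        return False
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical partition device and translated sector number.
    print(sector_is_readable("/dev/ad0s1e", 1000037))
```

If the translated sector reads cleanly while its neighbours fail, or the other way round, the offset arithmetic is probably off by the slice or partition start.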