Jay V
2016-Apr-01 05:58 UTC
[Ocfs2-users] fsck.ocfs2 not fixing as it outputs errors when checking w/ no flag (-fn) but is clean with yes flag (-fy)
On 3/31/2016 10:37 PM, Junxiao Bi wrote:
> On 04/01/2016 11:20 AM, Jay Vasa wrote:
>> On 3/31/2016 6:36 PM, Herbert van den Bergh wrote:
>>> It seems to me that the reason fsck -fn is reporting errors is because
>>> it isn't replaying the journal:
>>>
>>> ** Skipping journal replay because -n was given. There may be spurious
>>> errors that journal replay would fix. **
>>> ** Skipping slot recovery because -n was given. **
>>>
>>> So there are outstanding changes in the journal that need to be made
>>> to the fs, but fsck -fn skips them. Then later it runs into the
>>> inconsistencies that would have been cleared if the journal was replayed.
>>>
>>> fsck -fy does replay the journal, so it doesn't see the
>>> inconsistencies that were fixed by it.
>>>
>>> When you do the fsck -fn AFTER fsck -fy, does it still say now that it
>>> is skipping journal replay? If so, I wonder why. If not, does it
>>> still report the exact same inode / cluster numbers as the previous
>>> time you ran it? If fsck -fy had to make any changes (including
>>> replaying the journal), run it again, and repeat until it doesn't make
>>> any changes to the filesystem. This is just to make sure it isn't
>>> leaving some inconsistency unfixed. So please do:
>>>
>>> umount (on ALL nodes)
>>> fsck -fy
>>> fsck -fy (if the previous fsck made ANY changes including replaying
>>> the journal)
>>> fsck -fn (check if it mentions skipping the journal replay)
>>>
>>> If you still see any errors reported by fsck -fn, are they exactly the
>>> same ones as you've sent earlier?
>>>
>> This is exactly what I did on the first time I ran it. I really don't
>> want to have another downtime doing exactly this again.
> So the "corrupted" ocfs2 volume is online now, does it work well? If
> ocfs2 is really corrupted, i think it will soon fall into a read-only fs
> or panic. If it works well, then maybe fsck.ocfs2 -fn report the
> corruption wrongly.
>
> Thanks,
> Junxiao.

Yes the "corrupted" ocfs2 is working just fine.
It has not fallen to read-only and has not had a panic. I am worried, though, that it will go read-only at some point in the future. I have lately been minimizing the load on it because I am worried about this happening, and there seems to be no way to fix it.

Thanks,
Jay

>> If you see I ran this exactly:
>> % umount /dev/drbd2 -- the umount stalled so I rebooted it
>> % fsck -fy /dev/drbd2
>> -- this fixed the journal replay
>> % fsck -fy /dev/drbd2
>> -- this did nothing
>> % fsck -fn /dev/drbd2
>> -- this showed the errors all over again. Yes exactly the same errors.
>>
>> Look at the bottom of this message as that is exactly what I ran, and
>> yes everything was unmounted. This is the only reason why I brought up
>> this issue.
>>
>> If you really want me to do this again, I can, but I don't like bringing
>> down the filesystem another 6 hours for this. I have already tried fsck
>> this about 20 times.
>>
>> Thanks,
>> Jay
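
For reference, the sequence Herbert describes (repeat fsck -fy until it makes no more changes, then run fsck -fn) could be scripted roughly as below. This is only a sketch, not something posted in the thread. The function name check_until_clean, the FSCK override variable and the pass limit are made up for illustration. It also assumes that fsck.ocfs2 follows the usual fsck exit-status convention (0 = clean, 1 = errors corrected) and that a pass which only replays the journal reports status 1, which should be checked against the fsck.ocfs2 man page. The volume must already be unmounted on ALL nodes; the script does not do that.

```shell
#!/bin/sh
# Sketch only: repeat a repairing check until one pass reports "clean",
# then finish with a read-only check.
# Assumptions (not from the thread): exit status 0 = clean, 1 = errors
# corrected; FSCK is a hypothetical override for the checker command.

check_until_clean() {
    dev=$1
    max_passes=${2:-5}      # hypothetical safety limit on repair passes
    pass=1
    while [ "$pass" -le "$max_passes" ]; do
        status=0
        ${FSCK:-fsck.ocfs2} -fy "$dev" || status=$?
        if [ "$status" -eq 0 ]; then
            break           # this pass reported a clean filesystem
        elif [ "$status" -ne 1 ]; then
            echo "repair pass $pass failed with status $status" >&2
            return "$status"
        fi
        pass=$((pass + 1))  # status 1: changes were made, so check again
    done
    if [ "$pass" -gt "$max_passes" ]; then
        echo "still making changes after $max_passes passes" >&2
        return 1
    fi
    # Read-only check; its output shows whether journal replay is skipped.
    ${FSCK:-fsck.ocfs2} -fn "$dev"
}

# Example call (device name taken from the thread):
#   check_until_clean /dev/drbd2
```

In the case reported here, the second fsck -fy pass made no changes and fsck -fn still listed the same errors, so a loop like this would stop after the repair passes and the final read-only check would still complain.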