Eric Raskin
2014-Mar-22 22:44 UTC
[Ocfs2-users] FSCK may be failing and corrupting my disk???
Hi:

I am running a two-node Oracle VM Server 2.2.2 installation. We were having some strange problems creating new virtual machines, so I shut down the systems and unmounted the OVS Repository (ocfs2 file system on Equallogic equipment).

I ran a fsck -y first, which replayed the logs and said all was clean. But, I am pretty sure there are other issues, so I started an fsck -fy.

One of the messages I got was:

Cluster 161213953 is claimed by the following inodes:
  <76289548>
  /running_pool/450_gebidb/System.img
[DUP_CLUSTERS_CLONE] Inode "(null)" may be cloned or deleted to break the claim it has on its clusters. Clone inode "(null)" to break claims on clusters it shares with other inodes? y

I then watched with an strace -p <fsck process> to see what was happening, since it was taking a long time with no messages. I see:

pwrite64(3, "INODE01\0H\26O}\377\377\22\0\0\0\0\0\0$\0\0\0\0\0\0\0\0\0\0"..., 4096, 90112) = 4096
pwrite64(3, "EXBLK01\0\0\0\0\0\0\0\0\0\0\0+\3H\26O}\306\374&\0\0\0\0\0"..., 4096, 10465599488) = 4096
pwrite64(3, "GROUP01\0\300\17\0\4P\0\0\0H\26O}\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 10462699520) = 4096
pwrite64(3, "INODE01\0H\26O}\377\377\22\0\0\0\0\0\0$\0\0\0\0\0\0\0\0\0\0"..., 4096, 90112) = 4096
pwrite64(3, "EXBLK01\0\0\0\0\0\0\0\0\0\0\0/\3H\26O}\302\374&\0\0\0\0\0"..., 4096, 10465583104) = 4096
pwrite64(3, "GROUP01\0\300\17\0\4Q\0\0\0H\26O}\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 10462699520) = 4096
pwrite64(3, "INODE01\0H\26O}\377\377\22\0\0\0\0\0\0$\0\0\0\0\0\0\0\0\0\0"..., 4096, 90112) = 4096
pwrite64(3, "EXBLK01\0\0\0\0\0\0\0\0\0\0\0003\3H\26O}\274\374&\0\0\0\0\0"..., 4096, 10465558528) = 4096
pwrite64(3, "INODE01\0H\26O}\0\0L\0\0\0\0\0\24\346\17\0\0\0\0\0\0\0\0\0"..., 4096, 2686701568) = 4096
pwrite64(3, "GROUP01\0\300\17\0~\3\0#\0H\26O}\0\0\0\0\0n\0\1\0\0\0\0"..., 4096, 100940120064) = 4096
pwrite64(3, "INODE01\0H\26O}\377\377\7\0\0\0\0\0\0\6\0\30\0\0\0\0\0\0\0\0"..., 4096, 45056) = 4096
pwrite64(3, "GROUP01\0\300\17\0\4P\0\0\0H\26O}\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 10462699520) = 4096
pwrite64(3, "INODE01\0H\26O}\377\377\22\0\0\0\0\0\0$\0\0\0\0\0\0\0\0\0\0"..., 4096, 90112) = 4096
pwrite64(3, "EXBLK01\0\0\0\0\0\0\0\0\0\0\0\272\2H\26O}\274\374&\0\0\0\0\0"..., 4096, 10465558528) = 4096
pwrite64(3, "EXBLK01\0\0\0\0\0\0\0\0\0\0\0003\3H\26O}\274\374&\0\0\0\0\0"..., 4096, 10465558528) = 4096
pwrite64(3, "GROUP01\0\300\17\0\4O\0\0\0H\26O}\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 10462699520) = 4096
pwrite64(3, "INODE01\0H\26O}\377\377\22\0\0\0\0\0\0$\0\0\0\0\0\0\0\0\0\0"..., 4096, 90112) = 4096

This is going on and on. It looks like it is writing lots of entries to fix one duplicate inode???

At this point, I have aborted the fsck, as I am worried that it is completely trashing our OVS repository disk.

Can anybody shed some light on this before I restart the fsck? We need to be back up and running ASAP!

Thanks in advance!

--
Eric H. Raskin                          914-765-0500 x120
Professional Advertising Systems Inc.   914-765-0503 fax
200 Business Park Dr Suite 304          eraskin at paslists.com
Armonk, NY 10504                        http://www.paslists.com
Sunil Mushran
2014-Mar-23 01:40 UTC
[Ocfs2-users] FSCK may be failing and corrupting my disk???
Cloning the inode means inode + data. Let it finish.