Frank Zhang
2011-Oct-12 02:04 UTC
[Ocfs2-users] Partition table crash, where can I find debug message?
Hi Experts, I recently observed a partition table crash that really scared me.

I have two OVM servers sharing OCFS2 over iSCSI. After running a bunch of VMs for a while, all the VMs were gone and the OCFS2 mount points had disappeared on both hosts.

When I tried to mount the volume again, the mount failed with "please specify filesystem type". I checked dmesg but found nothing useful except:

SCSI device sdc: drive cache: write back
 sdc: unknown partition table
sd 2:0:0:1: Attached scsi disk sdc
sd 2:0:0:1: Attached scsi generic sg3 type 0
OCFS2 Node Manager 1.4.4
OCFS2 DLM 1.4.4
OCFS2 DLMFS 1.4.4
OCFS2 User DLM kernel interface loaded
connection1:0: detected conn error (1011)

Basically, after logging into the iSCSI device on both hosts, I created a soft link /dev/ovm_iscsi1 pointing to the device node under /dev/disk/by-path/real_isci_device, then formatted /dev/ovm_iscsi1 as OCFS2 and mounted it (of course I configured /etc/ocfs2/cluster.conf and made sure o2cb started correctly).

Could somebody tell me where to get more debug info to trace the problem? This is really scary, considering I may lose all my VMs because of a silent crash.

And is there any way to recover the partition table? Thanks
Sunil Mushran
2011-Oct-12 17:07 UTC
[Ocfs2-users] Partition table crash, where can I find debug message?
I'm not sure what you mean by a partition table crash. Did someone overwrite the partition table on the iSCSI server? That's what it looks like.

If mount cannot detect the fs type, then at least the superblock is corrupted. Such corruption is typically caused by an external entity. A stray dd, perhaps.

Did you try recovering the superblock using one of the backups?

fsck.ocfs2 -r [1-6] /dev/sdX
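Before running fsck.ocfs2 -r, it may help to check read-only which superblock copies still look intact. The Python sketch below is only an illustration and is not part of ocfs2-tools. It assumes the primary superblock sits in block 2 and starts with the ASCII signature "OCFSV2", and that the backups sit at byte offsets 1G, 4G, 16G, 64G, 256G and 1T as described in the mkfs.ocfs2 man page. The names probe and has_signature are hypothetical helpers, and a matching signature does not prove the rest of the block is valid.

```python
import os

# Assumed on-disk layout (see the note above); the script only reads.
OCFS2_SIGNATURE = b"OCFSV2"             # expected at the start of a superblock
BLOCK_SIZES = (512, 1024, 2048, 4096)   # block sizes OCFS2 can be formatted with
SUPERBLOCK_BLKNO = 2                    # primary superblock is assumed to be block 2
GIB = 1 << 30
# Byte offsets of backup superblocks 1..6, per the mkfs.ocfs2 man page.
BACKUP_OFFSETS = [1 * GIB, 4 * GIB, 16 * GIB, 64 * GIB, 256 * GIB, 1024 * GIB]


def has_signature(dev, offset):
    """Return True if the OCFS2 signature is found at byte `offset`."""
    dev.seek(offset)
    return dev.read(len(OCFS2_SIGNATURE)) == OCFS2_SIGNATURE


def probe(path, backup_offsets=BACKUP_OFFSETS):
    """Scan `path` read-only.

    Returns (block size at which the primary signature was found, or None,
             list of backup numbers whose signature is present).
    """
    with open(path, "rb") as dev:
        size = dev.seek(0, os.SEEK_END)  # device or file size in bytes
        primary = None
        for block_size in BLOCK_SIZES:
            if has_signature(dev, block_size * SUPERBLOCK_BLKNO):
                primary = block_size
                break
        backups = [
            number
            for number, offset in enumerate(backup_offsets, start=1)
            if offset < size and has_signature(dev, offset)
        ]
    return primary, backups
```

Running probe("/dev/sdX") as root on the affected device returns something like (None, [1, 2]) when the primary superblock is gone but backups 1 and 2 still carry the signature. In that case fsck.ocfs2 -r 1 /dev/sdX would be the first thing to try. An empty list suggests the damage extends well beyond the first few blocks.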