Florin Andrei
2010-Mar-29 20:15 UTC
[Ocfs2-users] node B reboots when node A is isolated from the network
A and B are identical machines. Network has lots of redundancy. They both access same OCFS2 volumes over Fiber Channel on a SAN.

Red Hat Enterprise Linux Server release 5.3 (Tikanga)
2.6.18-128.el5 x86_64
OCFS2 1.2.9

I log in to the switch and isolate node A from the network (shutdown all ports to node A on the switch), but the node remains connected to SAN over fiber.

In that case, node B will reset itself and then boot up. While booting up, node B will not remount the OCFS2 volumes on the SAN. If I try to mount them manually on B, mount fails.

All this time, the OCFS2 volumes are still mounted on node A - I can access them if I go on A through the console and do some filesystem tests (df, mount, touch, rm ... ).

I can only remount the OCFS2 volumes on node B after I bring node A back online on the Ethernet switch.

I assume this is actually normal behavior. Am I correct?

--
Florin Andrei
http://florin.myip.org/
Sunil Mushran
2010-Mar-29 20:20 UTC
[Ocfs2-users] node B reboots when node A is isolated from the network
If node A is a lower node number than node B, then the behavior is correct. In a 2 node cluster, if the two nodes cannot talk to each other, the higher node number will fence itself.

Also, when a node mounts a volume, it initiates connections to other live nodes. If any connection fails, the mount will fail.

Florin Andrei wrote:
> A and B are identical machines. Network has lots of redundancy. They
> both access same OCFS2 volumes over Fiber Channel on a SAN.
>
> Red Hat Enterprise Linux Server release 5.3 (Tikanga)
> 2.6.18-128.el5 x86_64
> OCFS2 1.2.9
>
> I log in to the switch and isolate node A from the network (shutdown all
> ports to node A on the switch), but the node remains connected to SAN
> over fiber.
>
> In that case, node B will reset itself and then boot up. While booting
> up, node B will not remount the OCFS2 volumes on the SAN. If I try to
> mount them manually on B, mount fails.
>
> All this time, the OCFS2 volumes are still mounted on node A - I can
> access them if I go on A through the console and do some filesystem
> tests (df, mount, touch, rm ... ).
>
> I can only remount the OCFS2 volumes on node B after I bring node A back
> online on the Ethernet switch.
>
> I assume this is actually normal behavior. Am I correct?
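
The two rules in Sunil's reply can be sketched as a small toy model. This is only an illustration of the behavior described in this thread, not OCFS2 or o2cb code: the function names, the node numbers 0 and 1, and the dictionary of peers are all assumptions made for the example.

```python
from typing import Dict, Optional


def node_that_fences(node_x: int, node_y: int, can_talk: bool) -> Optional[int]:
    """Toy model of the two-node rule described above (hypothetical helper).

    Returns the node number that fences (resets) itself, or None when the
    two nodes can still talk to each other over the network.
    """
    if node_x == node_y:
        raise ValueError("node numbers must be distinct")
    if can_talk:
        return None
    # The nodes cannot talk to each other: the higher node number fences itself.
    return max(node_x, node_y)


def mount_can_succeed(live_peer_reachable: Dict[str, bool]) -> bool:
    """Toy model of the mount rule (hypothetical helper).

    The mounting node initiates a connection to every other live node;
    if any of those connections fails, the mount fails.
    """
    return all(live_peer_reachable.values())


# Assumed numbering for illustration: node A is 0, node B is 1.
# A is cut off from the Ethernet switch, so A and B cannot talk.
fenced = node_that_fences(0, 1, can_talk=False)

# After B reboots, A is still live (it heartbeats on the SAN) but is
# unreachable over the network, so B's mount attempt fails.
b_can_mount = mount_can_succeed({"A": False})
```

Under the assumed numbering, `fenced` is node B, and `b_can_mount` stays false until A is reachable on the network again, which matches what Florin observed.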