Con Zyor
2011-Nov-04 21:47 UTC
[Gluster-users] Long client hangs in replicated configuration with 2-nodes acting as own clients
Hello;

I have installed glusterfs 3.2.4 on a pair of Red Hat Enterprise Linux 6.1 x86_64 machines with 2 GB of memory. I am attempting to mirror a directory full of content between the two servers, which also serve and update the content through a webapp via Apache. My issue is that the client mount points hang for 30 minutes or so if either node is brought down.

The volfile is at the end of this e-mail.

I set up two bricks, one each on nodes server01 and server02, using ext4 with the acl mount option. The /etc/fstab entries on each server look like this:

/dev/mapper/sysvg-brick01  /brick01  ext4  defaults,nosuid,acl  1 2

From one host, I configure them as a mirror and start the volume:

gluster volume create volume01 replica 2 transport tcp server01:/brick01 server02:/brick01
gluster volume start volume01

Then server01 and server02 each mount the volume from themselves via this /etc/fstab entry:

localhost:/volume01  /glusterfs/vol01  glusterfs  defaults,_netdev,acl  0 0

This works: modifications inside /glusterfs/vol01 are seen by the other host. However, when I reboot either server01 or server02, the client mount point on the surviving node (/glusterfs/vol01) hangs until the rebooted node comes back up. If the node never comes back, the client mount point on the surviving node hangs for roughly 30 minutes. I have tried reducing frame-timeout to 10 seconds, to no avail.

Also, once the rebooted server comes back online it fails to mount /glusterfs/vol01, again hanging for 30 minutes. A subsequent remount succeeds: cancelling the hung mount with umount -f /glusterfs/vol01 and then re-mounting works.

Any ideas what I am doing wrong?

Here is the volfile from /var/log/glusterfs/glusterfs-vol01.log:

 1: volume volume01-client-0
 2:     type protocol/client
 3:     option remote-host server01
 4:     option remote-subvolume /brick01
 5:     option transport-type tcp
 6:     option frame-timeout 10
 7: end-volume
 8:
 9: volume volume01-client-1
10:     type protocol/client
11:     option remote-host server02
12:     option remote-subvolume /brick01
13:     option transport-type tcp
14:     option frame-timeout 10
15: end-volume
16:
17: volume volume01-replicate-0
18:     type cluster/replicate
19:     subvolumes volume01-client-0 volume01-client-1
20: end-volume
21:
22: volume volume01-write-behind
23:     type performance/write-behind
24:     subvolumes volume01-replicate-0
25: end-volume
26:
27: volume volume01-read-ahead
28:     type performance/read-ahead
29:     subvolumes volume01-write-behind
30: end-volume
31:
32: volume volume01-io-cache
33:     type performance/io-cache
34:     subvolumes volume01-read-ahead
35: end-volume
36:
37: volume volume01-quick-read
38:     type performance/quick-read
39:     subvolumes volume01-io-cache
40: end-volume
41:
42: volume volume01-stat-prefetch
43:     type performance/stat-prefetch
44:     subvolumes volume01-quick-read
45: end-volume
46:
47: volume volume01
48:     type debug/io-stats
49:     option latency-measurement off
50:     option count-fop-hits off
51:     subvolumes volume01-stat-prefetch
52: end-volume
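For reference, both client timeouts can also be adjusted per volume from the gluster CLI rather than by editing the generated volfile. A minimal sketch, assuming the stock network.ping-timeout and network.frame-timeout volume options in 3.2.x; the values below are illustrative, not a confirmed fix for the hang described above:

# How long a client waits before declaring a disconnected brick dead
# (the default is 42 seconds); illustrative value, not a tested fix.
gluster volume set volume01 network.ping-timeout 10

# Per-operation (frame) timeout on the client side (the default is
# 1800 seconds, i.e. 30 minutes); illustrative value.
gluster volume set volume01 network.frame-timeout 600

# Confirm the options were applied.
gluster volume info volume01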
krish
2011-Nov-07 11:18 UTC
[Gluster-users] Long client hangs in replicated configuration with 2-nodes acting as own clients
Con Zyor,

Could you attach the client and server log files? That would help us understand what is causing the hang.

thanks,
kp

On 11/05/2011 03:17 AM, Con Zyor wrote:
> My issue is that the client mount points hang for 30 minutes or so if
> either node is brought down.
> [...]
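For anyone following along, a minimal sketch of collecting the logs kp is asking for, run on each node. The file names are assumptions based on the default GlusterFS log naming and the mount point and brick path used above; adjust them to whatever is actually present under /var/log/glusterfs/:

# Bundle the client (mount) log, the brick log, and the glusterd
# management log from this node; the file names are assumed defaults.
tar czf /tmp/$(hostname)-gluster-logs.tar.gz \
    /var/log/glusterfs/glusterfs-vol01.log \
    /var/log/glusterfs/bricks/brick01.log \
    /var/log/glusterfs/etc-glusterfs-glusterd.vol.log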