Dear List,

I have one last little problem with setting up a cluster: my GFS mount will hang as soon as I do an iptables restart on one of the nodes.

First, let me describe my setup:

- 4 nodes, all running an updated CentOS 5.2 installation
- 1 Dell MD3000i iSCSI SAN
- All nodes are connected via Dell's supplied RDAC driver

Everything runs stably once the cluster is started (tested for a week or so), but when I made some final changes to my firewall and did an iptables restart, it all went down after a while. I have now reproduced the issue several times, so I am quite sure it is related to the iptables restart. I have a custom fence script for our ipoman power switch, which is fully tested and working fine.

When I do an iptables restart, the following happens:

- After approx. 10 seconds the gfs_controld process goes to 100% CPU usage (on all nodes!)
- I can still access my GFS mount
- "group_tool dump gfs" tells me:

----------------------
1234541723 config_no_withdraw 0
1234541723 config_no_plock 0
1234541723 config_plock_rate_limit 100
1234541723 config_plock_ownership 0
1234541723 config_drop_resources_time 10000
1234541723 config_drop_resources_count 10
1234541723 config_drop_resources_age 10000
1234541723 protocol 1.0.0
1234541723 listen 1
1234541723 cpg 5
1234541723 groupd 6
1234541723 uevent 7
1234541723 plocks 10
1234541723 plock cpg message size: 336 bytes
1234541723 setup done
1234541737 client 6: join /setan gfs lock_dlm mars:setan rw /dev/mapper/vg_cluster-lv_cluster
1234541737 mount: /setan gfs lock_dlm mars:setan rw /dev/mapper/vg_cluster-lv_cluster
1234541737 setan cluster name matches: mars
1234541737 setan do_mount: rv 0
1234541737 groupd cb: set_id setan 20004
1234541737 groupd cb: start setan type 2 count 4 members 3 2 1 4
1234541737 setan start 3 init 1 type 2 member_count 4
1234541737 setan add member 3
1234541737 setan add member 2
1234541737 setan add member 1
1234541737 setan add member 4
1234541737 setan total members 4 master_nodeid -1 prev -1
1234541737 setan start_participant_init
1234541737 setan send_options len 1296 "rw"
1234541737 setan start_done 3
1234541737 setan receive_options from 3 len 1296 last_cb 2
1234541737 setan receive_journals from 1 to 3 len 320 count 4 cb 2
1234541737 receive nodeid 1 jid 1 opts 1
1234541737 receive nodeid 2 jid 2 opts 1
1234541737 receive nodeid 3 jid 3 opts 1
1234541737 receive nodeid 4 jid 0 opts 1
1234541737 setan received_our_jid 3
1234541737 setan retrieve_plocks
1234541737 notify_mount_client: nodir not found for lockspace setan
1234541737 notify_mount_client: ccs_disconnect
1234541737 notify_mount_client: hostdata=jid=3:id=131076:first=0
1234541737 groupd cb: finish setan
1234541737 setan finish 3 needs_recovery 0
1234541737 setan set /sys/fs/gfs/mars:setan/lock_module/block to 0
1234541737 setan set open /sys/fs/gfs/mars:setan/lock_module/block error -1 2
1234541737 kernel: add@ mars:setan
1234541737 setan ping_kernel_mount 0
1234541738 kernel: change@ mars:setan
1234541738 setan recovery_done jid 3 ignored, first 0,0
1234541738 client 6: mount_result /setan gfs 0
1234541738 setan got_mount_result: ci 6 result 0 another 0 first_mounter 0 opts 1
1234541738 setan send_mount_status kernel_mount_error 0 first_mounter 0
1234541738 client 6 fd 11 dead
1234541738 setan receive_mount_status from 3 len 288 last_cb 3
1234541738 setan _receive_mount_status from 3 kernel_mount_error 0 first_mounter 0 opts 1
1234541925 client 6: dump
1234542420 client 6: dump
1234542420 client 7 fd 11 read error -1 9
1234542424 client 6: dump
1234542424 client 7 fd 11 read error -1 9
1234542424 client 8 fd 11 read error -1 9
1234542425 client 6: dump
1234542425 client 7 fd 11 read error -1 9
1234542425 client 8 fd 11 read error -1 9
1234542425 client 9 fd 11 read error -1 9
1234542426 client 6: dump
1234542426 client 7 fd 11 read error -1 9
1234542426 client 8 fd 11 read error -1 9
1234542426 client 9 fd 11 read error -1 9
1234542426 client 10 fd 11 read error -1 9
1234542427 client 6: dump
1234542427 client 7 fd 11 read error -1 9
1234542427 client 8 fd 11 read error -1 9
1234542427 client 9 fd 11 read error -1 9
1234542427 client 10 fd 11 read error -1 9
1234542427 client 11 fd 11 read error -1 9
1234542428 client 6: dump
----------------------

- After a while the groupd process hits 100% as well (on all nodes)
- The GFS mount becomes inaccessible after a while; it hangs when trying to open it.
- group_tool still shows all nodes participating in the cluster and in the gfs service, and no problems are reported.

Does anyone have a clue how to fix this completely, or at least how to recover my system when it happens without a full reboot of the complete cluster? I have tried for a lot of hours, and I am still very new to clustering; I am just testing it before I use it in production environments.

I really appreciate any help!

Regards,
Sven

Config files/settings:

---------------------------
[root@badjak ~]# uname -a
Linux badjak.somedomain.tld 2.6.18-92.1.22.el5 #1 SMP Tue Dec 16 11:57:43 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
----------------------------

/etc/cluster/cluster.conf:
------------------------------------
<?xml version="1.0"?>
<cluster alias="mars" config_version="77" name="mars">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="gandaria.somedomain.tld" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="ipoman" action="Off" switch="volt" port="3"/>
          <device name="ipoman" action="Off" switch="ampere" port="9"/>
          <device name="ipoman" action="On" switch="volt" port="3"/>
          <device name="ipoman" action="On" switch="ampere" port="9"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="goreng.somedomain.tld" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="ipoman" action="Off" switch="volt" port="4"/>
          <device name="ipoman" action="Off" switch="ampere" port="10"/>
          <device name="ipoman" action="On" switch="volt" port="4"/>
          <device name="ipoman" action="On" switch="ampere" port="10"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="brandal.somedomain.tld" nodeid="4" votes="1">
      <fence>
        <method name="1">
          <device name="ipoman" action="Off" switch="volt" port="9"/>
          <device name="ipoman" action="Off" switch="ampere" port="3"/>
          <device name="ipoman" action="On" switch="volt" port="9"/>
          <device name="ipoman" action="On" switch="ampere" port="3"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="badjak.somedomain.tld" nodeid="3" votes="1">
      <fence>
        <method name="1">
          <device name="ipoman" action="Off" switch="volt" port="10"/>
          <device name="ipoman" action="Off" switch="ampere" port="4"/>
          <device name="ipoman" action="On" switch="volt" port="10"/>
          <device name="ipoman" action="On" switch="ampere" port="4"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_ipoman" name="ipoman"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>
------------------------------------
[root@badjak /]# cat /etc/fstab
/dev/VolGroup00/LogVol00   /          ext3    defaults        1 1
LABEL=/boot                /boot      ext3    defaults        1 2
tmpfs                      /dev/shm   tmpfs   defaults        0 0
devpts                     /dev/pts   devpts  gid=5,mode=620  0 0
sysfs                      /sys       sysfs   defaults        0 0
proc                       /proc      proc    defaults        0 0
/dev/VolGroup00/LogVol01   swap       swap    defaults        0 0
/dev/vg_cluster/lv_cluster /setan     gfs     defaults        0 0
------------------------------------
[root@badjak ~]# group_tool
type             level name       id       state
fence            0     default    00010001 none
[1 2 3 4]
dlm              1     clvmd      00010004 none
[1 2 3 4]
dlm              1     setan      00030004 none
[1 2 3 4]
dlm              1     rgmanager  00040004 none
[1 2 3 4]
gfs              2     setan      00020004 none
[1 2 3 4]
------------------------------------
[root@badjak ~]# clustat
Cluster Status for mars @ Fri Feb 13 17:18:58 2009
Member Status: Quorate

 Member Name                  ID   Status
 ------ ----                  ---- ------
 gandaria.somedomain.tld      1    Online
 goreng.somedomain.tld        2    Online
 badjak.somedomain.tld        3    Online, Local
 brandal.somedomain.tld       4    Online
------------------------------------
[root@badjak ~]# SMdevices
PowerVault Modular Disk Storage Manager Devices, Version 09.17.A6.01
Built Tue Mar 20 15:31:11 CST 2007
Copyright 2005-2006 Dell Inc. All rights reserved. Use is subject to license terms
/dev/sdb (/dev/sg3) [Storage Array setan, Virtual Disk 1, LUN 0, Virtual Disk ID <6001ec9000f2dc860000043448bf7e20>, Preferred Path (Controller-1): In Use]
------------------------------------
[root@badjak ~]# /etc/init.d/clvmd status
clvmd (pid 7433) is running...
active volumes: LogVol00 LogVol01 lv_cluster
------------------------------------
On Fri, Feb 13, 2009 at 06:36:22PM +0100, MARS websolutions wrote:
> Dear List,
>
> I have one last little problem with setting up a cluster. My GFS
> mount will hang as soon as I do an iptables restart on one of the
> nodes..

Undoubtedly someone else with more experience with GFS will give you an answer, but to me this makes me think the ip_conntrack state gets cleared out and sessions have to re-establish themselves.

Ray
>> Dear List,
>>
>> I have one last little problem with setting up a cluster. My GFS
>> mount will hang as soon as I do an iptables restart on one of the
>> nodes..
>
> Undoubtedly someone else with more experience with GFS will give you an
> answer, but to me this makes me think the ip_conntrack state gets cleared
> out and sessions have to re-establish themselves.
>
> Ray

Ray,

Thanks for your fast answer and for pointing me in the right direction. This sounds like a plausible explanation, but I have no clue how to fix it. I have already googled a lot on ip_conntrack + gfs, but I don't see a possible solution coming up.

Can someone/you please help me a little bit more with this issue?

Thanks a lot!
Sven
On Tuesday 17 February 2009, Sven Kaptein | MARS websolutions wrote:
> > Undoubtedly someone else with more experience with GFS will give you an
> > answer, but to me this makes me think the ip_conntrack state gets cleared
> > out and sessions have to re-establish themselves.
> >
> > Ray
>
> Ray,
>
> Thanks for your fast answer and for pointing me in the right direction.
> This sounds like a plausible explanation, but I have no clue how to fix
> it. I have already googled a lot on ip_conntrack + gfs, but I don't see
> a possible solution coming up.
>
> Can someone/you please help me a little bit more with this issue?
>
> Thanks a lot!
> Sven

You could allow traffic more broadly between your GFS servers.

Pro: packets will not depend on conntrack for delivery.
Con: a large hole in your firewall that you may not be able to live with.

/Peter
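To make Peter's suggestion concrete, here is a sketch of stateless ACCEPT rules for the cluster daemons. The port numbers below are the Red Hat Cluster Suite 5 defaults as I recall them from Red Hat's documentation (openais, ccsd, dlm); verify them on your own nodes before relying on this, and the subnet is invented for illustration:

```shell
# Accept cluster traffic with plain stateless rules, so delivery never
# depends on conntrack entries that "service iptables restart" discards.
# Port numbers assumed from RHCS 5 defaults -- verify on your nodes.
iptables -I INPUT -p udp -m udp --dport 5404:5405   -j ACCEPT  # openais totem
iptables -I INPUT -p tcp -m tcp --dport 21064       -j ACCEPT  # dlm
iptables -I INPUT -p tcp -m tcp --dport 50006       -j ACCEPT  # ccsd
iptables -I INPUT -p udp -m udp --dport 50007       -j ACCEPT  # ccsd
iptables -I INPUT -p tcp -m tcp --dport 50008:50009 -j ACCEPT  # ccsd

# Broader, as Peter suggests: trust the whole cluster subnet
# (192.168.1.0/24 is a hypothetical placeholder -- substitute yours):
#iptables -I INPUT -s 192.168.1.0/24 -j ACCEPT

# Persist to /etc/sysconfig/iptables so a restart reloads the same rules:
service iptables save
```

Because the rules match ports rather than connection state, a later flush-and-reload of iptables (and of the ip_conntrack modules) should not cut off established dlm/openais sessions.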
on 2-17-2009 3:00 AM Sven Kaptein | MARS websolutions spake the following:
>>> Dear List,
>>>
>>> I have one last little problem with setting up a cluster. My GFS
>>> mount will hang as soon as I do an iptables restart on one of the
>>> nodes..
>
>> Undoubtedly someone else with more experience with GFS will give you an
>> answer, but to me this makes me think the ip_conntrack state gets cleared
>> out and sessions have to re-establish themselves.
>>
>> Ray
>
> Ray,
>
> Thanks for your fast answer and for pointing me in the right direction.
> This sounds like a plausible explanation, but I have no clue how to fix
> it. I have already googled a lot on ip_conntrack + gfs, but I don't see
> a possible solution coming up.
>
> Can someone/you please help me a little bit more with this issue?
>
> Thanks a lot!
> Sven

Are your GFS mounts and your cluster on different sides of the firewall?

Maybe you can do something simple like a tunnel between the clusters and the mounts. That should be easier and safer than punching holes in the firewall. Or put a separate subnet or VLAN just for the GFS traffic.
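For completeness, a dedicated VLAN for cluster traffic on CentOS 5 can be sketched like this. The device name, VLAN id, and addressing below are all invented for illustration; note also that cman binds to the interface whose address the node name in cluster.conf resolves to, so the node names would have to resolve to the new VLAN addresses:

```shell
# Hypothetical example: tagged VLAN 100 on eth1 for cluster traffic.
# Adapt DEVICE, VLAN id and IP addressing to your switch configuration.
cat > /etc/sysconfig/network-scripts/ifcfg-eth1.100 <<'EOF'
DEVICE=eth1.100
VLAN=yes
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.100.3
NETMASK=255.255.255.0
EOF
service network restart
```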
>>>> Dear List,
>>>>
>>>> I have one last little problem with setting up a cluster. My GFS
>>>> mount will hang as soon as I do an iptables restart on one of the
>>>> nodes..
>>
>>> Undoubtedly someone else with more experience with GFS will give you an
>>> answer, but to me this makes me think the ip_conntrack state gets cleared
>>> out and sessions have to re-establish themselves.
>>>
>>> Ray
>>
>> Ray,
>>
>> Thanks for your fast answer and for pointing me in the right direction.
>> This sounds like a plausible explanation, but I have no clue how to fix
>> it. I have already googled a lot on ip_conntrack + gfs, but I don't see
>> a possible solution coming up.
>>
>> Can someone/you please help me a little bit more with this issue?
>>
>> Thanks a lot!
>> Sven
>
> Are your GFS mounts and your cluster on different sides of the firewall?
>
> Maybe you can do something simple like a tunnel between the clusters and
> the mounts. Should be easier and safer than punching holes in the
> firewall. Or put a separate subnet or vlan just for the GFS traffic.

Uhm... I have the cluster running on a different VLAN than my iSCSI traffic. Is that a problem? I would not like to put my cluster communication on the other VLAN, since that one is dedicated to iSCSI now.

I have now figured out that it isn't the restarting of iptables causing the trouble; the issue is as follows:

- Calling "group_tool dump"
  This causes groupd to run at 100% CPU. Doing an strace on this process shows it is doing a poll in an infinite loop:

  poll([{fd=1, events=POLLIN}, {fd=2, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=10, events=POLLIN}, {fd=12, events=POLLIN}, {fd=14, events=POLLIN}, {fd=18, events=POLLIN}, {fd=17, events=POLLIN}, {fd=20, events=POLLIN}, {fd=21, events=POLLIN, revents=POLLNVAL}, {fd=-1}], 13, -1) = 1

- Calling "group_tool dump gfs"
  This causes gfs_controld to run at 100% CPU.
  Exactly the same as for groupd (strace):

  poll([{fd=2, events=POLLIN}, {fd=3, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=8, events=POLLIN}, {fd=11, events=POLLIN}, {fd=12, events=POLLIN, revents=POLLNVAL}, {fd=12, events=POLLIN, revents=POLLNVAL}], 8, -1) = 2

Sometimes my mount will hang, but sometimes it will just continue normally. I can imagine this has to do with the amount of data going to the GFS mount. So I guess it isn't really an iptables problem, but I am trying to debug that a little bit more as well.

Any clues?

Thanks!!
Sven
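The revents=POLLNVAL in those strace lines is the tell-tale: POLLNVAL means a descriptor in the poll set is no longer open, so poll(2) returns immediately on every call, and a daemon that never removes the dead fd from its set spins at 100% CPU. The "read error -1 9" lines in the group_tool dump are the same symptom seen from the other side, since errno 9 is EBADF. A minimal bash illustration (the fd number is arbitrary):

```shell
# Reproduce the EBADF ("read error -1 9") half of the symptom in bash:
# reading via a descriptor that has been closed fails with errno 9.
exec 9</dev/null   # open fd 9
exec 9<&-          # ...and close it again
cat <&9            # fails with "Bad file descriptor" (errno 9, EBADF)
```

On an affected node you could (as root, hypothetically) compare the fd numbers in the strace poll set against `ls /proc/$(pidof gfs_controld)/fd`; an fd that strace still polls but that no longer exists in /proc is exactly the stale descriptor driving the busy loop.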