Hi everyone;

I have a problem on my 10-node cluster with ocfs2 1.2.9; the OS is RHEL 4.7 AS.

9 nodes can start the o2cb service and mount the SAN disks on startup, however one node cannot. My cluster configuration is:

node:
        ip_port = 7777
        ip_address = 192.168.5.1
        number = 0
        name = fa01
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.5.2
        number = 1
        name = fa02
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.5.3
        number = 2
        name = fa03
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.5.4
        number = 3
        name = fa04
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.5.5
        number = 4
        name = fa05
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.5.6
        number = 5
        name = fa06
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.5.7
        number = 6
        name = fa07
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.5.8
        number = 7
        name = fa08
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.5.10
        number = 8
        name = fa10
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.5.9
        number = 9
        name = fa09
        cluster = ocfs2

cluster:
        node_count = 10
        name = ocfs2

When I manually try to mount the disks I get an error that says:

"mount.ocfs2: Transport endpoint is not connected while mounting /dev/emcpowerc1 on /oradisk/conf. Check 'dmesg' for more information on this error."

And when I check dmesg I see the following line, repeated eight times:

(21125,1):o2hb_do_disk_heartbeat:982 ERROR: Device "emcpowerc1": another node is heartbeating in our slot!

I searched for similar problems but couldn't find anything. Could you help me?

Thanks

Mehmet Can ONAL
Hi everyone;

The problem I reported earlier has changed into an "o2net_connect_expired" error. After examining the logs and searching the internet, we ran the o2cb configuration again with the command

        /etc/init.d/o2cb configure

Our scenario is this: the problem appeared after the installation of RAC, and after reconfiguring o2cb our problem node can mount the SAN disks. What I want to ask is: is this something we should do after a RAC installation, and what could be the relation between this problem and the installation of RAC?

By the way, only one node has this problem; the others could mount the disks. If the others mount the disks the problem node cannot, but if the problem node mounts the disks the others cannot. What do you think?

Mehmet Can ONAL

________________________________
From: Mehmet Can ONAL
Sent: Thursday, September 18, 2008 1:17 PM
To: ocfs2-users at oss.oracle.com
Subject: o2hb_do_disk_heartbeat:982:ERROR
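Since "o2net_connect_expired" concerns the node-to-node TCP connection on the ip_port from cluster.conf (7777 in this thread), one quick thing to try from the problem node is a plain TCP connect to every other node. The following is only a minimal sketch: the function names are made up for illustration, the node list is copied from the configuration posted in this thread, and it assumes the o2cb stack is already online on the target nodes. A refused connection may simply mean nothing is listening there yet, so treat the result as a hint, not a diagnosis.

```python
import socket

# (name, ip_address, ip_port) taken from the cluster.conf posted in this thread.
NODES = [
    ("fa01", "192.168.5.1", 7777),
    ("fa02", "192.168.5.2", 7777),
    ("fa03", "192.168.5.3", 7777),
    ("fa04", "192.168.5.4", 7777),
    ("fa05", "192.168.5.5", 7777),
    ("fa06", "192.168.5.6", 7777),
    ("fa07", "192.168.5.7", 7777),
    ("fa08", "192.168.5.8", 7777),
    ("fa10", "192.168.5.10", 7777),
    ("fa09", "192.168.5.9", 7777),
]


def can_connect(ip, port, timeout=3.0):
    """Return True if a TCP connection to ip:port can be opened within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_nodes(nodes, connect=can_connect):
    """Map each node name to the result of the connection attempt."""
    return {name: connect(ip, port) for name, ip, port in nodes}


if __name__ == "__main__":
    for name, ok in check_nodes(NODES).items():
        print(f"{name}: {'reachable' if ok else 'NOT reachable'}")
```

Run it on the node that cannot mount; any node reported as not reachable is a candidate for a firewall or wrong-address check.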
Ensure that cluster.conf is the same on every node in the cluster. If it is not, correct it and restart the cluster.

The "Transport endpoint is not connected" error means that the TCP/IP connect failed. That could be because of an incorrect IP address, a firewall, or a bad cluster.conf.

The dmesg errors indicate that cluster.conf could be mismatched between nodes, for example two different nodes using the same node number.

Looking at cluster.conf, I notice that the IP addresses of nodes 8 and 9 do not follow the convention of the other nodes (node 8 is fa10 at 192.168.5.10, node 9 is fa09 at 192.168.5.9). It may be nothing, but it could explain the other errors.

Mehmet Can ONAL wrote:
> Hi everyone;
>
> I have a problem on my 10 nodes cluster with ocfs2 1.2.9 and the OS is
> RHEL 4.7 AS.
>
> 9 nodes can start o2cb service and mount san disks on startup however
> one node can not do that.
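The checks suggested above (same cluster.conf everywhere, no two nodes sharing a number) can be sketched as a small script. This is a rough sketch rather than an official tool: the parser assumes the simple "stanza:" plus "key = value" layout shown in this thread, the function names are invented for illustration, and the second sample is a deliberately broken, hypothetical file. The fingerprint is just a hash of the parsed content, so that the value printed on each node can be compared by eye.

```python
import hashlib
from collections import Counter

# Two node stanzas copied from the thread, shortened for the example.
SAMPLE = """\
node:
        ip_port = 7777
        ip_address = 192.168.5.1
        number = 0
        name = fa01
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.5.2
        number = 1
        name = fa02
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2
"""

# Hypothetical broken copy: fa02 reuses node number 0.
BROKEN = SAMPLE.replace("number = 1", "number = 0")


def parse_cluster_conf(text):
    """Split the file into stanzas, each a dict with a '_type' of 'node' or 'cluster'."""
    stanzas, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            current = {"_type": line[:-1]}
            stanzas.append(current)
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return stanzas


def find_problems(stanzas):
    """Report duplicate node numbers, addresses or names, and a wrong node_count."""
    nodes = [s for s in stanzas if s["_type"] == "node"]
    problems = []
    for field in ("number", "ip_address", "name"):
        counts = Counter(n.get(field) for n in nodes)
        for value, count in sorted(counts.items(), key=str):
            if count > 1:
                problems.append(f"duplicate {field}: {value} used by {count} nodes")
    for c in (s for s in stanzas if s["_type"] == "cluster"):
        declared = int(c.get("node_count", -1))
        if declared != len(nodes):
            problems.append(f"node_count = {declared} but {len(nodes)} node stanzas found")
    return problems


def fingerprint(stanzas):
    """Hash of the parsed content; identical files give identical fingerprints."""
    canonical = repr(sorted(sorted(s.items()) for s in stanzas))
    return hashlib.sha256(canonical.encode()).hexdigest()


if __name__ == "__main__":
    for label, text in (("sample", SAMPLE), ("broken", BROKEN)):
        parsed = parse_cluster_conf(text)
        print(label, fingerprint(parsed)[:12], find_problems(parsed) or "no problems found")
```

To use it on a real cluster, read the local cluster.conf instead of SAMPLE on every node and compare the fingerprints. A file can pass the duplicate checks on its own and still differ from the copy on another node, which is the mismatch case described above.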