I am trying to mount the same partition from a KVM Ubuntu 8.04.1 virtual machine and an Ubuntu 8.04.1 host server.

I am able to mount the partition just fine on two Ubuntu host servers; they both talk to each other. The logs on both servers show the other machine mounting and unmounting the drive.

However, when I mount the drive in the KVM VM I get no communication to the host servers. I have checked with tcpdump, and the VM doesn't even attempt to talk to the other cluster members. The VM just mounts the drive as if no one else is on the cluster, even though both the other nodes already have the drive mounted.

I have checked and rechecked all the settings: the cluster.conf is the same on all nodes, and the drive has the same UUID and the same label. The only thing that is different is the actual device name. On the host servers it is the AOE device '/dev/etherd/e0.1p11'; on the VM the '/dev/etherd/e0.1' device is mapped to '/dev/sdb', so the OCFS2 partition shows up as '/dev/sdb11'.

The only thing I can think of is that the device names have to be the same between all hosts, but that really doesn't make any sense to me. Any help would be greatly appreciated.

Thanks.

--
Bret Baptist
Senior Network Administrator
bbaptist at iexposure.com

Internet Exposure, Inc.
http://www.iexposure.com
(612)676-1946 x17

Providing Internet Services since 1995
Web Development ~ Search Engine Marketing ~ Web Analytics
Network Security ~ On Demand Tech Support ~ E-Mail Marketing
Hello Bret,

An obvious question, but have you tried disabling the firewall on the KVM VM? Also, are you able to ping the other two Ubuntu nodes from the KVM VM?

-----Original Message-----
From: ocfs2-users-bounces at oss.oracle.com [mailto:ocfs2-users-bounces at oss.oracle.com] On Behalf Of Bret Baptist
Sent: Thursday, 21 August 2008 21:32
To: ocfs2-users at oss.oracle.com
Subject: [Ocfs2-users] VM node won't talk to host

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users at oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
The host servers are also able to connect to the VM server.

Here is the cluster.conf:

node:
        ip_port = 7777
        ip_address = 10.1.1.20
        number = 0
        name = wedge
        cluster = iecluster

node:
        ip_port = 7777
        ip_address = 10.1.1.21
        number = 1
        name = porkins
        cluster = iecluster

node:
        ip_port = 7777
        ip_address = 10.1.1.4
        number = 2
        name = opennebula
        cluster = iecluster

cluster:
        node_count = 3
        name = iecluster

The o2cb configuration:

O2CB_HEARTBEAT_THRESHOLD=61
O2CB_IDLE_TIMEOUT_MS=10000
O2CB_KEEPALIVE_DELAY_MS=5000
O2CB_RECONNECT_DELAY_MS=2000

I have the VM connecting to a bridge that is on the host server. In this case 10.1.1.21 is assigned to the bridge br1, and the VM opennebula has an IP address of 10.1.1.4 on this bridge as well.

Let me know if there are any other details of the setup you would need to know.

Thank you very much for the help.

Bret.

On Thursday 21 August 2008 14:55:43 Herbert van den Bergh wrote:
> What about from the host server(s) to the VM? And what does
> cluster.conf look like?
>
> Basically, all nodes need to be able to connect to all others' OCFS2 port.
>
> Thanks,
> Herbert.
>
> Bret Baptist wrote:
> > On Thursday 21 August 2008 14:37:09 Wessel wrote:
> >> Hello Bret,
> >>
> >> An obvious question, but have you tried disabling the firewall on the
> >> KVM VM? Also, are you able to ping the other two Ubuntu nodes from the
> >> KVM VM?
> >
> > There is no firewall enabled on the VM; in fact, iptables is not even
> > installed.
> >
> > I am able to ping and do other communication from the VM to the host
> > server.
> >
> > Bret.
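[Editor's note: given the cluster.conf in this thread, a quick way to rule out o2net connectivity problems is to confirm that each node can open a TCP connection to every other node's port 7777. A minimal sketch, assuming bash and the coreutils `timeout` command; the node IPs are taken from the cluster.conf above:]

```shell
#!/bin/bash
# Sketch: check TCP reachability of the o2net port (7777) on each
# cluster node, using bash's built-in /dev/tcp so no extra tools are
# needed. Run it on every node; all peers should report "ok".
check_o2net_mesh() {
    local port=7777
    local ip
    for ip in 10.1.1.20 10.1.1.21 10.1.1.4; do
        # timeout kills the probe if the host is unreachable/filtered.
        if timeout 2 bash -c ">/dev/tcp/$ip/$port" 2>/dev/null; then
            echo "ok   $ip:$port"
        else
            echo "FAIL $ip:$port"
        fi
    done
}

check_o2net_mesh
```

A "FAIL" line for a node that is up usually points at a firewall or a bridge/VLAN problem rather than at OCFS2 itself.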
Bret,

Unless I'm misunderstanding you and what you're saying, I assure you that's not the case. I have a mounted OCFS2 volume running right now on five servers. Some are /dev/sdb1, others /dev/sdc1, and yet others /dev/sdm1.

Michael

Bret wrote:
> Turns out that you DO have to have the same device name on all nodes,
> even though the UUID is the same. I pushed the network card for AOE on
> the host into the VM and used the same device names. Like magic, OCFS2
> on the VM starts talking to the host. That seems like a pretty serious
> limitation to me.
>
> Does anyone with some knowledge of the code have any input on this
> shortcoming?
>
> Thank you.
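[Editor's note: since the disagreement here is about per-node device names versus the on-disk UUID, one way to check what each node actually sees is to read the filesystem identity straight from the superblock on each node. A hedged sketch using `blkid` from util-linux; `show_fs_identity` is a hypothetical helper name, not a tool from the thread:]

```shell
#!/bin/bash
# Sketch: print the UUID, label, and type each node sees for its local
# device name. The device path is a per-node argument, so it does not
# matter that it is /dev/etherd/e0.1p11 on one node and /dev/sdb11 on
# another -- the UUID and label lines should match across all nodes.
show_fs_identity() {
    local dev="$1"
    # blkid reads the superblock directly; requires read access to $dev.
    blkid -o export "$dev" 2>/dev/null | grep -E '^(UUID|LABEL|TYPE)='
}

# Example usage (device names are per-node; run as root):
#   show_fs_identity /dev/etherd/e0.1p11   # on the host servers
#   show_fs_identity /dev/sdb11            # on the VM
```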
On Thursday 28 August 2008 18:59:07 Sunil Mushran wrote:
> If the VM is not seeing the host heartbeat, the issue is not with
> heartbeat, but the fact that the VM I/Os are not hitting the actual
> device. Buffered? See if there is some way to disable buffering in the
> KVM emulated IDE disk.

I was thinking we were on to something here. However, I tried mounting an OCFS2 file system between two KVM VMs on the host server, accessing the AOE partition through IDE emulation, and everything worked exactly like I would expect.

Color me really confused as to why mounting the OCFS2 disk on the host server and a VM running off that server would not work.

Bret.

> Bret Baptist wrote:
> > I mounted the volume on the host server first. I watched the heartbeat
> > debugging. After the mount on the host I saw it doing a heartbeat on
> > the device. Kernel logs from mounting the device:
> > [112893.823300] ocfs2_dlm: Nodes in domain ("2CE50B6318E44D21B18F0A7B93CA27FC"): 1
> > [112893.895672] kjournald starting. Commit interval 5 seconds
> > [112893.896247] ocfs2: Mounting device (152,12) on (node 1, slot 0) with ordered data mode.
> >
> > I then mounted the same device mapped into the VM using the KVM
> > emulated IDE disk type and showing up in the VM as a SATA drive. I was
> > able to mount the drive, and the VM thought it was the only cluster
> > member mounting the device:
> > [ 2706.845601] ocfs2_dlm: Nodes in domain ("2CE50B6318E44D21B18F0A7B93CA27FC"): 2
> > [ 2706.848441] kjournald starting. Commit interval 5 seconds
> > [ 2706.849692] ocfs2: Mounting device (8,28) on (node 2, slot 0) with ordered data mode.
> >
> > debugfs.ocfs2 on the host server (node 1), the first to mount the
> > device, started showing the VM heartbeating on the device. Then this
> > kernel message was displayed:
> > [111004.800566] (5732,1):o2net_connect_expired:1560 ERROR: no connection established with node 2 after 10.0 seconds, giving up and returning errors.
> >
> > debugfs.ocfs2 on the VM never showed the host server heartbeating on
> > the device at all. Also on the VM (node 2), I received no message about
> > it not being able to establish a connection.
> >
> > From what I can tell, the VM is not even recognizing that the host
> > server is heartbeating on the device.
> >
> > You say check the device; I know for a fact that the device is working
> > fine. I can connect to the device using the AOE protocol over ethernet
> > on the VM and everything works like I expect. It is just when I map the
> > device into the VM using the KVM emulated IDE disk type that I have
> > issues. Is there any other debugging we can do to figure out why this
> > would be?
> >
> > Just a note, I also tried accessing the device in the VM using the KVM
> > paravirtualized block I/O driver (virtio_blk). I received the exact
> > same results.
> >
> > Thank you very much for your help.
> >
> > Bret.
> >
> > On Monday 25 August 2008 19:34:55 Sunil Mushran wrote:
> >> No, the device names have nothing to do with it.
> >>
> >> When you mount, mount.ocfs2 kicks off the heartbeat. When other nodes
> >> see a new node heartbeating, o2net attempts to connect to the node.
> >> That connect is necessary for the mount to succeed.
> >>
> >> My investigation would start with disk heartbeat.
> >>
> >> # watch -d -n2 "debugfs.ocfs2 -R \"hb\" /dev/sdX"
> >>
> >> Do this on the node that has it mounted. You should see your node
> >> heartbeating. When you mount on the other node, you should see that
> >> other node heartbeating. If not, check the device.
> >>
> >> Sunil
On Friday 29 August 2008 18:38:08 Bret Baptist wrote:
> On Thursday 28 August 2008 18:59:07 Sunil Mushran wrote:
> > If the VM is not seeing the host heartbeat, the issue is not with
> > heartbeat, but the fact that the VM I/Os are not hitting the actual
> > device. Buffered? See if there is some way to disable buffering in
> > the KVM emulated IDE disk.
>
> I was thinking we were on to something here. However, I tried mounting
> an OCFS2 file system between two KVM VMs on the host server, accessing
> the AOE partition through IDE emulation, and everything worked exactly
> like I would expect.

Two VMs running on separate host servers have the same issue as the host-to-VM case does. They both think they are the only cluster member mounting the device.

Bret.
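[Editor's note: Sunil's buffering theory above suggests one concrete thing to try: attach the disk to the guest with host page caching disabled, so guest heartbeat reads and writes hit the shared device directly. A hedged sketch of such an invocation; the exact flags are illustrative and not taken from the thread, and `cache=none` is assumed to be supported by the KVM/QEMU version in use:]

```shell
#!/bin/bash
# Sketch: start the guest with the shared AOE device attached as an
# emulated disk, but with the host page cache bypassed (cache=none).
# With the default writeback caching, each side can read stale cached
# blocks and never observe the other node's heartbeat writes.
kvm_cmd="kvm -m 512 \
  -drive file=/dev/etherd/e0.1,if=ide,cache=none \
  -net nic,macaddr=52:54:00:12:34:56 -net tap"

echo "$kvm_cmd"
```

The same `cache=none` setting applies when the drive is attached with `if=virtio`, which would also cover the virtio_blk case tried above.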