Hello List,

due to a mistake my post from yesterday was cut off, so I am sending it again as a new thread. I hope it works this time.

<---- Original posted mail starts here ---->

Hello Marcel, hello Samuel,

sorry for my late answer, but I was away for two months and could therefore only continue my tests last week.

First of all, thank you for your patch of the Filesystem RA. It works like a charm, but I have a few small remarks. What I found out is that the filesystem access test via OCF_CHECK_LEVEL does not work with glusterfs.

If I use the nvpair OCF_CHECK_LEVEL with a value of 10, I get an err_message with the content:

    '192.168.51.1:/gl_vol0 is not a block device, monitor 10 is noop'

If I use the nvpair OCF_CHECK_LEVEL with a value of 20, I get an err_message with the content:

    'ERROR: dd said: dd: opening `/virtfs0/.Filesystem_status/res_glusterfs_sp0:0_0_vmhost1': Invalid argument'

After that, the resource tries to restart permanently. Unfortunately I am not familiar enough with scripting to fix this myself and contribute it. (A sketch of the resource definition I am testing follows below my signature.)

Another item I would like to discuss is a bit more general. As Samuel pointed out, the Filesystem RA (with the native client) needs the gluster node it connects to (via the device attribute of the Filesystem RA) up and running only at startup of the client. After that, the native client detects by itself whether a gluster node is gone. This is correct so far, but in my setup it could be a SPOF.

I would like to build a cluster of three machines (A, B, C) and start a Filesystem RA clone on all three cluster nodes. Each of these nodes is a glusterfs server offering a replicated glusterfs share, and also a glusterfs client which mounts that share (from server A initially). If all three servers are up, there is no problem. Even if one of the servers goes down, everything keeps working. But if the node the clients connected to at startup (server A) crashes and I afterwards need to reboot one of the remaining servers (B or C), that server cannot reconnect as a client because node A is still down.

From my point of view, a solution could be to add some nvpairs (e.g. glusterhost1=IPorName1ofNodeA:/glustervolume, glusterhost2=IPorName2ofNodeB:/glustervolume, ... glusterhostN=IPorNameNofNodeN:/glustervolume). These pairs could be pinged and/or tested before the Filesystem RA tries to connect to them. If one of these nodes is not reachable or does not respond to the connection attempt, the RA could try the next nvpair. (See the mount-option sketch at the end of this mail for the kind of fallback I have in mind.)

Background: I would like to build an openais/pacemaker cluster consisting of three nodes. On each node should run a gluster server providing a replicated glusterfs share, a glusterfs client (Filesystem RA clone) connected to this share, and one or more KVM VMs. For load reasons the VMs should be distributed over the cluster. If one of these servers crashes, the affected VMs shall fail over to the remaining nodes.

I hope I was able to explain my concerns, and that you or anybody else can give me a hint to solve my problem.

Thx in advance
Uwe
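P.S. In case it helps to reproduce: the resource is defined roughly like this (crm shell syntax; device, mount point and the resource name match the errors above, while timeouts and the clone name are only examples):

    # Sketch only - device/directory/name taken from the errors above,
    # timeouts and the clone id are examples, not my exact setup.
    primitive res_glusterfs_sp0 ocf:heartbeat:Filesystem \
        params device="192.168.51.1:/gl_vol0" directory="/virtfs0" \
               fstype="glusterfs" \
        op monitor interval="20s" timeout="40s" OCF_CHECK_LEVEL="10" \
        op start timeout="60s" \
        op stop timeout="60s"
    # Clone it so every cluster node mounts the share.
    clone cl_glusterfs res_glusterfs_sp0

Setting OCF_CHECK_LEVEL="10" (or "20") on the monitor operation, as shown, is what triggers the messages quoted above.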
-----Original Message-----
From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On behalf of Marcel Pennewiß
Sent: Monday, 18 July 2011 14:00
To: gluster-users at gluster.org
Subject: Re: [Gluster-users] glusterfs and pacemaker

On Monday 18 July 2011 13:26:00 samuel wrote:
> I don't know from which version on but, if you use the native client
> for mounting the volumes, it's only required to have the IP active in
> the mount moment. After that, the native client will transparently
> manage node's failure.

ACK, that's why we use this shared IP (e.g. for backup issues via nfs). AFAIR glusterFS retrieves the volfile (via the shared IP) and connects to the nodes.

Marcel
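The mount-option sketch referenced above: depending on the installed glusterfs version, mount.glusterfs may already support a backupvolfile-server option that does part of what I describe. This is an assumption about your release, and the second IP is just a placeholder for another gluster node:

    # Assumption: your glusterfs release's mount.glusterfs understands
    # backupvolfile-server (check 'man mount.glusterfs' first).
    # 192.168.51.2 is a placeholder for a second gluster server.
    mount -t glusterfs -o backupvolfile-server=192.168.51.2 \
        192.168.51.1:/gl_vol0 /virtfs0

    # The Filesystem RA can pass the same option via its 'options' parameter:
    #   params device="192.168.51.1:/gl_vol0" directory="/virtfs0" \
    #          fstype="glusterfs" options="backupvolfile-server=192.168.51.2"

This only covers the fallback at mount time; it does not ping or probe the nodes beforehand as proposed above.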
Marcel Pennewiß
2011-Sep-12 14:53 UTC
[Gluster-users] glusterfs, pacemaker and Filesystem RA
On Monday 12 September 2011 15:14:05 Uwe Weiss wrote:
> Hello Marcel, hello Samuel,

Hi,

> [something about OCF_CHECK_LEVEL]
>
> Unfortunately I am not familiar enough with scripting to fix it by myself
> and to contribute it.

I'm not familiar with this either, but I'll have a look at it as soon as my glusterfs virtual machines are back after maintenance.

> Another item I would like to discuss is a bit more general.
> As Samuel pointed out, the Filesystem RA (with the native client) needs the
> gluster node it connects to (via the device attribute of the Filesystem RA)
> up and running only at startup of the client.
>
> After that, the native client detects by itself whether a gluster node is
> gone. This is correct so far, but in my setup it could be a SPOF.

AFAIR the client gets the volfile after the first connect. Afterwards it keeps running even if a node goes down (if you're using a replicated setup). So the first connect / mount is the critical moment.

We're using a virtual IP which is assigned to an active node (a rough sketch follows below). This works fine here even if only one of the gluster nodes is online.

Marcel
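Roughly like this in crm shell (the IP address, netmask and resource names are only illustrative; cl_glusterfs stands for whatever Filesystem clone does the mount):

    # Illustrative sketch - replace IP, netmask and resource names with yours.
    primitive p_gluster_ip ocf:heartbeat:IPaddr2 \
        params ip="192.168.51.10" cidr_netmask="24" \
        op monitor interval="10s"
    # Mount via the shared IP (device="192.168.51.10:/gl_vol0") and make sure
    # the IP is up somewhere before the clients try to mount:
    order o_ip_before_mount inf: p_gluster_ip cl_glusterfs

Since the IP only matters for volfile retrieval at mount time, it can sit on any one node; the mounts themselves keep working if it later moves.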