Goldwyn Rodrigues
2013-Sep-06 03:26 UTC
[Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld
Hi,

I am re-sending this patch series because I did not get a response to the previous set.

This is an effort to remove ocfs2_controld.pcmk and bring ocfs2 DLM handling up to date with respect to DLM (>=4.0.1) and corosync (2.3.x). AFAIK, cman is also being phased out in favor of a unified corosync cluster stack.

fs/dlm performs all the functions related to fencing and node management and provides the APIs for ocfs2 to use them. In all further references, DLM stands for the fs/dlm code.

The advantages are:
+ No need to run an additional userspace daemon (ocfs2_controld)
+ No controld device handling and controld protocol
+ Node management responsibilities shift to the DLM layer
+ Huge reduction in source code, both in kernel and userspace

This feature requires modifications to the userspace ocfs2-tools. The changes can be found at:
https://github.com/goldwynr/ocfs2-tools branch: nocontrold
Currently, not many checks are present in the userspace code, but that will change soon.

These changes were developed on linux-stable 3.10.y, though they apply to current upstream as well. If you want to give the entire kernel a spin, the link is:
https://github.com/goldwynr/linux-stable branch: nocontrold

Review comments/suggestions/criticism welcome.

--
Goldwyn
Lars Marowsky-Bree
2013-Sep-06 11:27 UTC
[Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld
On 2013-09-05T22:26:56, Goldwyn Rodrigues <rgoldwyn at suse.de> wrote:

Hi Goldwyn,

thanks! This looks really good.

> This is an effort of removing ocfs2_controld.pcmk and getting ocfs2 DLM
> handling up to the times with respect to DLM (>=4.0.1) and corosync
> (2.3.x). AFAIK, cman also is being phased out for a unified corosync
> cluster stack.

That's clearly necessary, also to bring OCFS2 more up to date with the latest happenings in the GFS2 world; it'll allow both file systems to share exactly the same cluster stack.

> https://github.com/goldwynr/ocfs2-tools branch: nocontrold
> Currently, not many checks are present in the userspace code,
> but that would change soon.

There's one question I have; how will this handle

- the "old" user-space code starting on a new kernel,
- or the "new" user-space code being run on an old kernel?

Is there anything we can do to at least provide a meaningful error message in the first case? The second should be easier to handle.

Regards,
Lars

--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
Goldwyn Rodrigues
2013-Sep-06 19:13 UTC
[Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld
Hi Lars,

On 09/06/2013 06:22 AM, Lars Marowsky-Bree wrote:
> There's one question I have; how will this handle
>
> - the "old" user-space code starting on a new kernel,

ocfs2_controld.pcmk will refuse to start because the control device normally created by the kernel is absent. Of course, this would deny mounts as well.

> - or the "new" user-space code being run on an old kernel?

The kernel code will fail, citing the reason: the userspace daemon is not present. The userspace complains (ESRCH):

mount.ocfs2: No such process while mounting /dev/sdc1 on /mnt. Check 'dmesg' for more information on this error.

> Is there anything we can do to at least provide a meaningful error
> message in the first case? The second should be easier to handle.

Yes, in the second case we can capture the error code and ask the user to upgrade. In the first case, however, mount.ocfs2 would report a cluster connect failure because ocfs2_controld is not running.

On a different note, we should consider increasing the kernel module version shown in dmesg so that it is in sync with the userspace tools, and/or possibly increasing the version number of both the tools and the kernel module.

--
Goldwyn
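
The "capture the error code and ask the user to upgrade" idea for the second case could look roughly like the sketch below. It is only an illustration in Python, not code from ocfs2-tools: the helper name `explain_mount_failure` and the wording of the hint are hypothetical, and it assumes that ESRCH from the mount call is a reliable enough sign that the kernel still expects ocfs2_controld.

```python
import errno
import os


def explain_mount_failure(err, device, mountpoint):
    """Build an error message for a failed mount (hypothetical helper).

    Assumption: ESRCH means the kernel was looking for the userspace
    daemon, i.e. new tools are running on a kernel without nocontrold.
    """
    message = "mount.ocfs2: {} while mounting {} on {}.".format(
        os.strerror(err), device, mountpoint
    )
    if err == errno.ESRCH:
        # Second case from the thread: suggest an upgrade instead of
        # only pointing at dmesg.
        hint = (
            "The running kernel appears to expect ocfs2_controld; "
            "please upgrade the kernel to one with nocontrold support."
        )
    else:
        # Any other error keeps the generic advice.
        hint = "Check 'dmesg' for more information on this error."
    return message + " " + hint


if __name__ == "__main__":
    print(explain_mount_failure(errno.ESRCH, "/dev/sdc1", "/mnt"))
```

Any other errno falls through to the existing generic message, so the only behavioural change is the extra hint for ESRCH.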
Mark Fasheh
2013-Sep-06 19:40 UTC
[Ocfs2-devel] [PATCH 0/6] nocontrold: Eliminating ocfs2_controld
Firstly, thanks for developing this, Goldwyn. I've been looking at the patches, and plan to review the series (in kernel) for you.

Quick question - do you have a pointer handy to the development stream on the dlm / gfs side of things? It would be instructive to see how it was done in more than one place.

--Mark

--
Mark Fasheh