Gang He
2016-May-20 08:12 UTC
[Ocfs2-devel] [PATCH] ocfs2: adjust arguments in dlm_new_lockspace called by kernel
We encountered a bug from the customer, the user did a fsck.ocfs2 on the file system and exited unusually, the lockspace (with LVB size = 32) was left in the kernel space, next, the user mounted this file system, the kernel module did not create a new lockspace (LVB size = 64) via calling dlm_new_lockspace() function in mounting stage, just used the existing lockspace, created by the user space tool, this would lead the user was not able to mount this file system from the other nodes, with the error message likes, dlm: 032F5......: config mismatch: 64,0 nodeid 177127961: 32,0 (mount.ocfs2,26981,46):ocfs2_dlm_init:2995 ERROR: status = -71 ocfs2_mount_volume:1881 ERROR: status = -71 ocfs2_fill_super:1236 ERROR: status = -71 The user was very difficult to find the root cause, then, we brought out this patch to relieve such problem. First, we add one more flag in calling dlm_new_lockspace() function, to make sure the lockspace is created by kernel module itself, and this change will not affect the backward compatibility. Second, the obvious error message is reported in the kernel log, let the user be more easy to find the root cause. Gang He (1): ocfs2: insure dlm lockspace is created by kernel module fs/ocfs2/stack_user.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) -- 1.8.5.6
Gang He
2016-May-20 08:12 UTC
[Ocfs2-devel] [PATCH] ocfs2: insure dlm lockspace is created by kernel module
This patch will be used to insure the dlm lockspace is created by kernel module when mounting a ocfs2 file system. There are two ways to create a lockspace, from user space and kernel space, but the same name lockspaces probably have different lvblen lengths/flags. To avoid this mix using, we add one more flag DLM_LSFL_NEWEXCL, it will make sure the dlm lockspace is created by kernel module when mounting. Secondly, if a user space program (ocfs2-tools) is running on a file system, the user tries to mount this file system in the cluster, DLM module will return a -EEXIST or -EPROTO errno, we should give the user a obvious error message, then, the user can let that user space tool exit before mounting the file system again. Signed-off-by: Gang He <ghe at suse.com> --- fs/ocfs2/stack_user.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/fs/ocfs2/stack_user.c b/fs/ocfs2/stack_user.c index ced70c8..c9e828e 100644 --- a/fs/ocfs2/stack_user.c +++ b/fs/ocfs2/stack_user.c @@ -1007,10 +1007,17 @@ static int user_cluster_connect(struct ocfs2_cluster_connection *conn) lc->oc_type = NO_CONTROLD; rc = dlm_new_lockspace(conn->cc_name, conn->cc_cluster_name, - DLM_LSFL_FS, DLM_LVB_LEN, + DLM_LSFL_FS | DLM_LSFL_NEWEXCL, DLM_LVB_LEN, &ocfs2_ls_ops, conn, &ops_rv, &fsdlm); - if (rc) + if (rc) { + if (rc == -EEXIST || rc == -EPROTO) + printk(KERN_ERR "ocfs2: Unable to create the " + "lockspace %s (%d), because a ocfs2-tools " + "program is running on this file system " + "with the same name lockspace\n", + conn->cc_name, rc); goto out; + } if (ops_rv == -EOPNOTSUPP) { lc->oc_type = WITH_CONTROLD; -- 1.8.5.6