Sunil Mushran
2009-Feb-03 20:37 UTC
[Ocfs2-devel] [PATCH 1/4] ocfs2/dlm: Retract fix for race between purge and migrate
Mainline commit d4f7e650e55af6b235871126f747da88600e8040 attempts to delay the dlm_thread from sending the drop ref message if the lockres is being migrated. The problem is that we make the dlm_thread wait for the migration to complete. This causes a deadlock as dlm_thread also participates in the lockres migration process. A better fix for the original oss bugzilla#1012 is in testing. Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com> --- fs/ocfs2/dlm/dlmthread.c | 3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c index d129520..4060bb3 100644 --- a/fs/ocfs2/dlm/dlmthread.c +++ b/fs/ocfs2/dlm/dlmthread.c @@ -181,8 +181,7 @@ static int dlm_purge_lockres(struct dlm_ctxt *dlm, spin_lock(&res->spinlock); /* This ensures that clear refmap is sent after the set */ - __dlm_wait_on_lockres_flags(res, (DLM_LOCK_RES_SETREF_INPROG | - DLM_LOCK_RES_MIGRATING)); + __dlm_wait_on_lockres_flags(res, DLM_LOCK_RES_SETREF_INPROG); spin_unlock(&res->spinlock); /* clear our bit from the master's refmap, ignore errors */ -- 1.5.6.3
Sunil Mushran
2009-Feb-03 20:37 UTC
[Ocfs2-devel] [PATCH 2/4] ocfs2: Cleanup the lockname print in dlmglue.c
The dentry lock has a different format than other locks. This patch fixes ocfs2_log_dlm_error() macro to make it print the dentry lock correctly. Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com> --- fs/ocfs2/dlmglue.c | 11 ++++++++--- 1 files changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c index 206a237..7219a86 100644 --- a/fs/ocfs2/dlmglue.c +++ b/fs/ocfs2/dlmglue.c @@ -320,9 +320,14 @@ static void ocfs2_schedule_blocked_lock(struct ocfs2_super *osb, struct ocfs2_lock_res *lockres); static inline void ocfs2_recover_from_dlm_error(struct ocfs2_lock_res *lockres, int convert); -#define ocfs2_log_dlm_error(_func, _err, _lockres) do { \ - mlog(ML_ERROR, "DLM error %d while calling %s on resource %s\n", \ - _err, _func, _lockres->l_name); \ +#define ocfs2_log_dlm_error(_func, _err, _lockres) do { \ + if ((_lockres)->l_type != OCFS2_LOCK_TYPE_DENTRY) \ + mlog(ML_ERROR, "DLM error %d while calling %s on resource %s\n", \ + _err, _func, _lockres->l_name); \ + else \ + mlog(ML_ERROR, "DLM error %d while calling %s on resource %.*s%08x\n", \ + _err, _func, OCFS2_DENTRY_LOCK_INO_START - 1, (_lockres)->l_name, \ + (unsigned int)ocfs2_get_dentry_lock_ino(_lockres)); \ } while (0) static int ocfs2_downconvert_thread(void *arg); static void ocfs2_downconvert_on_unlock(struct ocfs2_super *osb, -- 1.5.6.3
Sunil Mushran
2009-Feb-03 20:37 UTC
[Ocfs2-devel] [PATCH 3/4] ocfs2/dlm: Use ast_lock to protect ast_list
The code was using dlm->spinlock instead of dlm->ast_lock to protect the ast_list. This patch fixes the issue. Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com> --- fs/ocfs2/dlm/dlmunlock.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c index 86ca085..fcf879e 100644 --- a/fs/ocfs2/dlm/dlmunlock.c +++ b/fs/ocfs2/dlm/dlmunlock.c @@ -117,11 +117,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm, else BUG_ON(res->owner == dlm->node_num); - spin_lock(&dlm->spinlock); + spin_lock(&dlm->ast_lock); /* We want to be sure that we're not freeing a lock * that still has AST's pending... */ in_use = !list_empty(&lock->ast_list); - spin_unlock(&dlm->spinlock); + spin_unlock(&dlm->ast_lock); if (in_use) { mlog(ML_ERROR, "lockres %.*s: Someone is calling dlmunlock " "while waiting for an ast!", res->lockname.len, -- 1.5.6.3
Sunil Mushran
2009-Feb-03 20:37 UTC
[Ocfs2-devel] [PATCH 4/4] ocfs2/dlm: Make dlm_assert_master_handler() kill itself instead of the asserter
In dlm_assert_master_handler(), if we get an incorrect assert master from a node that, we reply with EINVAL asking the asserter to die. The problem is that an assert is sent after so many hoops, it is invariably the node that thinks the asserter is wrong, is actually wrong. So instead of killing the asserter, this patch kills the assertee. This patch papers over a race that is still being addressed. Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com> --- fs/ocfs2/dlm/dlmmaster.c | 12 ++++++------ 1 files changed, 6 insertions(+), 6 deletions(-) diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c index 54e182a..0a28139 100644 --- a/fs/ocfs2/dlm/dlmmaster.c +++ b/fs/ocfs2/dlm/dlmmaster.c @@ -1849,12 +1849,12 @@ int dlm_assert_master_handler(struct o2net_msg *msg, u32 len, void *data, if (!mle) { if (res->owner != DLM_LOCK_RES_OWNER_UNKNOWN && res->owner != assert->node_idx) { - mlog(ML_ERROR, "assert_master from " - "%u, but current owner is " - "%u! (%.*s)\n", - assert->node_idx, res->owner, - namelen, name); - goto kill; + mlog(ML_ERROR, "DIE! Mastery assert from %u, " + "but current owner is %u! (%.*s)\n", + assert->node_idx, res->owner, namelen, + name); + __dlm_print_one_lock_resource(res); + BUG(); } } else if (mle->type != DLM_MLE_MIGRATION) { if (res->owner != DLM_LOCK_RES_OWNER_UNKNOWN) { -- 1.5.6.3
Joel Becker
2009-Feb-11 07:31 UTC
[Ocfs2-devel] [PATCH 1/4] ocfs2/dlm: Retract fix for race between purge and migrate
looks good. On Tue, Feb 03, 2009 at 12:37:13PM -0800, Sunil Mushran wrote:> Mainline commit d4f7e650e55af6b235871126f747da88600e8040 attempts to delay > the dlm_thread from sending the drop ref message if the lockres is being > migrated. The problem is that we make the dlm_thread wait for the migration > to complete. This causes a deadlock as dlm_thread also participates in the > lockres migration process. > > A better fix for the original oss bugzilla#1012 is in testing. > > Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com> > --- > fs/ocfs2/dlm/dlmthread.c | 3 +-- > 1 files changed, 1 insertions(+), 2 deletions(-) > > diff --git a/fs/ocfs2/dlm/dlmthread.c b/fs/ocfs2/dlm/dlmthread.c > index d129520..4060bb3 100644 > --- a/fs/ocfs2/dlm/dlmthread.c > +++ b/fs/ocfs2/dlm/dlmthread.c > @@ -181,8 +181,7 @@ static int dlm_purge_lockres(struct dlm_ctxt *dlm, > > spin_lock(&res->spinlock); > /* This ensures that clear refmap is sent after the set */ > - __dlm_wait_on_lockres_flags(res, (DLM_LOCK_RES_SETREF_INPROG | > - DLM_LOCK_RES_MIGRATING)); > + __dlm_wait_on_lockres_flags(res, DLM_LOCK_RES_SETREF_INPROG); > spin_unlock(&res->spinlock); > > /* clear our bit from the master's refmap, ignore errors */ > -- > 1.5.6.3 > > > _______________________________________________ > Ocfs2-devel mailing list > Ocfs2-devel at oss.oracle.com > http://oss.oracle.com/mailman/listinfo/ocfs2-devel-- "Behind every successful man there's a lot of unsuccessful years." - Bob Brown Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127
Joel Becker
2009-Feb-11 07:31 UTC
[Ocfs2-devel] [PATCH 2/4] ocfs2: Cleanup the lockname print in dlmglue.c
sobby On Tue, Feb 03, 2009 at 12:37:14PM -0800, Sunil Mushran wrote:> The dentry lock has a different format than other locks. This patch fixes > ocfs2_log_dlm_error() macro to make it print the dentry lock correctly. > > Signed-off-by: Sunil Mushran <sunil.mushran at oracle.com> > --- > fs/ocfs2/dlmglue.c | 11 ++++++++--- > 1 files changed, 8 insertions(+), 3 deletions(-) > > diff --git a/fs/ocfs2/dlmglue.c b/fs/ocfs2/dlmglue.c > index 206a237..7219a86 100644 > --- a/fs/ocfs2/dlmglue.c > +++ b/fs/ocfs2/dlmglue.c > @@ -320,9 +320,14 @@ static void ocfs2_schedule_blocked_lock(struct ocfs2_super *osb, > struct ocfs2_lock_res *lockres); > static inline void ocfs2_recover_from_dlm_error(struct ocfs2_lock_res *lockres, > int convert); > -#define ocfs2_log_dlm_error(_func, _err, _lockres) do { \ > - mlog(ML_ERROR, "DLM error %d while calling %s on resource %s\n", \ > - _err, _func, _lockres->l_name); \ > +#define ocfs2_log_dlm_error(_func, _err, _lockres) do { \ > + if ((_lockres)->l_type != OCFS2_LOCK_TYPE_DENTRY) \ > + mlog(ML_ERROR, "DLM error %d while calling %s on resource %s\n", \ > + _err, _func, _lockres->l_name); \ > + else \ > + mlog(ML_ERROR, "DLM error %d while calling %s on resource %.*s%08x\n", \ > + _err, _func, OCFS2_DENTRY_LOCK_INO_START - 1, (_lockres)->l_name, \ > + (unsigned int)ocfs2_get_dentry_lock_ino(_lockres)); \ > } while (0) > static int ocfs2_downconvert_thread(void *arg); > static void ocfs2_downconvert_on_unlock(struct ocfs2_super *osb, > -- > 1.5.6.3 > > > _______________________________________________ > Ocfs2-devel mailing list > Ocfs2-devel at oss.oracle.com > http://oss.oracle.com/mailman/listinfo/ocfs2-devel-- "There is no sincerer love than the love of food." - George Bernard Shaw Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127