Hi Mark,
I made a small change to your patch. I removed the signal-processing
code, because kernel threads don't receive signals by default. This
patch fixes the hang in thread ocfs_submit_thread(). However, there is
another deadlock on osb->publish_lock between thread
ocfs_volume_thread() and routine ocfs_journal_set_unmounted(). I am
tracking this problem now.
Following is the revised patch for journal.c
--------------------------------------------------------------
--- journal.c.old 2004-03-25 10:44:20.000000000 +0800
+++ journal.c 2004-03-25 10:57:53.000000000 +0800
@@ -1765,7 +1765,10 @@
LOG_TRACE_STR("FLUSH_EVENT: timed out");
break;
case -EINTR:
- finish = 1;
+ /* journal shutdown has asked me to do
+ * one last commit cache and then exit */
+ if (journal->state == OCFS_JOURNAL_IN_SHUTDOWN)
+ finish = 1;
LOG_TRACE_STR("FLUSH_EVENT: interrupted");
break;
case 0:
@@ -1778,7 +1781,7 @@
if ((OcfsGlobalCtxt.flags & OCFS_FLAG_SHUTDOWN_VOL_THREAD) ||
(osb->osb_flags & OCFS_OSB_FLAGS_BEING_DISMOUNTED))
- break;
+ finish = 1;
//if (!osb->needs_flush && status != 0)
// continue;
@@ -1788,18 +1791,13 @@
if (down_trylock(&osb->trans_lock) != 0) {
LOG_TRACE_ARGS("commit thread: trylock failed, miss=%d\n",
misses);
- if (++misses < OCFS_COMMIT_MISS_MAX)
+ if (++misses < OCFS_COMMIT_MISS_MAX && finish == 0)
continue;
LOG_TRACE_ARGS("commit thread: about to down\n");
down(&osb->trans_lock);
misses = 0;
}
- /* journal shutdown has asked me to do one last commit cache */
- /* this commit cache will leave trans lock held! */
- if (journal->state == OCFS_JOURNAL_IN_SHUTDOWN)
- finish = 1;
-
status = ocfs_commit_cache (osb, false);
if (status < 0)
LOG_ERROR_STATUS(status);
On Thu, Mar 25, 2004 at 05:26:13PM +0800, Sonic Zhang wrote:
> Hi Mark,
>
> I made a small change to your patch. I removed the signal-processing
> code, because kernel threads don't receive signals by default.

Ours does! If you look in ocfs_daemonize, you'll see the following:

    /* Block all signals except SIGKILL, SIGSTOP, SIGHUP and SIGINT */
    #ifdef HAVE_NPTL
    spin_lock_irq (&current->sighand->siglock);
    tmpsig = current->blocked;
    siginitsetinv (&current->blocked, SHUTDOWN_SIGS);
    recalc_sigpending ();
    spin_unlock_irq (&current->sighand->siglock);

You need that signal processing bit, to dequeue pending signals from our
task. Otherwise, ocfs_wait will trigger the signal_pending condition and
return -EINTR on every call!

> This patch fixes the hang in thread ocfs_submit_thread(). However,
> there is another deadlock on osb->publish_lock between thread
> ocfs_volume_thread() and routine ocfs_journal_set_unmounted(). I am
> tracking this problem now.

I'd like more info on this.
    --Mark

--
Mark Fasheh
Software Developer, Oracle Corp
mark.fasheh@oracle.com
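The point in the reply above, that a pending signal keeps being reported until something dequeues it, can be illustrated with a rough userspace analogy. This is only a sketch, not OCFS or kernel code: the names is_pending() and pending_signal_demo() are made up for illustration, and the signal is kept pending here by blocking it, whereas in a kernel thread it stays pending because nothing delivers it unless the thread dequeues it itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Returns 1 if signo is pending for the caller, 0 if not, -1 on error. */
static int is_pending(int signo)
{
    sigset_t set;

    if (sigemptyset(&set) != 0 || sigpending(&set) != 0)
        return -1;
    return sigismember(&set, signo);
}

/*
 * Userspace analogy only (hypothetical helper, single-threaded use):
 * raise a blocked SIGHUP, poll twice without consuming it, then
 * dequeue it and poll again.
 *   seen[0], seen[1]: pending state on two successive polls
 *   seen[2]:          pending state after the signal was dequeued
 * Returns 0 on success, -1 on error.
 */
int pending_signal_demo(int seen[3])
{
    sigset_t block, old;
    int got = 0;

    assert(seen != NULL);

    sigemptyset(&block);
    sigaddset(&block, SIGHUP);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    raise(SIGHUP);                 /* blocked, so it is not delivered and stays pending */

    seen[0] = is_pending(SIGHUP);  /* first "wait" sees the signal */
    seen[1] = is_pending(SIGHUP);  /* so does the next one: nothing consumed it */

    if (sigwait(&block, &got) != 0)    /* explicit dequeue */
        return -1;
    seen[2] = is_pending(SIGHUP);      /* only now does the condition clear */

    sigprocmask(SIG_SETMASK, &old, NULL);
    return (got == SIGHUP) ? 0 : -1;
}
```

Calling pending_signal_demo() fills seen with the three observations: the first two polls report the signal, and only the poll after the explicit dequeue does not. That mirrors why a wait loop that checks for pending signals but never dequeues them would keep returning early.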
Hi Mark,
Finally, I found that the second hang is caused by starvation when
routine ocfs_journal_set_unmounted() tries to acquire the lock
osb->publish_lock. In thread ocfs_volume_thread(), the number of jiffies
slept in schedule_timeout() between up() and down() is too short.
Routine ocfs_journal_set_unmounted() gets no chance to see that
osb->publish_lock is free between the moment it is released and the
moment it is reacquired by thread ocfs_volume_thread(), so routine
ocfs_journal_set_unmounted() waits in its loop forever. After I changed
the delay from 50 to 500 jiffies, kernel 2.6 no longer hangs when it
reboots after an OCFS volume is mounted.
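One possible reason the same constant behaves differently on the two kernels is that jiffies count timer ticks, so the wall-clock length of "50 jiffies" depends on HZ. The sketch below assumes HZ = 100 for 2.4 and HZ = 1000 for 2.6, which were the usual i386 defaults at the time; the thread itself does not state HZ, and ticks_to_ms() is a made-up helper, not a kernel function.

```c
#include <assert.h>

/*
 * Wall-clock milliseconds covered by `ticks` jiffies when the timer
 * runs at `hz` ticks per second. Hypothetical helper for illustration.
 */
unsigned long ticks_to_ms(unsigned long ticks, unsigned long hz)
{
    assert(hz > 0);
    return ticks * 1000UL / hz;
}
```

Under those assumptions, ticks_to_ms(50, 100) gives 500 ms and ticks_to_ms(50, 1000) gives 50 ms, while ticks_to_ms(500, 1000) brings the sleep back to 500 ms. If the assumed HZ values hold for the systems in question, raising the value to 500 on 2.6 would simply restore the interval the 2.4 code had.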
I also added a line to release the lock on an error branch that jumps
to the label "finally".
This may remove a latent deadlock. In addition, I clear the reference
pointer OcfsIpcCtxt.task before thread ocfs_recv_thread() exits. This
prevents invalid access to the task structure in routine
ocfs_dismount_volume() when rebooting.
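The lock release on the error branch follows a common pattern: any path that leaves the locked region early, including a jump to a cleanup label, has to drop the lock first. The following is a minimal userspace sketch of that pattern, not the nm.c code. struct volume, write_block(), heartbeat_once() and lock_is_free() are hypothetical names, and a C11 atomic_flag stands in for the semaphore.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct volume {
    atomic_flag publish_lock;   /* stands in for osb->publish_lock */
    int writes;                 /* number of successful writes */
};

void volume_init(struct volume *v)
{
    atomic_flag_clear(&v->publish_lock);
    v->writes = 0;
}

static void lock_down(atomic_flag *l) { while (atomic_flag_test_and_set(l)) { } }
static void lock_up(atomic_flag *l)   { atomic_flag_clear(l); }

/* Hypothetical I/O step that fails on request. */
static int write_block(struct volume *v, int fail)
{
    if (fail)
        return -5;              /* illustrative error code */
    v->writes++;
    return 0;
}

/* One pass of a heartbeat-like routine; returns the write status. */
int heartbeat_once(struct volume *v, int fail)
{
    int status;

    assert(v != NULL);
    lock_down(&v->publish_lock);

    status = write_block(v, fail);
    if (status < 0) {
        lock_up(&v->publish_lock);  /* release before leaving early */
        goto finally;
    }

    lock_up(&v->publish_lock);
finally:
    return status;
}

/* Returns 1 if the lock can be taken right now, 0 otherwise. */
int lock_is_free(struct volume *v)
{
    if (atomic_flag_test_and_set(&v->publish_lock))
        return 0;
    atomic_flag_clear(&v->publish_lock);
    return 1;
}
```

Without the lock_up() call on the error branch, a failed write would leave the lock held, and the next caller of lock_down() would spin forever.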
Here is my patch to file nm.c.
-------------------------------------------------------------------
--- ocfs2.old/src/nm.c.old 2004-03-26 15:21:32.000000000 +0800
+++ ocfs2/src/nm.c 2004-03-26 15:21:06.000000000 +0800
@@ -119,6 +119,8 @@
OcfsIpcCtxt.recv_sock = NULL;
}
+ OcfsIpcCtxt.task = NULL;
+
/* signal main thread of ipcdlm's exit */
complete (&(OcfsIpcCtxt.complete));
@@ -227,6 +229,12 @@
//#define OCFS_BH_SEM_PRUNE_LIMIT 60 // prune everything each 30 seconds
#define OCFS_BH_SEM_PRUNE_LIMIT 60000 // 8 hours :)
+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
+#define OCFS_SCHEDULE_TIMEOUT_JIFFIES 500
+#else
+#define OCFS_SCHEDULE_TIMEOUT_JIFFIES 50
+#endif
+
/*
* ocfs_volume_thread()
*
@@ -409,6 +417,7 @@
OCFS_BH_PUT_DATA(bh);
status = ocfs_write_bh(osb, bh, 0, NULL);
if (status < 0) {
+ up(&(osb->publish_lock));
LOG_ERROR_STATUS (status);
goto finally;
}
@@ -425,7 +434,7 @@
goto finally;
}
}
- osb->hbt = 50 + jiffies;
+ osb->hbt = OCFS_SCHEDULE_TIMEOUT_JIFFIES + jiffies;
finally:
status = 0;
@@ -435,7 +444,7 @@
break;
j = jiffies;
if (time_after (j, (unsigned long) (osb->hbt))) {
- osb->hbt = 50 + j;
+ osb->hbt = OCFS_SCHEDULE_TIMEOUT_JIFFIES + j;
}
set_current_state (TASK_INTERRUPTIBLE);
schedule_timeout (osb->hbt - j);