Hi Mark,

I changed your patch a little. I removed the signal processing code,
because a kernel thread doesn't receive signals by default. This patch
fixes the hang in thread ocfs_submit_thread(). But there is another
deadlock on osb->publish_lock between thread ocfs_volume_thread() and
routine ocfs_journal_set_unmounted(). I am tracking this problem now.

Following is the revised patch for journal.c
--------------------------------------------------------------
--- journal.c.old	2004-03-25 10:44:20.000000000 +0800
+++ journal.c	2004-03-25 10:57:53.000000000 +0800
@@ -1765,7 +1765,10 @@
 			LOG_TRACE_STR("FLUSH_EVENT: timed out");
 			break;
 		case -EINTR:
-			finish = 1;
+			/* journal shutdown has asked me to do
+			 * one last commit cache and then exit */
+			if (journal->state == OCFS_JOURNAL_IN_SHUTDOWN)
+				finish = 1;
 			LOG_TRACE_STR("FLUSH_EVENT: interrupted");
 			break;
 		case 0:
@@ -1778,7 +1781,7 @@

 		if ((OcfsGlobalCtxt.flags & OCFS_FLAG_SHUTDOWN_VOL_THREAD) ||
 		    (osb->osb_flags & OCFS_OSB_FLAGS_BEING_DISMOUNTED))
-			break;
+			finish = 1;

 		//if (!osb->needs_flush && status != 0)
 		//	continue;
@@ -1788,18 +1791,13 @@
 		if (down_trylock(&osb->trans_lock) != 0) {
 			LOG_TRACE_ARGS("commit thread: trylock failed, miss=%d\n", misses);
-			if (++misses < OCFS_COMMIT_MISS_MAX)
+			if (++misses < OCFS_COMMIT_MISS_MAX && finish == 0)
 				continue;
 			LOG_TRACE_ARGS("commit thread: about to down\n");
 			down(&osb->trans_lock);
 			misses = 0;
 		}

-		/* journal shutdown has asked me to do one last commit cache */
-		/* this commit cache will leave trans lock held! */
-		if (journal->state == OCFS_JOURNAL_IN_SHUTDOWN)
-			finish = 1;
-
 		status = ocfs_commit_cache (osb, false);
 		if (status < 0)
 			LOG_ERROR_STATUS(status);

On Thu, Mar 25, 2004 at 05:26:13PM +0800, Sonic Zhang wrote:
> Hi Mark,
>
> I change a little to your patch. I remove the signal process code,
> because kernel thread doesn't receive signals by default.

Ours does! If you look in ocfs_daemonize, you'll see the following:

	/* Block all signals except SIGKILL, SIGSTOP, SIGHUP and SIGINT */
#ifdef HAVE_NPTL
	spin_lock_irq (&current->sighand->siglock);
	tmpsig = current->blocked;
	siginitsetinv (&current->blocked, SHUTDOWN_SIGS);
	recalc_sigpending ();
	spin_unlock_irq (&current->sighand->siglock);

You need that signal processing bit to dequeue pending signals from our
task. Otherwise, ocfs_wait will trigger the signal_pending condition and
return -EINTR on every call!

> This patch
> fixes the halt problem in thread ocfs_submit_thread(). But, there is
> another dead lock on osb->publish_lock in thread ocfs_volume_thread()
> and routine ocfs_journal_set_unmounted(). I am tracking this problem now.

I'd like more info on this.
	--Mark

--
Mark Fasheh
Software Developer, Oracle Corp
mark.fasheh@oracle.com

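The point about dequeuing can be illustrated outside the kernel. The following is only a user-space analogy, not OCFS code: it assumes a POSIX system, uses SIGUSR1 as a stand-in for the shutdown signals, and the helper names (demo_sig_pending, demo_count_pending_polls) are made up for this sketch. A signal that is delivered but never dequeued is reported as pending on every check, which is the same shape as a wait routine returning -EINTR on every call; once it is dequeued, the condition clears.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: 1 if SIGUSR1 is pending for the caller,
 * 0 if it is not, -1 on error. */
static int demo_sig_pending(void)
{
	sigset_t pending;

	if (sigpending(&pending) != 0)
		return -1;
	return sigismember(&pending, SIGUSR1);
}

/* Hypothetical helper: block SIGUSR1, raise it once, then poll `polls`
 * times without dequeuing. Returns how many polls saw the signal still
 * pending (or -1 on error). Afterwards the signal is dequeued with
 * sigwait() and *after receives the pending state seen once it is gone. */
static int demo_count_pending_polls(int polls, int *after)
{
	sigset_t set, old;
	int i, sig, seen = 0;

	sigemptyset(&set);
	sigaddset(&set, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
		return -1;

	/* The signal is blocked, so it is queued rather than delivered. */
	if (raise(SIGUSR1) != 0)
		return -1;

	/* Nothing dequeues the signal here, so every poll sees it. */
	for (i = 0; i < polls; i++)
		if (demo_sig_pending() == 1)
			seen++;

	/* Dequeue the signal; only now does the pending state clear. */
	if (sigwait(&set, &sig) != 0)
		return -1;
	*after = demo_sig_pending();

	/* Restore the caller's original signal mask. */
	sigprocmask(SIG_SETMASK, &old, NULL);
	return seen;
}
```

Calling demo_count_pending_polls(5, &after) is expected to report the signal as pending on all five polls and as not pending after the sigwait() call. In the kernel thread the equivalent step is the signal processing code in the commit loop, which this sketch does not attempt to reproduce.
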
Hi Mark,

Finally, I found that the second hang is caused by starvation when
routine ocfs_journal_set_unmounted() tries to acquire the lock
osb->publish_lock. In thread ocfs_volume_thread(), the delta jiffies to
sleep in schedule_timeout() between up() and down() is too short.
Routine ocfs_journal_set_unmounted() has no chance to see that
osb->publish_lock is free between its release and reacquisition by
thread ocfs_volume_thread(), so it waits in its loop forever.

After I changed the delta jiffies from 50 to 500, kernel 2.6 no longer
hangs when it reboots after an OCFS volume has been mounted.

I also added a line to release the lock on one branch to the label
"finally". This may remove a latent deadlock.

In addition, I clear the reference pointer OcfsIpcCtxt.task before
thread ocfs_recv_thread() exits. This prevents invalid access to the
task structure in routine ocfs_dismount_volume() when rebooting.

Here is my patch to file nm.c.
-------------------------------------------------------------------
--- ocfs2.old/src/nm.c.old	2004-03-26 15:21:32.000000000 +0800
+++ ocfs2/src/nm.c	2004-03-26 15:21:06.000000000 +0800
@@ -119,6 +119,8 @@
 		OcfsIpcCtxt.recv_sock = NULL;
 	}

+	OcfsIpcCtxt.task = NULL;
+
 	/* signal main thread of ipcdlm's exit */
 	complete (&(OcfsIpcCtxt.complete));

@@ -227,6 +229,12 @@
 //#define OCFS_BH_SEM_PRUNE_LIMIT 60 // prune everything each 30 seconds
 #define OCFS_BH_SEM_PRUNE_LIMIT 60000 // 8 hours :)

+#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
+#define OCFS_SCHEDULE_TIMEOUT_JIFFIES 500
+#else
+#define OCFS_SCHEDULE_TIMEOUT_JIFFIES 50
+#endif
+
 /*
  * ocfs_volume_thread()
  *
@@ -409,6 +417,7 @@
 			OCFS_BH_PUT_DATA(bh);
 			status = ocfs_write_bh(osb, bh, 0, NULL);
 			if (status < 0) {
+				up(&(osb->publish_lock));
 				LOG_ERROR_STATUS (status);
 				goto finally;
 			}
@@ -425,7 +434,7 @@
 				goto finally;
 			}
 		}
-		osb->hbt = 50 + jiffies;
+		osb->hbt = OCFS_SCHEDULE_TIMEOUT_JIFFIES + jiffies;

 finally:
 		status = 0;
@@ -435,7 +444,7 @@
 			break;
 		j = jiffies;
 		if (time_after (j, (unsigned long) (osb->hbt))) {
-			osb->hbt = 50 + j;
+			osb->hbt = OCFS_SCHEDULE_TIMEOUT_JIFFIES + j;
 		}
 		set_current_state (TASK_INTERRUPTIBLE);
 		schedule_timeout (osb->hbt - j);
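
To make the timing in the last two hunks concrete, here is a small user-space sketch of how the sleep length passed to schedule_timeout() follows from osb->hbt and the interval constant. It is an illustration only: struct demo_osb, demo_time_after() and demo_sleep_ticks() are hypothetical names, the tick values are plain numbers rather than real jiffies, and the comparison macro merely imitates the wrap-safe idea behind the kernel's time_after().

```c
/* Wrap-safe "a is later than b" test on unsigned tick counters,
 * written in the spirit of the kernel's time_after() macro. */
#define demo_time_after(a, b) ((long)((b) - (a)) < 0)

/* Hypothetical stand-in holding only the heartbeat deadline field. */
struct demo_osb {
	unsigned long hbt;
};

/* Mirrors the tail of the volume thread loop in the patch: if the
 * deadline has already passed, push it `interval` ticks past `j`;
 * then return how many ticks the thread would sleep for. */
static unsigned long demo_sleep_ticks(struct demo_osb *osb,
				      unsigned long j,
				      unsigned long interval)
{
	if (demo_time_after(j, osb->hbt))
		osb->hbt = interval + j;
	return osb->hbt - j;
}
```

With a deadline that has just expired, an interval of 50 gives a 50-tick sleep and an interval of 500 gives a 500-tick sleep, so the constant directly sets the longest time the thread stays asleep between up() and down(). If the deadline is still in the future, only the remaining ticks are slept, whatever the interval.
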