I had an issue where, if the mount failed after the threads had started, the NM thread would crash later on.

This patch makes sure that the NM thread is terminated if the mount fails and the thread has been started.

I didn't really see an easy way to terminate the listener thread :(

John

Index: super.c
===================================================================
--- super.c	(revision 803)
+++ super.c	(working copy)
@@ -868,6 +868,7 @@
 	ocfs_vol_label *vol_label = NULL;
 	int child_pid, i;
 	struct buffer_head *bhs[] = { NULL, NULL };
+	int nm_thread_created = 0;  // keep track if the NM thread has been created
 
 	LOG_ENTRY ();
 
@@ -973,6 +974,7 @@
 	} else {
 		init_completion (&osb->dlm_complete);
 	}
+	nm_thread_created = 1;
 
 	/* Launch the commit thread */
 	osb->commit = (ocfs_commit_task *)ocfs_malloc(sizeof(ocfs_commit_task));
@@ -1080,6 +1082,22 @@
 		OCFS_BH_PUT_DATA(bhs[1]);
 		brelse(bhs[1]);
 	}
+	// If we failed to mount and we created the NM thread then we need to
+	// have it terminate
+	if (status<0 && nm_thread_created)
+	{
+		/* Dismount */
+		OCFS_SET_FLAG (osb->osb_flags, OCFS_OSB_FLAGS_BEING_DISMOUNTED);
+		osb->vol_state = VOLUME_BEING_DISMOUNTED;
+
+		/* Wait for this volume's NM thread to exit */
+		if (osb->dlm_task) {
+			LOG_TRACE_STR ("Waiting for ocfs2nm to exit....");
+			send_sig (SIGINT, osb->dlm_task, 0);
+			wait_for_completion (&(osb->dlm_complete));
+			osb->dlm_task = NULL;
+		}
+	}
 	LOG_EXIT_STATUS (status);
 	return status;
 }				/* ocfs_mount_volume */

On Tue, Mar 23, 2004 at 10:29:55AM -0800, John L. Villalovos wrote:
> I had an issue where if the mount failed after the threads had started that
> the thread would crash later on.

Do you have any more info on the crash? I'd be interested in what the NM thread did to kill itself... Or is this that bug where one of the devices fields on a bh was not set?

> This patch makes sure that the NM thread is terminated if the mount fails
> and the thread has been started.
>
> I didn't really see an easy way to terminate the listener thread :(

Well, it's supposed to be happening in ocfs_dismount_volume, which *does* get called if ocfs_mount_volume returns an error. Why isn't it happening there? Is it because "osb->dlm_task" hasn't been set yet?
	--Mark

--
Mark Fasheh
Software Developer, Oracle Corp
mark.fasheh@oracle.com

Mark Fasheh wrote:
> On Tue, Mar 23, 2004 at 10:29:55AM -0800, John L. Villalovos wrote:
>> I had an issue where if the mount failed after the threads had
>> started that the thread would crash later on.
> Do you have any more info on the crash? I'd be interested in what
> the NM thread did to kill itself... Or is this that bug where
> one of the devices fields on a bh was not set?

Yes, it was the one in the message titled: Crash in ocfs_volume_thread

The NULL pointer is in bh->b_bdev.

>> This patch makes sure that the NM thread is terminated if the mount
>> fails and the thread has been started.
>>
>> I didn't really see an easy way to terminate the listener thread :(
> Well, it's supposed to be happening in ocfs_dismount_volume, which *does*
> get called if ocfs_mount_volume returns an error. Why isn't it happening
> there? Is it because "osb->dlm_task" hasn't been set yet?

I'm not sure. I just noticed that when I did a "ps ax" I saw an ocfs2lsnr (or something like that) running. This was after the mount failed.

John
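
For anyone who wants to experiment with the cleanup pattern discussed in this thread outside the kernel, here is a rough user-space sketch. It is not the OCFS code: it assumes POSIX threads, replaces send_sig() plus wait_for_completion() with an atomic stop flag plus pthread_join(), and every name in it (struct volume, worker_main, mount_volume, dismount_volume, fail_late) is hypothetical. It only illustrates the idea in the patch: remember that the worker was started, and if a later mount step fails, ask it to stop and wait for it before returning.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-volume state; NOT the real ocfs_super. */
struct volume {
	pthread_t   worker;            /* plays the role of osb->dlm_task */
	bool        worker_created;    /* plays the role of nm_thread_created */
	atomic_bool being_dismounted;  /* plays the role of the BEING_DISMOUNTED flag */
	atomic_int  iterations;        /* how many times the worker loop ran */
};

/* Worker loop: runs until the volume is marked as being dismounted. */
static void *worker_main(void *arg)
{
	struct volume *vol = arg;

	while (!atomic_load(&vol->being_dismounted)) {
		atomic_fetch_add(&vol->iterations, 1);
		sched_yield();
	}
	return NULL;
}

/*
 * Returns 0 on success, -1 on failure. The fail_late parameter is an
 * assumption of this sketch: it simulates a mount step that fails after
 * the worker thread is already running.
 */
int mount_volume(struct volume *vol, bool fail_late)
{
	int status = 0;

	assert(vol != NULL);
	vol->worker_created = false;
	atomic_init(&vol->being_dismounted, false);
	atomic_init(&vol->iterations, 0);

	if (pthread_create(&vol->worker, NULL, worker_main, vol) != 0)
		return -1;
	vol->worker_created = true;

	if (fail_late)
		status = -1;

	/* Mount failed after the worker started: stop it and wait for it. */
	if (status < 0 && vol->worker_created) {
		atomic_store(&vol->being_dismounted, true);
		pthread_join(vol->worker, NULL);
		vol->worker_created = false;
	}
	return status;
}

/* Normal teardown: does nothing if no worker was ever recorded. */
void dismount_volume(struct volume *vol)
{
	assert(vol != NULL);
	if (!vol->worker_created)
		return;
	atomic_store(&vol->being_dismounted, true);
	pthread_join(vol->worker, NULL);
	vol->worker_created = false;
}
```

Note that dismount_volume() in this sketch quietly does nothing when no worker has been recorded. That is the user-space analogue of the possibility Mark raises: a teardown path guarded by a handle such as "osb->dlm_task" will skip the thread if that handle was not set at the time of the failure. Whether that is what actually happens in ocfs_dismount_volume is not established in this thread.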