Ling, Xiaofeng
2004-Aug-23 03:05 UTC
[Ocfs2-devel] [PATCH] fix big 116 -- umount cause crash after some operation
before umount, all the lock shall be released first. this patch can resolve the problem ------------------------------------------------ Index: super.c ==================================================================--- super.c (revision 1370) +++ super.c (working copy) @@ -224,6 +224,7 @@ tid_t target; sb->s_dirt = 0; + ocfs_commit_cache(OCFS2_SB(sb)); target = log_start_commit(OCFS2_SB(sb)->journal->k_journal, NULL); log_wait_commit(OCFS2_SB(sb)->journal->k_journal, target); return 0; @@ -234,6 +235,7 @@ tid_t target; sb->s_dirt = 0; + ocfs_commit_cache(OCFS2_SB(sb)); if (journal_start_commit(OCFS2_SB(sb)->journal->k_journal, &target)) { if (wait) log_wait_commit(OCFS2_SB(sb)->journal->k_journal, Index: journal.c ==================================================================--- journal.c (revision 1370) +++ journal.c (working copy) @@ -61,7 +61,7 @@ struct inode *inode); static int ocfs_recover_node(struct _ocfs_super *osb, int node_num); static int __ocfs_recovery_thread(void *arg); -static int ocfs_commit_cache (ocfs_super * osb); +int ocfs_commit_cache (ocfs_super * osb); static int ocfs_wait_on_mount(ocfs_super *osb); static void ocfs_handle_move_locks(ocfs_journal *journal, ocfs_journal_handle *handle); @@ -149,7 +149,7 @@ * This is in journal.c for lack of a better place. * */ -static int ocfs_commit_cache(ocfs_super *osb) +int ocfs_commit_cache(ocfs_super *osb) { int status = 0, tmpstat; ocfs_journal * journal = NULL; ------------------- Ling Xiaofeng(Daniel) Intel China Software Lab. iNet: 8-752-1243 8621-52574545-1243(O) xfling@users.sourceforge.net Opinions are my own and don't represent those of my employer
Mark Fasheh
2004-Aug-23 14:16 UTC
[Ocfs2-devel] [PATCH] fix big 116 -- umount cause crash after some operation
On Mon, Aug 23, 2004 at 04:05:57PM +0800, Ling, Xiaofeng wrote:> before umount, all the lock shall be released first. > this patch can resolve the problemThis isn't the way we want to handle this. The way it's supposed to work is that the umount process sends a signal to the commit thread to shutdown and then waits on it's last set of checkpoints / releases. I'd much rather fix what's broken than patch over it :) I'm actually looking at this bug right now and I think it's due to our signal handling code... --Mark> > ------------------------------------------------ > Index: super.c > ==================================================================> --- super.c (revision 1370) > +++ super.c (working copy) > @@ -224,6 +224,7 @@ > tid_t target; > > sb->s_dirt = 0; > + ocfs_commit_cache(OCFS2_SB(sb)); > target = log_start_commit(OCFS2_SB(sb)->journal->k_journal, NULL); > log_wait_commit(OCFS2_SB(sb)->journal->k_journal, target); > return 0; > @@ -234,6 +235,7 @@ > tid_t target; > > sb->s_dirt = 0; > + ocfs_commit_cache(OCFS2_SB(sb)); > if (journal_start_commit(OCFS2_SB(sb)->journal->k_journal, &target)) > { > if (wait) > log_wait_commit(OCFS2_SB(sb)->journal->k_journal, > Index: journal.c > ==================================================================> --- journal.c (revision 1370) > +++ journal.c (working copy) > @@ -61,7 +61,7 @@ > struct inode *inode); > static int ocfs_recover_node(struct _ocfs_super *osb, int node_num); > static int __ocfs_recovery_thread(void *arg); > -static int ocfs_commit_cache (ocfs_super * osb); > +int ocfs_commit_cache (ocfs_super * osb); > static int ocfs_wait_on_mount(ocfs_super *osb); > static void ocfs_handle_move_locks(ocfs_journal *journal, > ocfs_journal_handle *handle); > @@ -149,7 +149,7 @@ > * This is in journal.c for lack of a better place. > * > */ > -static int ocfs_commit_cache(ocfs_super *osb) > +int ocfs_commit_cache(ocfs_super *osb) > { > int status = 0, tmpstat; > ocfs_journal * journal = NULL; > > > ------------------- > Ling Xiaofeng(Daniel) > > Intel China Software Lab. > iNet: 8-752-1243 > 8621-52574545-1243(O) > > xfling@users.sourceforge.net > Opinions are my own and don't represent those of my employer > _______________________________________________ > Ocfs2-devel mailing list > Ocfs2-devel@oss.oracle.com > http://oss.oracle.com/mailman/listinfo/ocfs2-devel-- Mark Fasheh Software Developer, Oracle Corp mark.fasheh@oracle.com
Ling, Xiaofeng
2004-Aug-23 22:42 UTC
[Ocfs2-devel] [PATCH] fix big 116 -- umount cause crash after some operation
The current process is=20 ocfs_clear_inode-> ocfs_dismount_volumn-> ocfs_journal_shutdown-> wake_up(&osb->flush_event) although it seems this can release the lock. but the bug still exist. The bug is triggered in submit_bh,=20 in kernel, the process is do_umount-> fsync_super-> ocfs_sync_fs, sync_blockdev, sync_inodes_sb-> ocfs_clear_inode So I guess the bug is triggered in sync_blockdev before ocfs_clear_inode then put that in ocfs_clear_inode is too later. I've tried to move ocfs_dismount_volumn or ocfs_journal_shutdown to ocfs_sync_fs, but that will cause other problems. and if just put ocfs_commit_cache in it, it's ok. If you don't want an extra call to ocfs_commit_cache, we need to find a better place to put ocfs_dismount_volumn. =20>-----Original Message----- >From: Mark Fasheh [mailto:mark.fasheh@oracle.com]=20 >Sent: 2004=C4=EA8=D4=C224=C8=D5 3:17 >To: Ling, Xiaofeng >Cc: ocfs2-devel@oss.oracle.com >Subject: Re: [Ocfs2-devel] [PATCH] fix big 116 -- umount cause=20 >crash after some operation > >On Mon, Aug 23, 2004 at 04:05:57PM +0800, Ling, Xiaofeng wrote: >> before umount, all the lock shall be released first. >> this patch can resolve the problem >This isn't the way we want to handle this. The way it's=20 >supposed to work is >that the umount process sends a signal to the commit thread to=20 >shutdown and >then waits on it's last set of checkpoints / releases. I'd=20 >much rather fix >what's broken than patch over it :) > >I'm actually looking at this bug right now and I think it's due to our >signal handling code... > --Mark > >>=20 >> ------------------------------------------------ >> Index: super.c >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> --- super.c (revision 1370) >> +++ super.c (working copy) >> @@ -224,6 +224,7 @@ >> tid_t target; >>=20 >> sb->s_dirt =3D 0; >> + ocfs_commit_cache(OCFS2_SB(sb)); >> target =3D=20 >log_start_commit(OCFS2_SB(sb)->journal->k_journal, NULL); >> log_wait_commit(OCFS2_SB(sb)->journal->k_journal, target); >> return 0; >> @@ -234,6 +235,7 @@ >> tid_t target; >>=20 >> sb->s_dirt =3D 0; >> + ocfs_commit_cache(OCFS2_SB(sb)); >> if=20 >(journal_start_commit(OCFS2_SB(sb)->journal->k_journal, &target)) >> { >> if (wait) >> log_wait_commit(OCFS2_SB(sb)->journal->k_journal, >> Index: journal.c >> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >> --- journal.c (revision 1370) >> +++ journal.c (working copy) >> @@ -61,7 +61,7 @@ >> struct inode *inode); >> static int ocfs_recover_node(struct _ocfs_super *osb, int node_num); >> static int __ocfs_recovery_thread(void *arg); >> -static int ocfs_commit_cache (ocfs_super * osb); >> +int ocfs_commit_cache (ocfs_super * osb); >> static int ocfs_wait_on_mount(ocfs_super *osb); >> static void ocfs_handle_move_locks(ocfs_journal *journal, >> ocfs_journal_handle *handle); >> @@ -149,7 +149,7 @@ >> * This is in journal.c for lack of a better place. >> * >> */ >> -static int ocfs_commit_cache(ocfs_super *osb) >> +int ocfs_commit_cache(ocfs_super *osb) >> { >> int status =3D 0, tmpstat; >> ocfs_journal * journal =3D NULL; >>=20 >>=20 >> ------------------- >> Ling Xiaofeng(Daniel) >>=20 >> Intel China Software Lab. >> iNet: 8-752-1243 >> 8621-52574545-1243(O) >>=20 >> xfling@users.sourceforge.net >> Opinions are my own and don't represent those of my employer=20 >> _______________________________________________ >> Ocfs2-devel mailing list >> Ocfs2-devel@oss.oracle.com >> http://oss.oracle.com/mailman/listinfo/ocfs2-devel >-- >Mark Fasheh >Software Developer, Oracle Corp >mark.fasheh@oracle.com >
Ling, Xiaofeng
2004-Aug-24 20:04 UTC
[Ocfs2-devel] [PATCH] fix big 116 -- umount cause crash after some operation
>Do you have a kernel stack trace? I want to make sure we're >talking about >the same bug (s) here. The 1st one I dealt with was only that >commit thread >was trying to release the locks with a signal pending which >caused the code >in dlm.c to freak out. I removed signal handling from it >altogether and just >use our ocfs_wait function instead. This seemed to alleviate >that problem. >Issue #2 I see during umount is that occasionally the ocfs2 nm >thread will >crash in __make_request:So what we see is not the same. The bug 116 is easy to reproduce, just do a command line as root#mkdir /ocfs/aaa;umount /ocfs follow the umount to mkdir tightly. =========================================== call dismount now (4577) ENTRY: ocfs_put_super(0xdb40ba00) (4577) TRACE: ocfs_put_super() put super... do nothing! DONE!!!! (4577) EXIT : ocfs_put_super() VFS: Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day... ------------[ cut here ]------------ kernel BUG at fs/buffer.c:2703! invalid operand: 0000 [#2] CPU: 0 EIP: 0060:[<c014bb80>] Not tainted EFLAGS: 00010202 (2.6.6kdb) EIP is at submit_bh+0x153/0x16d eax: 00000010 ebx: 00000008 ecx: d5811ea0 edx: d5811ea0 esi: d5811ea0 edi: 00001000 ebp: 00000000 esp: d75cbe48 ds: 007b es: 007b ss: 0068 Process ocfs2nm-0 (pid: 4572, threadinfo=d75ca000 task=dfb21230) Stack: c0373a72 d75cbe9c 00000046 df45dc00 df45dc00 d5811820 00000000 00000008 d5811ea0 00001000 00000000 e09dc3e6 00000000 d5811ea0 000000d8 3f5f0e04 00003c26 dfb213d8 03ecda38 d75cbeb0 d75ca000 d75ca000 d75ca000 d75ca000 Call Trace: [<c0373a72>] schedule+0x30e/0x50c [<e09dc3e6>] ocfs_read_bhs+0x6e3/0x849 [ocfs2] [<e09f5c16>] ocfs_volume_thread+0x2a6/0x75c [ocfs2] [<c0118f21>] exit_notify+0x2b0/0x772 [<c01196a8>] do_exit+0x2c5/0x34d [<c01197cf>] do_group_exit+0x52/0x72 [<c0105d4a>] ret_from_fork+0x6/0x14 [<e09f5970>] ocfs_volume_thread+0x0/0x75c [ocfs2] [<e09f5970>] ocfs_volume_thread+0x0/0x75c [ocfs2] [<c0104291>] kernel_thread_helper+0x5/0xb Code: 0f 0b 8f 0a 99 6d 39 c0 e9 cb fe ff ff 0f 0b 8e 0a 99 6d 39 ds: 007b es: 007b ss: 0068 Process ocfs2cmt-0 (pid: 4575, threadinfo=dd0a4000 task=df2c01b0) Stack: 00000000 dd0a5d94 00000282 40016863 00000000 d8064d80 d941a924 db8cb000 d3044888 0000010d d8064d80 c014bc7a 00000001 d3044888 db8cb000 dfbcd354 db8cb000 dfbcd354 c018fd2f d3044888 d941a924 c018d92e d3044888 0000000b Call Trace: [<c014bc7a>] sync_dirty_buffer+0x5c/0xaf [<c018fd2f>] journal_update_superblock+0x60/0xa1 [<c018d92e>] __try_to_free_cp_buf+0x46/0x4d [<c018de05>] cleanup_journal_tail+0x7a/0xe1 [<c018dcec>] log_do_checkpoint+0xd6/0x175 [<c0131ba6>] buffered_rmqueue+0xc4/0x15c [<c0131cdc>] __alloc_pages+0x9e/0x316 [<c013ab73>] do_anonymous_page+0xdc/0x190 [<c013ac8c>] do_no_page+0x65/0x2d9 [<c018a605>] journal_lock_updates+0xbd/0xcf [<c0115671>] autoremove_wake_function+0x0/0x4f [<c01904f5>] journal_flush+0x1c3/0x1f0 [<c0373a72>] schedule+0x30e/0x50c [<e09ea42f>] ocfs_commit_cache+0x1ab/0x51b [ocfs2] [<c01188de>] daemonize+0xd5/0xda [<e09eef8b>] ocfs_commit_thread+0x117/0x46f [ocfs2] [<c0114708>] default_wake_function+0x0/0x26 [<c0114708>] default_wake_function+0x0/0x26 [<c0114708>] default_wake_function+0x0/0x26 [<c0114708>] default_wake_function+0x0/0x26 [<e0a002ed>] ocfs_timeout_func+0x0/0x1e [ocfs2] [<c0118f21>] exit_notify+0x2b0/0x772 [<c0373796>] schedule+0x32/0x50c [<c01196a8>] do_exit+0x2c5/0x34d [<c0105e8e>] work_resched+0x5/0x16 [<e09eee74>] ocfs_commit_thread+0x0/0x46f [ocfs2] [<e09eee74>] ocfs_commit_thread+0x0/0x46f [ocfs2] [<c0104291>] kernel_thread_helper+0x5/0xb Code: 0f 0b 8f 0a 99 6d 39 c0 e9 cb fe ff ff 0f 0b 8e 0a 99 6d 39 ------------[ cut here ]------------ kernel BUG at fs/buffer.c:2703! invalid operand: 0000 [#3] CPU: 0 EIP: 0060:[<c014bb80>] Not tainted EFLAGS: 00010202 (2.6.6kdb) EIP is at submit_bh+0x153/0x16d eax: 00000010 ebx: db8cb000 ecx: d3044888 edx: d3044888 esi: d3044888 edi: 0000010d ebp: d8064d80 esp: dd0a5c4c
Ling, Xiaofeng
2004-Aug-26 01:43 UTC
[Ocfs2-devel] [PATCH] fix big 116 -- umount cause crash after some operation
Ok. seems put ocfs_dismount_volume in put_super can resolve half of the problem. But there still have busy inode when rmdir and umount at once. see message: "VFS:Busy inodes after unmount. Self-destruct in 5 seconds. Have a nice day..." rmmod and re-insert will cause segmentation fault due the slab problem. for mkdir, it may also happen, but not every time.>-----Original Message----- >From: Mark Fasheh [mailto:mark.fasheh@oracle.com]=20 >Sent: 2004=C4=EA8=D4=C226=C8=D5 1:40 >To: Ling, Xiaofeng >Cc: ocfs2-devel@oss.oracle.com >Subject: Re: [Ocfs2-devel] [PATCH] fix big 116 -- umount cause=20 >crash after some operation > >On Wed, Aug 25, 2004 at 09:04:39AM +0800, Ling, Xiaofeng wrote: >> So what we see is not the same.=20 >> The bug 116 is easy to reproduce, just do a command line as >> root#mkdir /ocfs/aaa;umount /ocfs >> follow the umount to mkdir tightly. >Ok, I think I see what's going on here. > >if I do this: > >mkdir /ocfs2/foobar/baz; umount /ocfs2 > >assuming that foobar was created a while ago so the commit=20 >thread has no >work currently then the umount goes through fine, we wait on the commit >thread and he flushes everything. > >however, > >mkdir /ocfs/foo; umount /ocfs2 > >will crash things almost immediately. What's the only difference there? >Basically that the root inode is in the commit thread on the=20 >second case. > >What I believe is happening is that umount succeeds without calling >clear_inode on the root inode because we still have a=20 >reference on it, so >none of the actual work for umounting has been done, yet umount exits. >That'd explain the busy inodes message we always get and the=20 >random crashes >in one or the other ocfs2 thread. > >Can you try this patch and see if it reproduces? Running with=20 >this, I can't >get it to crash any more :) We did dismount from clear_inode=20 >for historical >reasons having to do with some data structures we used to alloc on the >inodes. We no longer have those issues due to our sane=20 >handling of inode >private nowadays, so I think it's fine to actually umount from=20 >our put_super >method like other file systems :) > --Mark > >-- >Mark Fasheh >Software Developer, Oracle Corp >mark.fasheh@oracle.com > >Index: super.c >=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >--- super.c (revision 1383) >+++ super.c (working copy) >@@ -712,7 +712,8 @@ static void ocfs_put_super (struct super > LOG_ENTRY_ARGS ("(0x%p)\n", sb); >=20 > ocfs_sync_blockdev(sb); >- LOG_TRACE_STR ("put super... do nothing! DONE!!!!"); >+ ocfs_dismount_volume (sb); >+ > LOG_EXIT (); >=20 > LOG_CLEAR_CONTEXT(); >Index: inode.c >=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D >--- inode.c (revision 1383) >+++ inode.c (working copy) >@@ -757,15 +757,6 @@ void ocfs_clear_inode (struct inode *ino > ocfs_extent_map_destroy (&OCFS_I(inode)->ip_ext_map); > ocfs_extent_map_init (&OCFS_I(inode)->ip_ext_map); >=20 >- if (inode =3D=3D osb->root_inode) { >- LOG_TRACE_STR("this is the root inode, doing=20 >cleanup now!"); >- ocfs_sync_blockdev(inode->i_sb); >- LOG_TRACE_STR ("syncing past root inode"); >- LOG_TRACE_STR ("calling dismount"); >- ocfs_dismount_volume (inode->i_sb); >- goto bail; >- } >- > down(&recovery_list_sem); > list_del(&OCFS_I(inode)->ip_recovery_list); > up(&recovery_list_sem); >