Hi.

FreeBSD's WITNESS mechanism for detecting lock order reversals reports
a LOR here:

lock order reversal:
 1st 0xc3f7738c zfs:dbuf (zfs:dbuf) @ /zoo/pjd/zfstest/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c:410
 2nd 0xc3fefcc0 zfs:zn (zfs:zn) @ /zoo/pjd/zfstest/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:68
KDB: stack backtrace:
db_trace_self_wrapper(c05ee0fe) at db_trace_self_wrapper+0x25
kdb_backtrace(0,ffffffff,c065f510,c065f358,c061e82c,...) at kdb_backtrace+0x29
witness_checkorder(c3fefcc0,9,c3c44a6b,44) at witness_checkorder+0x586
_sx_xlock(c3fefcc0,c3c44a6b,44,0,c3fefca8,...) at _sx_xlock+0x6e
znode_pageout_func(c3f77350,c3fefca8,e6172a08,c3bd8005,c3f77350,...) at znode_pageout_func+0x10a
dbuf_evict_user(c3f77350,0,c3f77848,c3f77848,c3f59c94,...) at dbuf_evict_user+0x4e
dbuf_clear(c3f77350,c3bea63b,e6172b18,c3bea64c,c3f77350,...) at dbuf_clear+0x31
dbuf_evict(c3f77350,c0650f70,0,c05ed336,2ab,...) at dbuf_evict+0xe
dnode_evict_dbufs(c3f59c94,0,2,0,0,...) at dnode_evict_dbufs+0x24c
dnode_sync_free(c3f59c94,c4075900,c3c1c873,0,c37ef000,...) at dnode_sync_free+0xf5
dnode_sync(c3f59c94,0,c3f45000,c4075900,0,...) at dnode_sync+0x36a
dmu_objset_sync_dnodes(c3b47000,c3b4713c,c4075900,21c,c375b54c,...) at dmu_objset_sync_dnodes+0x81
dmu_objset_sync(c3b47000,c4075900,e6172c50,c3bf1b2c,c397ac00,...) at dmu_objset_sync+0x50
dsl_dataset_sync(c397ac00,c4075900,0,c397ac00,e6172c54,...) at dsl_dataset_sync+0x14
dsl_pool_sync(c375b400,6,0,c061e82c,c375b4ac,...) at dsl_pool_sync+0x78
spa_sync(c37ef000,6,0,7,0,...) at spa_sync+0x2a2
txg_sync_thread(c375b400,e6172d38) at txg_sync_thread+0x1df
fork_exit(c3c037e8,c375b400,e6172d38) at fork_exit+0xac
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe6172d6c, ebp = 0 ---

Basically, it first recorded that db_mtx is locked before z_lock, and
the backtrace above is from where it is locked in a different order.

I think it is harmless, because the znode is not visible at this point
and can't be referenced, which means a deadlock is not possible here,
right?

If I'm right, sorry for the noise, I just wanted to be 100% sure.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
Pawel Jakub Dawidek wrote:
> Hi.
>
> FreeBSD's WITNESS mechanism for detecting lock order reversals reports
> a LOR here:
>
> lock order reversal:
>  1st 0xc3f7738c zfs:dbuf (zfs:dbuf) @ /zoo/pjd/zfstest/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c:410
>  2nd 0xc3fefcc0 zfs:zn (zfs:zn) @ /zoo/pjd/zfstest/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:68
> KDB: stack backtrace:
> db_trace_self_wrapper(c05ee0fe) at db_trace_self_wrapper+0x25
> kdb_backtrace(0,ffffffff,c065f510,c065f358,c061e82c,...) at kdb_backtrace+0x29
> witness_checkorder(c3fefcc0,9,c3c44a6b,44) at witness_checkorder+0x586
> _sx_xlock(c3fefcc0,c3c44a6b,44,0,c3fefca8,...) at _sx_xlock+0x6e
> znode_pageout_func(c3f77350,c3fefca8,e6172a08,c3bd8005,c3f77350,...) at znode_pageout_func+0x10a
> dbuf_evict_user(c3f77350,0,c3f77848,c3f77848,c3f59c94,...) at dbuf_evict_user+0x4e
> dbuf_clear(c3f77350,c3bea63b,e6172b18,c3bea64c,c3f77350,...) at dbuf_clear+0x31
> dbuf_evict(c3f77350,c0650f70,0,c05ed336,2ab,...) at dbuf_evict+0xe
> dnode_evict_dbufs(c3f59c94,0,2,0,0,...) at dnode_evict_dbufs+0x24c
> dnode_sync_free(c3f59c94,c4075900,c3c1c873,0,c37ef000,...) at dnode_sync_free+0xf5
> dnode_sync(c3f59c94,0,c3f45000,c4075900,0,...) at dnode_sync+0x36a
> dmu_objset_sync_dnodes(c3b47000,c3b4713c,c4075900,21c,c375b54c,...) at dmu_objset_sync_dnodes+0x81
> dmu_objset_sync(c3b47000,c4075900,e6172c50,c3bf1b2c,c397ac00,...) at dmu_objset_sync+0x50
> dsl_dataset_sync(c397ac00,c4075900,0,c397ac00,e6172c54,...) at dsl_dataset_sync+0x14
> dsl_pool_sync(c375b400,6,0,c061e82c,c375b4ac,...) at dsl_pool_sync+0x78
> spa_sync(c37ef000,6,0,7,0,...) at spa_sync+0x2a2
> txg_sync_thread(c375b400,e6172d38) at txg_sync_thread+0x1df
> fork_exit(c3c037e8,c375b400,e6172d38) at fork_exit+0xac
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xe6172d6c, ebp = 0 ---
>
> Basically, it first recorded that db_mtx is locked before z_lock, and
> the backtrace above is from where it is locked in a different order.

Um, I think you have that backwards: normally we obtain the z_lock
first, then obtain the db_mtx; but in the above stack, we are obtaining
the z_lock *after* we hold the db_mtx.

> I think it is harmless, because the znode is not visible at this point
> and can't be referenced, which means a deadlock is not possible here,
> right?

Yea, I think this is OK. The z_lock is being used here to prevent a
race with the force-unmount code in zfs_inactive(). In that code path
we are not going to try to obtain the db_mtx while holding the z_lock.

> If I'm right, sorry for the noise, I just wanted to be 100% sure.

No problem. Thanks for bringing this to our attention.

-Mark
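To make the "not visible, so no deadlock" argument concrete, here is a
minimal userland sketch in plain C with pthreads. It is not ZFS code;
all of the names (node, buf, normal_path, teardown_path) are invented
for the illustration, with `node' standing in for the znode and its
z_lock and `buf' for the dbuf and its db_mtx.

#include <pthread.h>

struct buf  { pthread_mutex_t lock; };	/* plays the dbuf/db_mtx role  */
struct node { pthread_mutex_t lock; };	/* plays the znode/z_lock role */

/* The order every other thread uses: node->lock before buf->lock. */
static void
normal_path(struct node *n, struct buf *b)
{
	pthread_mutex_lock(&n->lock);
	pthread_mutex_lock(&b->lock);
	/* ... normal work ... */
	pthread_mutex_unlock(&b->lock);
	pthread_mutex_unlock(&n->lock);
}

/*
 * Teardown path: buf->lock is taken first and node->lock second, i.e.
 * the reverse order.  A checker reports this, but a deadlock would
 * require some other thread to hold node->lock while waiting for
 * buf->lock, and that cannot happen if `n' is no longer reachable by
 * any other thread -- which is exactly the argument for the znode
 * above.
 */
static void
teardown_path(struct node *n, struct buf *b)
{
	pthread_mutex_lock(&b->lock);
	pthread_mutex_lock(&n->lock);	/* reported as a LOR, but harmless */
	/* ... final cleanup of n ... */
	pthread_mutex_unlock(&n->lock);
	pthread_mutex_unlock(&b->lock);
}

int
main(void)
{
	struct node n;
	struct buf b;

	pthread_mutex_init(&n.lock, NULL);
	pthread_mutex_init(&b.lock, NULL);
	normal_path(&n, &b);	/* the order a checker would record first  */
	teardown_path(&n, &b);	/* the order it would then complain about */
	return (0);
}

The checker has no way of knowing that `n' is private to the current
thread at that point, which is why the report shows up even though the
locking is safe.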
On Wed, Nov 22, 2006 at 08:35:44AM -0700, Mark Maybee wrote:
> Pawel Jakub Dawidek wrote:
[...]
> > Basically, it first recorded that db_mtx is locked before z_lock, and
> > the backtrace above is from where it is locked in a different order.
>
> Um, I think you have that backwards: normally we obtain the z_lock
> first, then obtain the db_mtx; but in the above stack, we are obtaining
> the z_lock *after* we hold the db_mtx.

Grr, yes, you are right, sorry.

> > I think it is harmless, because the znode is not visible at this point
> > and can't be referenced, which means a deadlock is not possible here,
> > right?
>
> Yea, I think this is OK. The z_lock is being used here to prevent a
> race with the force-unmount code in zfs_inactive(). In that code path
> we are not going to try to obtain the db_mtx while holding the z_lock.

I had another one, can you analyze it?

lock order reversal:
 1st 0xc44b9b00 zfs:dbuf (zfs:dbuf) @ /zoo/pjd/zfstest/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1644
 2nd 0xc45be898 zfs:dbufs (zfs:dbufs) @ /zoo/pjd/zfstest/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c:357
KDB: stack backtrace:
db_trace_self_wrapper(c05ee0fe) at db_trace_self_wrapper+0x25
kdb_backtrace(0,ffffffff,c065f538,c065f5b0,c061e82c,...) at kdb_backtrace+0x29
witness_checkorder(c45be898,9,c4113f3c,165) at witness_checkorder+0x586
_sx_xlock(c45be898,c4113f3c,165,c41249e0,0,...) at _sx_xlock+0x6e
dnode_evict_dbufs(c45be730,0,c4406b40,0,c45be730,...) at dnode_evict_dbufs+0x39
dmu_objset_evict_dbufs(e6258bd0,0,4,c397c000,c40d9918,...) at dmu_objset_evict_dbufs+0x131
dmu_objset_evict(c397c200,c397c000,c397c600,c397c200,e6258c04,...) at dmu_objset_evict+0xb1
dsl_dataset_evict(c44b9ac4,c397c200,e6258c1c,c40aed02,c44b9ac4,...) at dsl_dataset_evict+0x4e
dbuf_evict_user(c44b9ac4,c40ab3ee,1,0,e6258c2c,...) at dbuf_evict_user+0x4e
dbuf_rele(c44b9ac4,c397c200,e6258c50,c40c7c5c,c397c200,...) at dbuf_rele+0x72
dsl_dataset_sync(c397c200,c3bfea80,0,c397c200,e6258c54,...) at dsl_dataset_sync+0x44
dsl_pool_sync(c397c600,148,0,c061e82c,c397c6ac,...) at dsl_pool_sync+0x78
spa_sync(c37ef000,148,0,149,0,...) at spa_sync+0x2a2
txg_sync_thread(c397c600,e6258d38) at txg_sync_thread+0x1df
fork_exit(c40d9918,c397c600,e6258d38) at fork_exit+0xac
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe6258d6c, ebp = 0 ---

The backtrace is from where the lock order differs from the one first
recorded.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd at FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
Pawel Jakub Dawidek wrote:
> I had another one, can you analyze it?
>
> lock order reversal:
>  1st 0xc44b9b00 zfs:dbuf (zfs:dbuf) @ /zoo/pjd/zfstest/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1644
>  2nd 0xc45be898 zfs:dbufs (zfs:dbufs) @ /zoo/pjd/zfstest/sys/modules/zfs/../../contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c:357
> KDB: stack backtrace:
> db_trace_self_wrapper(c05ee0fe) at db_trace_self_wrapper+0x25
> kdb_backtrace(0,ffffffff,c065f538,c065f5b0,c061e82c,...) at kdb_backtrace+0x29
> witness_checkorder(c45be898,9,c4113f3c,165) at witness_checkorder+0x586
> _sx_xlock(c45be898,c4113f3c,165,c41249e0,0,...) at _sx_xlock+0x6e
> dnode_evict_dbufs(c45be730,0,c4406b40,0,c45be730,...) at dnode_evict_dbufs+0x39
> dmu_objset_evict_dbufs(e6258bd0,0,4,c397c000,c40d9918,...) at dmu_objset_evict_dbufs+0x131
> dmu_objset_evict(c397c200,c397c000,c397c600,c397c200,e6258c04,...) at dmu_objset_evict+0xb1
> dsl_dataset_evict(c44b9ac4,c397c200,e6258c1c,c40aed02,c44b9ac4,...) at dsl_dataset_evict+0x4e
> dbuf_evict_user(c44b9ac4,c40ab3ee,1,0,e6258c2c,...) at dbuf_evict_user+0x4e
> dbuf_rele(c44b9ac4,c397c200,e6258c50,c40c7c5c,c397c200,...) at dbuf_rele+0x72
> dsl_dataset_sync(c397c200,c3bfea80,0,c397c200,e6258c54,...) at dsl_dataset_sync+0x44
> dsl_pool_sync(c397c600,148,0,c061e82c,c397c6ac,...) at dsl_pool_sync+0x78
> spa_sync(c37ef000,148,0,149,0,...) at spa_sync+0x2a2
> txg_sync_thread(c397c600,e6258d38) at txg_sync_thread+0x1df
> fork_exit(c40d9918,c397c600,e6258d38) at fork_exit+0xac
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xe6258d6c, ebp = 0 ---
>
> The backtrace is from where the lock order differs from the one first
> recorded.

I'm not sure, since your line numbers don't quite match up with mine,
but I think this is a complaint about the dn_dbufs_mtx vs. the db_mtx?
If so, in this case it's OK. The db_mtx we grab in dbuf_rele() is the
"parent" dbuf for the dataset we are evicting from dsl_dataset_evict().
So it's not possible that any of the locks we attempt to grab in
dnode_evict_dbufs() could cause a deadlock with it.

-Mark
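For completeness: the reason reports like these keep showing up even
when the particular lock instances cannot deadlock is that WITNESS
tracks ordering per lock class (the names in the report, zfs:dbuf and
zfs:dbufs), not per lock instance. The toy model below is only a sketch
of that idea, not the real FreeBSD implementation; the names
check_order and reachable, and the class numbering, are invented for
the illustration.

#include <stdbool.h>
#include <stdio.h>

#define NCLASSES 8

/* order[a][b] is set once a lock of class a was seen held while a lock of class b was taken. */
static bool order[NCLASSES][NCLASSES];

/* Is class `to' reachable from class `from' in the recorded order graph? */
static bool
reachable(int from, int to)
{
	if (order[from][to])
		return (true);
	for (int mid = 0; mid < NCLASSES; mid++)
		if (order[from][mid] && reachable(mid, to))
			return (true);
	return (false);
}

/* Called whenever a lock of class `acq' is acquired while one of class `held' is held. */
static void
check_order(int held, int acq, const char *held_name, const char *acq_name)
{
	if (reachable(acq, held)) {
		printf("lock order reversal: %s before %s contradicts "
		    "recorded order %s before %s\n",
		    held_name, acq_name, acq_name, held_name);
		return;
	}
	order[held][acq] = true;	/* remember the first-seen order */
}

int
main(void)
{
	enum { DBUF, DBUFS };

	check_order(DBUFS, DBUF, "zfs:dbufs", "zfs:dbuf");	/* first-seen order recorded  */
	check_order(DBUF, DBUFS, "zfs:dbuf", "zfs:dbufs");	/* flagged as a reversal      */
	return (0);
}

Because the bookkeeping is done at class granularity, per-instance
arguments like the one above (the db_mtx being held belongs to the
parent dbuf of the dataset being evicted, and nothing under
dnode_evict_dbufs() will take it) are invisible to the checker, so each
such report has to be judged by hand.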