On 2010-10-27, at 21:18, Jacques-Charles Lafoucriere wrote:
> I have found a bug in layout lock (the bug was seen with test 118k, this is the last known).
>
> A simpler reproducer is to make an rm during a long file write.
>
> A lock timeout is triggered because during the writes the client holds the layout lock, which is in the same lock as a lookup (multiple inode_bits in the same lock). So when the MDS tries to get an LCK_EX on the object (before calling mdo_unlink), the lock is not freed because of the ref count.

The client should only be holding a reference on the layout lock for 1MB chunks of IO. Between each IO the layout lock reference should be dropped, and if there was a blocking callback on the lock the client should also cancel the lock at that time.

> A solution is to request a LCK_CR on the object before the mdo_unlink (the directory is still protected by a strong lock). Is it a good solution? Do you have another one?

We discussed this issue recently, and the preferred solution is to release the layout lock as soon as the OST extent locks are referenced, since we don't actually require the layout lock once we hold the object extent lock(s).

We discussed this before, and it is a bit tricky, because ll_layout_lock_get() and ll_layout_lock_put() currently wrap the IO function. One proposal is to refcount the lsm structure under the layout lock, and then drop the last lsm reference in the LOV code after the object lock is held, and that would release the lsm lock.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
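The refcounting proposal above could be sketched roughly as follows. This is only an illustration of the ordering (take the layout reference, reference the extent lock, drop the last lsm reference before the data transfer). The types `layout_lock` and `lsm_ref` and the helpers `lsm_get()`, `lsm_put()` and `do_one_io()` are hypothetical stand-ins, not the real Lustre structures or the existing ll_layout_lock_get()/ll_layout_lock_put() code.

```c
#include <assert.h>

/* Hypothetical stand-in for the client's reference on the layout lock. */
struct layout_lock {
    int held;               /* 1 while a layout lock reference is held */
};

/* Hypothetical refcounted wrapper around the lsm (striping) data. */
struct lsm_ref {
    int refcount;
    struct layout_lock *lock;
};

/* The first lsm reference also takes the layout lock reference. */
static void lsm_get(struct lsm_ref *lsm)
{
    if (lsm->refcount++ == 0)
        lsm->lock->held = 1;
}

/* Dropping the last lsm reference also drops the layout lock reference. */
static void lsm_put(struct lsm_ref *lsm)
{
    assert(lsm->refcount > 0);
    if (--lsm->refcount == 0)
        lsm->lock->held = 0;
}

/*
 * One IO chunk. Returns 1 if the layout lock reference was still held
 * at the point where data would be transferred, 0 otherwise.
 */
static int do_one_io(struct lsm_ref *lsm)
{
    int extent_locked;
    int layout_held_during_io;

    lsm_get(lsm);               /* llite side: layout is stable from here */
    extent_locked = 1;          /* LOV side: OST extent lock now referenced */
    lsm_put(lsm);               /* LOV side: last lsm reference dropped */

    layout_held_during_io = lsm->lock->held;
    /* ... data transfer would happen here, under the extent lock only ... */
    extent_locked = 0;          /* extent lock reference released after IO */
    (void)extent_locked;

    return layout_held_during_io;
}
```

With this ordering the layout lock reference is not held across the data transfer, so a blocking callback arriving during a long write would not have to wait for the IO to finish.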
Jacques-Charles Lafoucriere
2010-Oct-30 10:48 UTC
[Lustre-devel] layout lock bug with 118k
On 10/29/2010 05:26 PM, Andreas Dilger wrote:
> On 2010-10-27, at 21:18, Jacques-Charles Lafoucriere wrote:
>
>> I have found a bug in layout lock (the bug was seen with test 118k, this is the last known).
>>
>> A simpler reproducer is to make an rm during a long file write.
>>
>> A lock timeout is triggered because during the writes the client holds the layout lock, which is in the same lock as a lookup (multiple inode_bits in the same lock). So when the MDS tries to get an LCK_EX on the object (before calling mdo_unlink), the lock is not freed because of the ref count.
>>
> The client should only be holding a reference on the layout lock for 1MB chunks of IO. Between each IO the layout lock reference should be dropped, and if there was a blocking callback on the lock the client should also cancel the lock at that time.

The client holds the layout lock only around the IO, so between I/Os the lock should be canceled. The issue comes from the fact that the same lock is also referenced because of the other inode bits.

>> A solution is to request a LCK_CR on the object before the mdo_unlink (the directory is still protected by a strong lock). Is it a good solution? Do you have another one?
>>
> We discussed this issue recently, and the preferred solution is to release the layout lock as soon as the OST extent locks are referenced, since we don't actually require the layout lock once we hold the object extent lock(s).
>
> We discussed this before, and it is a bit tricky, because ll_layout_lock_get() and ll_layout_lock_put() currently wrap the IO function. One proposal is to refcount the lsm structure under the layout lock, and then drop the last lsm reference in the LOV code after the object lock is held, and that would release the lsm lock.

I will see how to do this.

> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Technical Lead
> Oracle Corporation Canada Inc.
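To make the point about the shared lock concrete, here is a small C model of one lock that covers several inode bits with a single reference count. The bit names and values, `struct ibits_lock`, and the helpers are hypothetical and chosen only for illustration; they are not the Lustre definitions.

```c
#include <assert.h>

/* Hypothetical bit values, for illustration only. */
#define BIT_LOOKUP 0x1u
#define BIT_LAYOUT 0x2u

/* One lock covering several inode bits, with a single reference count. */
struct ibits_lock {
    unsigned int bits;      /* which inode bits this lock covers */
    int refs;               /* references from all users, whatever the bit */
};

static void ibits_get(struct ibits_lock *l)
{
    l->refs++;
}

static void ibits_put(struct ibits_lock *l)
{
    assert(l->refs > 0);
    l->refs--;
}

/*
 * In this model a blocking callback can only be honoured once nobody
 * references the lock, regardless of which bit each user cared about.
 */
static int ibits_can_cancel(const struct ibits_lock *l)
{
    return l->refs == 0;
}
```

In this model, dropping the IO path's reference between chunks is not enough on its own: as long as another user still references the same lock for a different bit, the lock cannot be canceled.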