Pawel Jakub Dawidek
2009-Feb-15 08:23 UTC
[zfs-code] 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage()
On Wed, Jan 28, 2009 at 10:06:05AM -0800, Mark.Maybee at Sun.COM wrote:
> Author: Mark Maybee <Mark.Maybee at Sun.COM>
> Repository: /hg/onnv/onnv-gate
> Latest revision: 7e4ce9158df3e94022ea0f7bffe7df5a4e23b04f
> Total changesets: 1
> Log message:
> 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage()
> 6504953 zfs_getpage() misunderstands VOP_GETPAGE() interface
> 6702206 ZFS read/writer lock contention throttles sendfile() benchmark
> 6780491 Zone on a ZFS filesystem has poor fork/exec performance
> 6747596 assertion failed: DVA_EQUAL(BP_IDENTITY(&zio->io_bp_orig), BP_IDENTITY(zio->io_bp)));
>
> Files:
> update: usr/src/uts/common/fs/zfs/arc.c
> update: usr/src/uts/common/fs/zfs/sys/zfs_znode.h
> update: usr/src/uts/common/fs/zfs/zfs_rlock.c
> update: usr/src/uts/common/fs/zfs/zfs_vnops.c
> update: usr/src/uts/common/fs/zfs/zfs_znode.c

I think after this commit, the comment above update_pages() is no longer true:

 * On Write: If we find a memory mapped page, we write to *both*
 * the page and the dmu buffer.

-- 
Pawel Jakub Dawidek http://www.wheel.pl
pjd at FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Jürgen Keil
2009-Feb-17 11:22 UTC
[zfs-code] 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage()
It seems a bug was introduced by the putback for:

author: Mark Maybee <Mark.Maybee at Sun.COM>
date: Wed Jan 28 11:04:37 2009 -0700 (2 weeks ago)
files: usr/src/uts/common/fs/zfs/arc.c
       usr/src/uts/common/fs/zfs/sys/zfs_znode.h
       usr/src/uts/common/fs/zfs/zfs_rlock.c
       usr/src/uts/common/fs/zfs/zfs_vnops.c
       usr/src/uts/common/fs/zfs/zfs_znode.c
description:
6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage()
6504953 zfs_getpage() misunderstands VOP_GETPAGE() interface
6702206 ZFS read/writer lock contention throttles sendfile() benchmark
6780491 Zone on a ZFS filesystem has poor fork/exec performance
6747596 assertion failed: DVA_EQUAL(BP_IDENTITY(&zio->io_bp_orig), BP_IDENTITY(zio->io_bp)));

The problem is in zfs_putpage() in zfs_vnops.c:
3695         /*
3696          * Align this request to the file block size in case we kluster.
3697          * XXX - this can result in pretty aggresive locking, which can
3698          * impact simultanious read/write access.  One option might be
3699          * to break up long requests (len == 0) into block-by-block
3700          * operations to get narrower locking.
3701          */
3702         blksz = zp->z_blksz;
3703         if (ISP2(blksz))
3704                 io_off = P2ALIGN_TYPED(off, blksz, u_offset_t);
3705         else
3706                 io_off = 0;
3707         if (len > 0 && ISP2(blksz))
3708                 io_len = P2ROUNDUP_TYPED(len + (io_off - off), blksz, size_t);
3709         else
3710                 io_len = 0;
3711
3712         if (io_len == 0) {
3713                 /*
3714                  * Search the entire vp list for pages >= io_off.
3715                  */
3716                 rl = zfs_range_lock(zp, io_off, UINT64_MAX, RL_WRITER);
3717                 error = pvn_vplist_dirty(vp, io_off, zfs_putapage, flags, cr);
3718                 goto out;
3719         }
3720         rl = zfs_range_lock(zp, io_off, io_len, RL_WRITER);
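Line 3708:
"len + (io_off - off)" looks wrong; it should be
"len + (off - io_off)". The P2ALIGN_TYPED() macro at line 3704
should round "off" down, i.e. io_off <= off, so "io_off - off" can
never be positive and the computed length shrinks instead of growing
to cover the bytes between io_off and off.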
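
To make the arithmetic concrete, here is a minimal user-space sketch of
the two expressions. The macros are local stand-ins modeled on the
sysmacros.h helpers used in the quoted code, and range_start(),
range_len_buggy() and range_len_fixed() are hypothetical names made up
for this illustration. All values are uint64_t here for simplicity; the
kernel code uses u_offset_t and size_t.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Local stand-ins (assumption): re-declared here so the sketch builds
 * outside the kernel. They mirror the usual power-of-two helpers.
 */
#define ISP2(x)                         (((x) & ((x) - 1)) == 0)
#define P2ALIGN_TYPED(x, align, type)   ((type)(x) & -(type)(align))
#define P2ROUNDUP_TYPED(x, align, type) (-(-(type)(x) & -(type)(align)))

/* Hypothetical helper: start of the lock range, as on line 3704. */
static uint64_t
range_start(uint64_t off, uint64_t blksz)
{
	assert(blksz != 0 && ISP2(blksz));
	return (P2ALIGN_TYPED(off, blksz, uint64_t));
}

/*
 * Hypothetical helper: length as computed on line 3708 before the fix.
 * io_off <= off, so (io_off - off) wraps around in unsigned arithmetic
 * and effectively subtracts the alignment slack from len.
 */
static uint64_t
range_len_buggy(uint64_t off, uint64_t len, uint64_t blksz)
{
	uint64_t io_off = range_start(off, blksz);

	return (P2ROUNDUP_TYPED(len + (io_off - off), blksz, uint64_t));
}

/*
 * Hypothetical helper: length with the operands swapped, as proposed.
 * The slack between io_off and off is added to len before rounding up.
 */
static uint64_t
range_len_fixed(uint64_t off, uint64_t len, uint64_t blksz)
{
	uint64_t io_off = range_start(off, blksz);

	return (P2ROUNDUP_TYPED(len + (off - io_off), blksz, uint64_t));
}
```

With a 128K block size, a 4K request at offset 4K gives
range_len_buggy(4096, 4096, 131072) == 0, while range_len_fixed() gives
131072. A result of 0 would send the quoted code into the io_len == 0
branch at line 3712, which locks up to UINT64_MAX and walks the whole
vp list. That might be connected to the slowdown reported below, but
this sketch does not prove it.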
Test case:
/files2/media/osol-0906-106a-global-x86.iso is a file on a zfs filesystem
# mount -F hsfs /files2/media/osol-0906-106a-global-x86.iso /mnt
# time mkisofs -r -o /dev/null /mnt
<<< very slow >>>
1.22u 5.05s 59:37.46 0.1%
On snv_104 the same test completes in 21 seconds.
It has become roughly 170x slower (59:37 is about 3577 seconds)...
diff --git a/usr/src/uts/common/fs/zfs/zfs_vnops.c b/usr/src/uts/common/fs/zfs/zfs_vnops.c
--- a/usr/src/uts/common/fs/zfs/zfs_vnops.c
+++ b/usr/src/uts/common/fs/zfs/zfs_vnops.c
@@ -3705,7 +3705,7 @@
         else
                 io_off = 0;
         if (len > 0 && ISP2(blksz))
-                io_len = P2ROUNDUP_TYPED(len + (io_off - off), blksz, size_t);
+                io_len = P2ROUNDUP_TYPED(len + (off - io_off), blksz, size_t);
         else
                 io_len = 0;
--
This message posted from opensolaris.org
Mark Maybee
2009-Feb-17 21:58 UTC
[zfs-code] 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage()
Gack! Absolutely correct, Jürgen. I have filed 6806627 to track this.

-Mark