Pawel Jakub Dawidek
2009-Feb-15 08:23 UTC
[zfs-code] 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage()
On Wed, Jan 28, 2009 at 10:06:05AM -0800, Mark.Maybee at Sun.COM wrote:
> Author: Mark Maybee <Mark.Maybee at Sun.COM>
> Repository: /hg/onnv/onnv-gate
> Latest revision: 7e4ce9158df3e94022ea0f7bffe7df5a4e23b04f
> Total changesets: 1
> Log message:
> 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage()
> 6504953 zfs_getpage() misunderstands VOP_GETPAGE() interface
> 6702206 ZFS read/writer lock contention throttles sendfile() benchmark
> 6780491 Zone on a ZFS filesystem has poor fork/exec performance
> 6747596 assertion failed: DVA_EQUAL(BP_IDENTITY(&zio->io_bp_orig), BP_IDENTITY(zio->io_bp)));
>
> Files:
> update: usr/src/uts/common/fs/zfs/arc.c
> update: usr/src/uts/common/fs/zfs/sys/zfs_znode.h
> update: usr/src/uts/common/fs/zfs/zfs_rlock.c
> update: usr/src/uts/common/fs/zfs/zfs_vnops.c
> update: usr/src/uts/common/fs/zfs/zfs_znode.c

I think after this commit, the comment above update_pages() is no longer true:

 * On Write: If we find a memory mapped page, we write to *both*
 * the page and the dmu buffer.

-- 
Pawel Jakub Dawidek http://www.wheel.pl
pjd at FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Jürgen Keil
2009-Feb-17 11:22 UTC
[zfs-code] 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage()
It seems there is a bug introduced by the putback for

author: Mark Maybee <Mark.Maybee at Sun.COM>
date: Wed Jan 28 11:04:37 2009 -0700 (2 weeks ago)
files: usr/src/uts/common/fs/zfs/arc.c
       usr/src/uts/common/fs/zfs/sys/zfs_znode.h
       usr/src/uts/common/fs/zfs/zfs_rlock.c
       usr/src/uts/common/fs/zfs/zfs_vnops.c
       usr/src/uts/common/fs/zfs/zfs_znode.c
description:
6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage()
6504953 zfs_getpage() misunderstands VOP_GETPAGE() interface
6702206 ZFS read/writer lock contention throttles sendfile() benchmark
6780491 Zone on a ZFS filesystem has poor fork/exec performance
6747596 assertion failed: DVA_EQUAL(BP_IDENTITY(&zio->io_bp_orig), BP_IDENTITY(zio->io_bp)));

zfs_vnops.c, zfs_putpage():

   3695 	/*
   3696 	 * Align this request to the file block size in case we kluster.
   3697 	 * XXX - this can result in pretty aggresive locking, which can
   3698 	 * impact simultanious read/write access.  One option might be
   3699 	 * to break up long requests (len == 0) into block-by-block
   3700 	 * operations to get narrower locking.
   3701 	 */
   3702 	blksz = zp->z_blksz;
   3703 	if (ISP2(blksz))
   3704 		io_off = P2ALIGN_TYPED(off, blksz, u_offset_t);
   3705 	else
   3706 		io_off = 0;
   3707 	if (len > 0 && ISP2(blksz))
   3708 		io_len = P2ROUNDUP_TYPED(len + (io_off - off), blksz, size_t);
   3709 	else
   3710 		io_len = 0;
   3711
   3712 	if (io_len == 0) {
   3713 		/*
   3714 		 * Search the entire vp list for pages >= io_off.
   3715 		 */
   3716 		rl = zfs_range_lock(zp, io_off, UINT64_MAX, RL_WRITER);
   3717 		error = pvn_vplist_dirty(vp, io_off, zfs_putapage, flags, cr);
   3718 		goto out;
   3719 	}
   3720 	rl = zfs_range_lock(zp, io_off, io_len, RL_WRITER);

Line 3708:
"len + (io_off - off)" looks wrong; this should be
"len + (off - io_off)". The P2ALIGN_TYPED() macro at line 3704
rounds "off" down, i.e. io_off <= off, so the length has to grow
by (off - io_off) to still cover the end of the original request.
Test case:

/files2/media/osol-0906-106a-global-x86.iso is a file on a zfs filesystem

# mount -F hsfs /files2/media/osol-0906-106a-global-x86.iso /mnt
# time mkisofs -r -o /dev/null /mnt
<<< very slow >>>
1.22u 5.05s 59:37.46 0.1%

On snv_104 the same test completes in 21 seconds.

It has become about 170x slower (59:37 elapsed versus 21 seconds)...

diff --git a/usr/src/uts/common/fs/zfs/zfs_vnops.c b/usr/src/uts/common/fs/zfs/zfs_vnops.c
--- a/usr/src/uts/common/fs/zfs/zfs_vnops.c
+++ b/usr/src/uts/common/fs/zfs/zfs_vnops.c
@@ -3705,7 +3705,7 @@
 	else
 		io_off = 0;
 	if (len > 0 && ISP2(blksz))
-		io_len = P2ROUNDUP_TYPED(len + (io_off - off), blksz, size_t);
+		io_len = P2ROUNDUP_TYPED(len + (off - io_off), blksz, size_t);
 	else
 		io_len = 0;

-- 
This message posted from opensolaris.org
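
To see what the inverted difference does, here is a small user-space sketch in C. It is not the kernel code: the three macros are local stand-ins modeled on the power-of-two helpers in <sys/sysmacros.h>, u_offset_t is assumed to be a 64-bit unsigned type, the lengths use uint64_t rather than size_t, and the function names (aligned_off, io_len_buggy, io_len_fixed) are made up for illustration. Because the arithmetic is unsigned, (io_off - off) wraps around whenever off is not block aligned, and for a request shorter than its distance from the block start the rounded result comes out as 0.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins modeled on the helpers in <sys/sysmacros.h> (assumption). */
#define ISP2(x)                     (((x) & ((x) - 1)) == 0)
#define P2ALIGN_TYPED(x, a, type)   ((type)(x) & -(type)(a))
#define P2ROUNDUP_TYPED(x, a, type) (-(-(type)(x) & -(type)(a)))

typedef uint64_t u_offset_t;   /* assumption: 64-bit unsigned file offset */

/* Start of the block containing off; rounds down, so the result is <= off. */
static u_offset_t
aligned_off(u_offset_t off, uint64_t blksz)
{
    return ISP2(blksz) ? P2ALIGN_TYPED(off, blksz, u_offset_t) : 0;
}

/* Length as computed on line 3708 before the fix (difference inverted). */
static uint64_t
io_len_buggy(u_offset_t off, uint64_t len, uint64_t blksz)
{
    u_offset_t io_off = aligned_off(off, blksz);

    if (len == 0 || !ISP2(blksz))
        return 0;
    /* io_off - off wraps to a huge unsigned value when off is unaligned. */
    return P2ROUNDUP_TYPED(len + (io_off - off), blksz, uint64_t);
}

/* Length as computed after the proposed one-line fix. */
static uint64_t
io_len_fixed(u_offset_t off, uint64_t len, uint64_t blksz)
{
    u_offset_t io_off = aligned_off(off, blksz);

    if (len == 0 || !ISP2(blksz))
        return 0;
    assert(io_off <= off);   /* P2ALIGN rounds down */
    /* Grow the length by the bytes between the block start and off. */
    return P2ROUNDUP_TYPED(len + (off - io_off), blksz, uint64_t);
}
```

With a 128K block size, io_len_buggy(8192, 4096, 131072) yields 0 while io_len_fixed(8192, 4096, 131072) yields 131072. In the listing from zfs_putpage(), io_len == 0 selects the branch that locks from io_off to UINT64_MAX and searches the entire vp list, which may well be where the slowdown in the test case comes from, although the thread does not confirm that.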
Mark Maybee
2009-Feb-17 21:58 UTC
[zfs-code] 6551866 deadlock between zfs_write(), zfs_freesp(), and zfs_putapage()
Gack! Absolutely correct, Jürgen. I have filed 6806627 to track this.

-Mark