Hello all,

I'm trying to understand the ZFS I/O scheduler ( http://www.eall.com.br/blog/?p=1170 ), and why the system sometimes seems to be "stalled" for a few seconds, during which every application that needs some I/O (mostly reads, I think) has serious problems. That can be a big problem for iSCSI or NFS soft mounts.

Looking at the code, I got as far as the zio_taskq_threads structure and this bug report:
http://bugs.opensolaris.org/bugdatabase/printableBug.do?bug_id=6826241
It seems the change was already integrated into newer releases (I don't know since when)...

Could somebody explain the real difference between the ISSUE and INTR, READ and WRITE changes, and maybe why the first implementation used the same value for both? ;-)

Another change I did not fully understand was the time between txg syncs going from 5s to 30s, which I think can make this problem worse, because we will have more data to commit.

Well, too many questions... ;-)

PS: Where can I find the patches and attachments from bugs.opensolaris.org? The files mention attachments, but I cannot find them.

Thanks a lot for your time!

Leal
[ http://www.eall.com.br/blog ]
--
This message posted from opensolaris.org
wow, that wasn't a recognized problem until this past april? i've been seeing it for a -long- time. i think i first reported it back in december. are people actively working on it?

On Tue, Jun 16, 2009 at 10:24 AM, Marcelo Leal <no-reply at opensolaris.org> wrote:
> I'm trying to understand the ZFS IO scheduler ( http://www.eall.com.br/blog/?p=1170 ), and why sometimes the system seems to be "stalled" for some seconds [...]
We're definitely working on the problems that contribute to such 'picket fencing'. But beware of equating symptoms with root causes. We already know that picket fencing has multiple causes, and we're tracking the ones we know about: one is related to taskq CPU scheduling and another to I/O completion handling for sync writes. We also plan to study whether reads are subject to other specific effects. This is at the top of our list of performance issues.

-r

On 16 June 09 at 17:15, milosz wrote:
> wow, that hasn't been a recognized problem since this past april?
> i've been seeing it for a -long- time. i think i first reported it
> back in december. are people actively working on it?