Hello,

I am having problems with ZFS stalling when writing; any help in troubleshooting would be appreciated. Every 5 seconds or so the write bandwidth drops to zero, then picks up a few seconds later (see the zpool iostat output at the bottom of this message). I am running SXDE, snv_55b.

My test consists of copying a 1 GB file (with cp) between two drives, one 80 GB PATA and one 500 GB SATA. The first drive is the system drive (UFS); the second is for data. When I configure the data drive with UFS it does not exhibit the stalling problem, and the copy runs in almost half the time. I have tried many different ZFS settings as well: atime=off, compression=off, checksums=off, zil_disable=1, all to no effect. CPU jumps to about 25% system time during the stalls, and hovers around 5% while data is being transferred.

# zpool iostat 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank         183M   464G      0     17  1.12K  1.93M
tank         183M   464G      0    457      0  57.2M
tank         183M   464G      0    445      0  55.7M
tank         183M   464G      0    405      0  50.7M
tank         366M   464G      0    226      0  4.97M
tank         366M   464G      0      0      0      0
tank         366M   464G      0      0      0      0
tank         366M   464G      0      0      0      0
tank         366M   464G      0    200      0  25.0M
tank         366M   464G      0    431      0  54.0M
tank         366M   464G      0    445      0  55.7M
tank         366M   464G      0    423      0  53.0M
tank         574M   463G      0    270      0  18.1M
tank         574M   463G      0      0      0      0
tank         574M   463G      0      0      0      0
tank         574M   463G      0      0      0      0
tank         574M   463G      0    164      0  20.5M
tank         574M   463G      0    504      0  63.1M
tank         574M   463G      0    405      0  50.7M
tank         753M   463G      0    404      0  42.6M
tank         753M   463G      0      0      0      0
tank         753M   463G      0      0      0      0
tank         753M   463G      0      0      0      0
tank         753M   463G      0    343      0  42.9M
tank         753M   463G      0    476      0  59.5M
tank         753M   463G      0    465      0  50.4M
tank         907M   463G      0     68      0   390K
tank         907M   463G      0      0      0      0
tank         907M   463G      0     11      0  1.40M
tank         907M   463G      0    451      0  56.4M
tank         907M   463G      0    492      0  61.5M
tank        1.01G   463G      0    139      0  7.94M
tank        1.01G   463G      0      0      0      0

Thanks,
Jesse DeFer

This message posted from opensolaris.org
Jesse,

This isn't a stall -- it's just the natural rhythm of pushing out transaction groups. ZFS collects work (transactions) until either the transaction group is full (measured in terms of how much memory the system has), or five seconds elapse -- whichever comes first.

Your data would seem to suggest that the read side isn't delivering data as fast as ZFS can write it. However, it's possible that there's some sort of 'breathing' effect that's hurting performance. One simple experiment you could try: patch txg_time to 1. That will cause ZFS to push transaction groups every second instead of the default of every 5 seconds. If this helps (or if it doesn't), please let us know.

Thanks,

Jeff

Jesse DeFer wrote:
> Hello,
>
> I am having problems with ZFS stalling when writing, any help in
> troubleshooting would be appreciated. [...]
_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
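The flush-when-full-or-on-timeout rhythm Jeff describes can be sketched as a toy model (illustrative only -- the class and field names here are invented, and the real ZFS sizing logic is far more involved than a fixed byte limit):

```python
import time

class TxgBatcher:
    """Toy model of transaction-group batching: buffered writes are
    flushed when the group reaches a size limit or a timer expires,
    whichever comes first. Not the actual ZFS implementation."""

    def __init__(self, max_bytes, max_age_s):
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self.buffered = 0                      # bytes in the open group
        self.opened_at = time.monotonic()
        self.flushes = []                      # (reason, size) history

    def write(self, nbytes):
        # Accumulate work; flush immediately once the group is full.
        self.buffered += nbytes
        if self.buffered >= self.max_bytes:
            self.flush("full")

    def tick(self):
        # Called periodically; flushes a non-empty group on timeout.
        if self.buffered and time.monotonic() - self.opened_at >= self.max_age_s:
            self.flush("timeout")

    def flush(self, reason):
        self.flushes.append((reason, self.buffered))
        self.buffered = 0
        self.opened_at = time.monotonic()
```

A steady writer that outpaces max_bytes flushes on "full"; a trickle of writes flushes on "timeout" every max_age_s seconds, which is the 5-second burst pattern visible in the iostat output above.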
One question: is there a way to stop the default txg push behaviour (push at a regular timestep -- default is 5 sec) and instead push them "on the fly"? I would imagine this is better in the case of an application doing a big sequential write (video streaming...).

s.

On 3/5/07, Jeff Bonwick <jeff.bonwick at sun.com> wrote:
> Jesse,
>
> This isn't a stall -- it's just the natural rhythm of pushing out
> transaction groups. ZFS collects work (transactions) until either
> the transaction group is full (measured in terms of how much memory
> the system has), or five seconds elapse -- whichever comes first. [...]
zfs-discuss-bounces at opensolaris.org wrote on 03/05/2007 03:56:28 AM:
> one question,
> is there a way to stop the default txg push behaviour (push at regular
> timestep -- default is 5 sec) but instead push them "on the fly"? [...]
>
> s.

I do not believe you would want to do that under any workload -- txgs allow for optimized writes. I am wondering if this stall behavior (is it really stalling, or just a visual stat issue?) is more related to the txg max size (calculated from memory/ARC size) than to txg_time. Adjusting txg_time may cloud the real issue if the problem is a bottleneck while evacuating a txg, or if the txg max size is miscalculated so that people hit a state where a txg is _almost_ at max size in 5 seconds (the txg_time default) and blocks the next txg while evacuating -- in which case the core issue is the txg evacuation / max size.

Any thoughts?

-Wade
> I do not believe you would want to do that under any workload -- txgs
> allow for optimized writes. I am wondering if this stall behavior (is it
> really stalling, or just a visual stat issue?) is more related to the txg
> max size (calculated from memory/ARC size) than to txg_time. [...]
>
> -Wade

Wall time for my two tests is 24s for UFS and 42s for ZFS, so it doesn't appear to be a stat visualization problem. I am currently attempting to change txg_size, but am having trouble setting up a build environment.

Jesse
Jesse,

You can change txg_time with mdb:

    echo "txg_time/W0t1" | mdb -kw

-r
OK, I tried it with txg_time set to 1 and am seeing less predictable results. The first time I ran the test it completed in 27 seconds (vs. 24s for UFS or 42s with txg_time=5). Further tests ran from 27s to 43s, about half the time greater than 40s.

zpool iostat doesn't show the large no-write gaps, but it is still very bursty and peak bandwidth is lower. Here is a 29s run:

tank         113K   464G      0      0      0      0
tank         113K   464G      0    226      0  28.2M
tank        40.1M   464G      0    441      0  46.9M
tank        88.2M   464G      0    384      0  39.8M
tank         136M   464G      0    445      0  47.4M
tank         184M   464G      0    412      0  43.4M
tank         232M   464G      0    411      0  43.2M
tank         272M   464G      0    402      0  42.1M
tank         320M   464G      0    435      0  46.3M
tank         368M   464G      0    366  63.4K  37.7M
tank         408M   464G      0    494      0  53.6M
tank         456M   464G      0    360      0  36.8M
tank         496M   464G      0    420      0  44.5M
tank         544M   463G      0    439      0  46.8M
tank         585M   463G      0    370      0  38.2M
tank         633M   463G      0    407      0  42.6M
tank         673M   463G      0    457      0  49.0M
tank         713M   463G      0    368      0  37.9M
tank         761M   463G      0    443      0  47.2M
tank         801M   463G      0    380  63.4K  39.4M
tank         844M   463G      0    444  63.4K  47.4M
tank         879M   463G      0    184      0  14.9M
tank         879M   463G      0    339      0  33.4M
tank         913M   463G      0    215      0  26.5M
tank         944M   463G      0    393  63.4K  36.4M
tank         976M   463G      0    171  63.4K  10.5M
tank        1008M   463G      0    237  63.4K  21.6M
tank        1008M   463G      0    312      0  31.5M
tank        1.02G   463G      0    137      0  9.05M
tank        1.05G   463G      0    313      0  23.4M
tank        1.05G   463G      0      0      0      0

Jesse

> Jesse,
>
> This isn't a stall -- it's just the natural rhythm of pushing out
> transaction groups. [...]
I observed more predictable throughput when I use an I/O generator that can do throttling (xdd or vdbench).

s.

On 3/11/07, Jesse DeFer <opensolaris at dotd.com> wrote:
> OK, I tried it with txg_time set to 1 and am seeing less predictable
> results. The first time I ran the test it completed in 27 seconds (vs
> 24s for UFS or 42s with txg_time=5). Further tests ran from 27s to
> 43s, about half the time greater than 40s. [...]
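The kind of client-side pacing an I/O generator like xdd or vdbench provides can be approximated with a simple rate-limited copy loop (a sketch only -- the function name and parameters are invented, and real tools do this with far more control):

```python
import time

def throttled_copy(read_chunk, write_chunk, total_bytes,
                   chunk=1 << 20, rate_bytes_s=40 << 20):
    """Copy total_bytes in fixed-size chunks, sleeping between chunks so
    writes arrive at a steady rate instead of as fast as the source can
    deliver them. read_chunk(n) returns n bytes; write_chunk(data)
    consumes them."""
    start = time.monotonic()
    sent = 0
    while sent < total_bytes:
        n = min(chunk, total_bytes - sent)
        write_chunk(read_chunk(n))
        sent += n
        # Sleep until this chunk's scheduled deadline to cap throughput.
        deadline = start + sent / rate_bytes_s
        delay = deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return sent
```

Pacing the producer like this keeps each transaction group partly filled rather than alternating between a full group and an idle disk, which is one way to smooth out the bursts seen in the iostat traces above.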
Working with a small txg_time means we are hit by the pool sync overhead more often. This is why the per-second throughput has smaller peak values. With txg_time = 5 we have another problem, which is that depending on the timing of the pool sync, some txgs can end up with too little data in them and sync quickly. We're closing in (I hope) on fixing both issues:

  6429205 each zpool needs to monitor its throughput and throttle heavy writers
  6415647 Sequential writing is jumping

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id

-r

Jesse DeFer writes:
> OK, I tried it with txg_time set to 1 and am seeing less predictable
> results. The first time I ran the test it completed in 27 seconds (vs
> 24s for UFS or 42s with txg_time=5). Further tests ran from 27s to
> 43s, about half the time greater than 40s. [...]
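The idea behind 6429205 -- measure recent sync throughput and stall writers whose backlog outruns it -- might look roughly like the following. This is purely illustrative: the class, its fields, and the one-window backlog heuristic are all invented here, not the actual ZFS write throttle.

```python
import time
from collections import deque

class WriteThrottle:
    """Toy write throttle: track bytes synced over a sliding window and
    compute how long a new writer should stall once the unsynced backlog
    exceeds one window's worth of measured throughput."""

    def __init__(self, window_s=5.0):
        self.window_s = window_s
        self.events = deque()        # (timestamp, bytes) of completed syncs
        self.pending = 0             # bytes accepted but not yet synced

    def record_sync(self, nbytes):
        now = time.monotonic()
        self.events.append((now, nbytes))
        self.pending = max(0, self.pending - nbytes)
        # Drop sync records that fell out of the measurement window.
        while self.events and now - self.events[0][0] > self.window_s:
            self.events.popleft()

    def observed_rate(self):
        # Average sync throughput (bytes/s) over the window.
        return sum(n for _, n in self.events) / self.window_s

    def delay_for(self, nbytes):
        """Seconds a writer should stall before adding nbytes."""
        rate = self.observed_rate()
        if rate == 0 or self.pending <= rate * self.window_s:
            self.pending += nbytes
            return 0.0
        # Backlog exceeds one window of measured throughput: stall long
        # enough for the sync side to catch up.
        stall = (self.pending - rate * self.window_s) / rate
        self.pending += nbytes
        return stall
```

Throttling writers incrementally like this, instead of letting them fill a whole txg and then blocking them outright, is what turns the stop-and-go pattern in the traces above into a steadier stream.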