Hello,

I am having problems with ZFS stalling when writing; any help in troubleshooting would be appreciated. Every 5 seconds or so the write bandwidth drops to zero, then picks up a few seconds later (see the zpool iostat output at the bottom of this message). I am running SXDE, snv_55b.

My test consists of copying a 1 GB file (with cp) between two drives, one 80 GB PATA and one 500 GB SATA. The first drive is the system drive (UFS); the second is for data. When I configure the data drive with UFS it does not exhibit the stalling problem, and the copy runs in almost half the time. I have tried many different ZFS settings as well: atime=off, compression=off, checksums=off, zil_disable=1, all to no effect. CPU jumps to about 25% system time during the stalls, and hovers around 5% while data is being transferred.

# zpool iostat 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank         183M   464G      0     17  1.12K  1.93M
tank         183M   464G      0    457      0  57.2M
tank         183M   464G      0    445      0  55.7M
tank         183M   464G      0    405      0  50.7M
tank         366M   464G      0    226      0  4.97M
tank         366M   464G      0      0      0      0
tank         366M   464G      0      0      0      0
tank         366M   464G      0      0      0      0
tank         366M   464G      0    200      0  25.0M
tank         366M   464G      0    431      0  54.0M
tank         366M   464G      0    445      0  55.7M
tank         366M   464G      0    423      0  53.0M
tank         574M   463G      0    270      0  18.1M
tank         574M   463G      0      0      0      0
tank         574M   463G      0      0      0      0
tank         574M   463G      0      0      0      0
tank         574M   463G      0    164      0  20.5M
tank         574M   463G      0    504      0  63.1M
tank         574M   463G      0    405      0  50.7M
tank         753M   463G      0    404      0  42.6M
tank         753M   463G      0      0      0      0
tank         753M   463G      0      0      0      0
tank         753M   463G      0      0      0      0
tank         753M   463G      0    343      0  42.9M
tank         753M   463G      0    476      0  59.5M
tank         753M   463G      0    465      0  50.4M
tank         907M   463G      0     68      0   390K
tank         907M   463G      0      0      0      0
tank         907M   463G      0     11      0  1.40M
tank         907M   463G      0    451      0  56.4M
tank         907M   463G      0    492      0  61.5M
tank        1.01G   463G      0    139      0  7.94M
tank        1.01G   463G      0      0      0      0

Thanks,
Jesse DeFer

This message posted from opensolaris.org
Jesse,

This isn't a stall -- it's just the natural rhythm of pushing out transaction groups. ZFS collects work (transactions) until either the transaction group is full (measured in terms of how much memory the system has), or five seconds elapse -- whichever comes first.

Your data would seem to suggest that the read side isn't delivering data as fast as ZFS can write it. However, it's possible that there's some sort of 'breathing' effect that's hurting performance. One simple experiment you could try: patch txg_time to 1. That will cause ZFS to push transaction groups every second instead of the default of every 5 seconds. If this helps (or if it doesn't), please let us know.

Thanks,

Jeff

Jesse DeFer wrote:
> Hello,
>
> I am having problems with ZFS stalling when writing, any help in
> troubleshooting would be appreciated. [...]
_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
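The flush-when-full-or-on-timeout rhythm Jeff describes can be sketched as a toy model (illustrative only -- the class and field names here are invented, and the real ZFS sizing logic is far more involved than a fixed byte limit):

```python
import time

class TxgBatcher:
    """Toy model of transaction-group batching: buffered writes are
    flushed when the group reaches a size limit or a timer expires,
    whichever comes first. Not the actual ZFS implementation."""

    def __init__(self, max_bytes, max_age_s):
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self.buffered = 0                      # bytes in the open group
        self.opened_at = time.monotonic()
        self.flushes = []                      # (reason, size) history

    def write(self, nbytes):
        # Accumulate work; flush immediately once the group is full.
        self.buffered += nbytes
        if self.buffered >= self.max_bytes:
            self.flush("full")

    def tick(self):
        # Called periodically; flushes a non-empty group on timeout.
        if self.buffered and time.monotonic() - self.opened_at >= self.max_age_s:
            self.flush("timeout")

    def flush(self, reason):
        self.flushes.append((reason, self.buffered))
        self.buffered = 0
        self.opened_at = time.monotonic()
```

A steady writer that outpaces max_bytes flushes on "full"; a trickle of writes flushes on "timeout" every max_age_s seconds, which is the 5-second burst pattern visible in the iostat output above.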
One question: is there a way to stop the default txg push behaviour (push at a regular timestep -- default is 5 sec) and instead push them "on the fly"? I would imagine this is better in the case of an application doing a big sequential write (video streaming...).

s.

On 3/5/07, Jeff Bonwick <jeff.bonwick at sun.com> wrote:
> Jesse,
>
> This isn't a stall -- it's just the natural rhythm of pushing out
> transaction groups. ZFS collects work (transactions) until either
> the transaction group is full (measured in terms of how much memory
> the system has), or five seconds elapse -- whichever comes first. [...]
zfs-discuss-bounces at opensolaris.org wrote on 03/05/2007 03:56:28 AM:
> one question,
> is there a way to stop the default txg push behaviour (push at regular
> timestep -- default is 5 sec) but instead push them "on the fly"? [...]
>
> s.

I do not believe you would want to do that under any workload -- txgs allow for optimized writes. I am wondering if this stall behavior (is it really stalling, or just a visual stat issue?) is more related to the txg max size (calculated from memory/ARC size) than to txg_time. Adjusting txg_time may cloud the real issue if the problem is a bottleneck while evacuating a txg, or if the txg max size is miscalculated so that people hit a state where a txg is _almost_ at max size in 5 seconds (the txg_time default) and blocks the next txg while evacuating -- in which case the core issue is the txg evacuation / max size.

Any thoughts?

-Wade
> I do not believe you would want to do that under any workload -- txgs
> allow for optimized writes. I am wondering if this stall behavior (is it
> really stalling, or just a visual stat issue?) is more related to the txg
> max size (calculated from memory/ARC size) than to txg_time. [...]
>
> -Wade

Wall time for my two tests is 24s for UFS and 42s for ZFS, so it doesn't appear to be a stat visualization problem. I am currently attempting to change txg_size, but am having trouble setting up a build environment.

Jesse
Jesse,

You can change txg_time with mdb:

    echo "txg_time/W0t1" | mdb -kw

-r
OK, I tried it with txg_time set to 1 and am seeing less predictable results. The first time I ran the test it completed in 27 seconds (vs. 24s for UFS or 42s with txg_time=5). Further tests ran from 27s to 43s, about half the time greater than 40s.

zpool iostat doesn't show the large no-write gaps, but it is still very bursty and peak bandwidth is lower. Here is a 29s run:

tank         113K   464G      0      0      0      0
tank         113K   464G      0    226      0  28.2M
tank        40.1M   464G      0    441      0  46.9M
tank        88.2M   464G      0    384      0  39.8M
tank         136M   464G      0    445      0  47.4M
tank         184M   464G      0    412      0  43.4M
tank         232M   464G      0    411      0  43.2M
tank         272M   464G      0    402      0  42.1M
tank         320M   464G      0    435      0  46.3M
tank         368M   464G      0    366  63.4K  37.7M
tank         408M   464G      0    494      0  53.6M
tank         456M   464G      0    360      0  36.8M
tank         496M   464G      0    420      0  44.5M
tank         544M   463G      0    439      0  46.8M
tank         585M   463G      0    370      0  38.2M
tank         633M   463G      0    407      0  42.6M
tank         673M   463G      0    457      0  49.0M
tank         713M   463G      0    368      0  37.9M
tank         761M   463G      0    443      0  47.2M
tank         801M   463G      0    380  63.4K  39.4M
tank         844M   463G      0    444  63.4K  47.4M
tank         879M   463G      0    184      0  14.9M
tank         879M   463G      0    339      0  33.4M
tank         913M   463G      0    215      0  26.5M
tank         944M   463G      0    393  63.4K  36.4M
tank         976M   463G      0    171  63.4K  10.5M
tank        1008M   463G      0    237  63.4K  21.6M
tank        1008M   463G      0    312      0  31.5M
tank        1.02G   463G      0    137      0  9.05M
tank        1.05G   463G      0    313      0  23.4M
tank        1.05G   463G      0      0      0      0

Jesse

> Jesse,
>
> This isn't a stall -- it's just the natural rhythm of pushing out
> transaction groups. [...]
I observed more predictable throughput when I use an I/O generator that can do throttling (xdd or vdbench).

s.

On 3/11/07, Jesse DeFer <opensolaris at dotd.com> wrote:
> OK, I tried it with txg_time set to 1 and am seeing less predictable
> results. The first time I ran the test it completed in 27 seconds (vs
> 24s for UFS or 42s with txg_time=5). Further tests ran from 27s to
> 43s, about half the time greater than 40s. [...]
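The kind of client-side pacing an I/O generator like xdd or vdbench provides can be approximated with a simple rate-limited copy loop (a sketch only -- the function name and parameters are invented, and real tools do this with far more control):

```python
import time

def throttled_copy(read_chunk, write_chunk, total_bytes,
                   chunk=1 << 20, rate_bytes_s=40 << 20):
    """Copy total_bytes in fixed-size chunks, sleeping between chunks so
    writes arrive at a steady rate instead of as fast as the source can
    deliver them. read_chunk(n) returns n bytes; write_chunk(data)
    consumes them."""
    start = time.monotonic()
    sent = 0
    while sent < total_bytes:
        n = min(chunk, total_bytes - sent)
        write_chunk(read_chunk(n))
        sent += n
        # Sleep until this chunk's scheduled deadline to cap throughput.
        deadline = start + sent / rate_bytes_s
        delay = deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)
    return sent
```

Pacing the producer like this keeps each transaction group partly filled rather than alternating between a full group and an idle disk, which is one way to smooth out the bursts seen in the iostat traces above.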
Working with a small txg_time means we are hit by the pool sync overhead more often. This is why the per-second throughput has smaller peak values. With txg_time = 5 we have another problem, which is that depending on the timing of the pool sync, some txgs can end up with too little data in them and sync quickly. We're closing in (I hope) on fixing both issues:

  6429205 each zpool needs to monitor its throughput and throttle heavy writers
  6415647 Sequential writing is jumping

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id

-r

Jesse DeFer writes:
> OK, I tried it with txg_time set to 1 and am seeing less predictable
> results. The first time I ran the test it completed in 27 seconds (vs
> 24s for UFS or 42s with txg_time=5). Further tests ran from 27s to
> 43s, about half the time greater than 40s. [...]
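The idea behind 6429205 -- measure recent sync throughput and stall writers whose backlog outruns it -- might look roughly like the following. This is purely illustrative: the class, its fields, and the one-window backlog heuristic are all invented here, not the actual ZFS write throttle.

```python
import time
from collections import deque

class WriteThrottle:
    """Toy write throttle: track bytes synced over a sliding window and
    compute how long a new writer should stall once the unsynced backlog
    exceeds one window's worth of measured throughput."""

    def __init__(self, window_s=5.0):
        self.window_s = window_s
        self.events = deque()        # (timestamp, bytes) of completed syncs
        self.pending = 0             # bytes accepted but not yet synced

    def record_sync(self, nbytes):
        now = time.monotonic()
        self.events.append((now, nbytes))
        self.pending = max(0, self.pending - nbytes)
        # Drop sync records that fell out of the measurement window.
        while self.events and now - self.events[0][0] > self.window_s:
            self.events.popleft()

    def observed_rate(self):
        # Average sync throughput (bytes/s) over the window.
        return sum(n for _, n in self.events) / self.window_s

    def delay_for(self, nbytes):
        """Seconds a writer should stall before adding nbytes."""
        rate = self.observed_rate()
        if rate == 0 or self.pending <= rate * self.window_s:
            self.pending += nbytes
            return 0.0
        # Backlog exceeds one window of measured throughput: stall long
        # enough for the sync side to catch up.
        stall = (self.pending - rate * self.window_s) / rate
        self.pending += nbytes
        return stall
```

Throttling writers incrementally like this, instead of letting them fill a whole txg and then blocking them outright, is what turns the stop-and-go pattern in the traces above into a steadier stream.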