I use an application with a fairly large receive data buffer (256MB) to
replicate data between sites.

I have noticed the buffer becoming completely full when receiving
snapshots for some filesystems, even over a slow (~2MB/sec) WAN
connection. I assume this is due to the changes being widely scattered.

Is there any way to improve this situation?

Thanks,

-- Ian.
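For context, replication setups like this usually pipe zfs send/receive
through a buffering tool. The sketch below assumes mbuffer (the poster's
application is not named), and the pool, snapshot, host, and port names
are illustrative:

    # sender: stream an incremental snapshot through a 256MB buffer
    zfs send -i tank/fs@snap1 tank/fs@snap2 | \
        mbuffer -s 128k -m 256M -O remote-host:9090

    # receiver: absorb WAN bursts in a 256MB buffer ahead of zfs receive
    mbuffer -s 128k -m 256M -I 9090 | zfs receive -F tank/fs

If the receiver's buffer runs completely full, zfs receive is draining
it more slowly than the link is delivering data.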
On Apr 11, 2012, at 1:34 AM, Ian Collins wrote:

> I use an application with a fairly large receive data buffer (256MB)
> to replicate data between sites.
>
> I have noticed the buffer becoming completely full when receiving
> snapshots for some filesystems, even over a slow (~2MB/sec) WAN
> connection. I assume this is due to the changes being widely scattered.

Widely scattered on the sending side, receiving side should be mostly
contiguous... unless you are mostly full or there is some other cause
of slow writes. The usual disk-oriented performance analysis will show
if this is the case. Most likely, something else is going on here.

> Is there any way to improve this situation?

Surely there must be...

-- richard

--
ZFS Performance and Training
Richard.Elling at RichardElling.com
+1-760-896-4422
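The kind of disk-oriented analysis suggested here might look like the
following on the receiving host (pool name illustrative):

    # per-vdev bandwidth and operation counts, sampled every 5 seconds
    zpool iostat -v tank 5

    # per-device service times; high asvc_t and %b with modest
    # throughput points at slow writes rather than a slow sender
    iostat -xn 5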
On 04/12/12 04:17 AM, Richard Elling wrote:

> On Apr 11, 2012, at 1:34 AM, Ian Collins wrote:
>
>> I use an application with a fairly large receive data buffer (256MB)
>> to replicate data between sites.
>>
>> I have noticed the buffer becoming completely full when receiving
>> snapshots for some filesystems, even over a slow (~2MB/sec) WAN
>> connection. I assume this is due to the changes being widely
>> scattered.
>
> Widely scattered on the sending side, receiving side should be mostly
> contiguous...

That's what I originally thought.

> unless you are mostly full or there is some other cause of slow
> writes. The usual disk-oriented performance analysis will show if
> this is the case. Most likely, something else is going on here.

Odd. The pool is a single iSCSI volume exported from a 7320 and there
is 18TB free. I see the same issues with local replications on our LAN.

The filesystems that appear to write slowly are ones containing many
small files, such as office documents. Over the WAN, the receive buffer
high water mark is usually the TCP receive window size, except for the
apparently slow filesystems.

I'll add some more diagnostics.

-- Ian.
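On a Solaris-derived system, DTrace one-liners using the standard io
provider are one way to add those diagnostics (a sketch, run while a
receive is in progress):

    # count physical I/Os by device
    dtrace -n 'io:::start { @[args[1]->dev_statname] = count(); }'

    # distribution of I/O sizes; many small writes would fit the
    # many-small-files pattern described above
    dtrace -n 'io:::start { @["bytes"] = quantize(args[0]->b_bcount); }'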
On 2012-Apr-11 18:34:42 +1000, Ian Collins <ian at ianshome.com> wrote:

> I use an application with a fairly large receive data buffer (256MB)
> to replicate data between sites.
>
> I have noticed the buffer becoming completely full when receiving
> snapshots for some filesystems, even over a slow (~2MB/sec) WAN
> connection. I assume this is due to the changes being widely scattered.

As Richard pointed out, the write side should be mostly contiguous.

> Is there any way to improve this situation?

Is the target pool nearly full (so ZFS is spending lots of time
searching for free space)?

Do you have dedupe enabled on the target pool? This would force ZFS to
search the DDT to write blocks - this will be expensive, especially if
you don't have enough RAM.

Do you have a high compression level (gzip or gzip-N) on the target
filesystems, without enough CPU horsepower?

Do you have a dying (or dead) disk in the target pool?

--
Peter Jeremy
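Each of those possibilities can be checked directly (pool and dataset
names illustrative):

    # pool capacity - allocation much above ~80% makes free-space
    # searches more expensive
    zpool list tank

    # dedup and compression settings on the receiving datasets
    zfs get -r dedup,compression tank

    # if dedup is enabled, the DDT histogram gives a sense of its size
    zdb -DD tank

    # device health and error counters
    zpool status -v tank
    iostat -xne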
On 04/12/12 09:00 AM, Jim Klimov wrote:

> 2012-04-11 23:55, Ian Collins wrote:
>
>> Odd. The pool is a single iSCSI volume exported from a 7320 and
>> there is 18TB free.
>
> Lame question: is that 18TB free on the pool inside the iSCSI volume,
> or on the backing pool on the 7320?
>
> I mean that as far as the "external" pool is concerned, the zvol's
> blocks are allocated - even if the "internal" pool considers them
> deleted but did not zero them out and/or TRIM them explicitly.
>
> Thus there may be lags due to fragmentation on the backing "external"
> pool (physical on the 7320), especially if it is not very free and/or
> its free space is already too heavily fragmented into many small
> "bubbles".

I'll check, but I see the same effect with local replications as well.

-- Ian.
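Comparing the two views of free space would settle that; something like
the following on the 7320 (backing) side, with illustrative names:

    # how much of the zvol backing the iSCSI LUN is actually allocated
    zfs get volsize,used,referenced pool/lun-zvol

    # free space on the backing (physical) pool itself
    zpool list pool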
On 04/12/12 09:51 AM, Peter Jeremy wrote:

> On 2012-Apr-11 18:34:42 +1000, Ian Collins <ian at ianshome.com> wrote:
>
>> I use an application with a fairly large receive data buffer (256MB)
>> to replicate data between sites.
>>
>> I have noticed the buffer becoming completely full when receiving
>> snapshots for some filesystems, even over a slow (~2MB/sec) WAN
>> connection. I assume this is due to the changes being widely
>> scattered.
>
> As Richard pointed out, the write side should be mostly contiguous.
>
>> Is there any way to improve this situation?
>
> Is the target pool nearly full (so ZFS is spending lots of time
> searching for free space)?
>
> Do you have dedupe enabled on the target pool? This would force ZFS
> to search the DDT to write blocks - this will be expensive,
> especially if you don't have enough RAM.
>
> Do you have a high compression level (gzip or gzip-N) on the target
> filesystems, without enough CPU horsepower?
>
> Do you have a dying (or dead) disk in the target pool?

No to all of the above!

-- Ian.