ZFS send is very slow.

The dmu_sendbackup function traverses the dataset in a single thread, and in
the traverse callback (backup_cb) we wait for data in arc_read called with
the ARC_WAIT flag.

I want to parallelize zfs send to make it faster. dmu_sendbackup could
allocate a buffer to be used for buffering output. A few threads could
traverse the dataset, and a few more could handle async read operations.

I think this could speed up the zfs send operation 10x.

What do you think about it?
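A rough user-space sketch of the idea (toy code with made-up names and sizes,
not the actual dmu_sendbackup changes): a few reader threads fill a small
bounded buffer in parallel, and a single writer drains it, so the writer never
sits in a synchronous read the way backup_cb does today.

/*
 * Toy illustration only -- NOT actual ZFS code.  Several reader threads
 * fetch blocks in parallel and hand them to one writer thread through a
 * small bounded buffer.  All names and sizes are made up.
 */
#include <pthread.h>
#include <string.h>
#include <unistd.h>

#define	BLKSZ		(128 * 1024)	/* pretend record payload size */
#define	NREADERS	4		/* parallel "traverse/read" threads */
#define	NBLOCKS		64		/* blocks in this toy dataset */
#define	QDEPTH		8		/* bounded output buffer depth */

struct item { int id; char data[BLKSZ]; };

static struct item queue[QDEPTH];
static int qhead, qcount;		/* circular buffer state */
static int next_id;			/* next block to "read" */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t notfull = PTHREAD_COND_INITIALIZER;
static pthread_cond_t notempty = PTHREAD_COND_INITIALIZER;

static void *
reader(void *arg)
{
	struct item it;

	(void) arg;
	for (;;) {
		pthread_mutex_lock(&lock);
		if (next_id >= NBLOCKS) {	/* dataset fully traversed */
			pthread_mutex_unlock(&lock);
			return (NULL);
		}
		it.id = next_id++;
		pthread_mutex_unlock(&lock);

		/* Stand-in for the block read; done outside the lock. */
		memset(it.data, it.id & 0xff, BLKSZ);

		pthread_mutex_lock(&lock);
		while (qcount == QDEPTH)	/* output buffer full */
			pthread_cond_wait(&notfull, &lock);
		queue[(qhead + qcount) % QDEPTH] = it;
		qcount++;
		pthread_cond_signal(&notempty);
		pthread_mutex_unlock(&lock);
	}
}

int
main(void)
{
	pthread_t tid[NREADERS];
	struct item it;
	int i, written;

	for (i = 0; i < NREADERS; i++)
		pthread_create(&tid[i], NULL, reader, NULL);

	/* Single writer drains the buffer in order of arrival. */
	for (written = 0; written < NBLOCKS; written++) {
		pthread_mutex_lock(&lock);
		while (qcount == 0)		/* wait for a buffered block */
			pthread_cond_wait(&notempty, &lock);
		it = queue[qhead];
		qhead = (qhead + 1) % QDEPTH;
		qcount--;
		pthread_cond_signal(&notfull);
		pthread_mutex_unlock(&lock);

		(void) write(STDOUT_FILENO, it.data, BLKSZ);
	}

	for (i = 0; i < NREADERS; i++)
		pthread_join(tid[i], NULL);
	return (0);
}

The same decoupling inside dmu_sendbackup would let the reads overlap instead
of being issued one at a time with ARC_WAIT.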
Hello Łukasz,

Monday, July 23, 2007, 1:19:16 PM, you wrote:

> ZFS send is very slow.
>
> The dmu_sendbackup function traverses the dataset in a single thread, and
> in the traverse callback (backup_cb) we wait for data in arc_read called
> with the ARC_WAIT flag.
>
> I want to parallelize zfs send to make it faster. dmu_sendbackup could
> allocate a buffer to be used for buffering output. A few threads could
> traverse the dataset, and a few more could handle async read operations.
>
> I think this could speed up the zfs send operation 10x.
>
> What do you think about it?

I guess you should check with Matthew Ahrens as IIRC he's working on
'zfs send -r' and possibly some other improvements to zfs send. The
question is what code changes Matthew has done so far (it hasn't been
integrated AFAIK) and possibly work from there. Or perhaps Matthew is
already working on it also...

Now, if the pool resides on lots of disks then I guess this should speed up
zfs send considerably, at least in some cases (lots of small files,
written/deleted/created randomly). It would be great if you could implement
something and share some results with us, to see if there is actually a
performance gain.

Also, I guess you'll have to write all transactions to the other end
(zfs recv) in the same order they were created on disk, or not?

ps. Lukasz - nice to see you here more and more :)

--
Best regards,
Robert                       mailto:rmilkowski at task.gda.pl
                             http://milek.blogspot.com
Robert Milkowski wrote:
> Hello Łukasz,
>
> Monday, July 23, 2007, 1:19:16 PM, you wrote:
>
>> ZFS send is very slow.
>>
>> The dmu_sendbackup function traverses the dataset in a single thread, and
>> in the traverse callback (backup_cb) we wait for data in arc_read called
>> with the ARC_WAIT flag.

That's correct.

>> I want to parallelize zfs send to make it faster. dmu_sendbackup could
>> allocate a buffer to be used for buffering output. A few threads could
>> traverse the dataset, and a few more could handle async read operations.
>>
>> I think this could speed up the zfs send operation 10x.
>>
>> What do you think about it?

You're right that we need to issue more I/Os in parallel -- see 6333409
"traversal code should be able to issue multiple reads in parallel".

However, it may be much more straightforward to just issue prefetches
appropriately, rather than attempt to coordinate multiple threads. That
said, feel free to experiment.

> I guess you should check with Matthew Ahrens as IIRC he's working on
> 'zfs send -r' and possibly some other improvements to zfs send. The
> question is what code changes Matthew has done so far (it hasn't been
> integrated AFAIK) and possibly work from there. Or perhaps Matthew is
> already working on it also...

Unfortunately I am not working on this bug as part of my "zfs send -r"
changes. But I plan to work on it (unless you get to it first!) later this
year as part of the pool space reduction changes.

> Also, I guess you'll have to write all transactions to the other end
> (zfs recv) in the same order they were created on disk, or not?

Nope, that's (one of) the beauties of zfs send.

--matt
>>> I want to parallelize zfs send to make it faster. dmu_sendbackup could
>>> allocate a buffer to be used for buffering output. A few threads could
>>> traverse the dataset, and a few more could handle async read operations.
>>>
>>> I think this could speed up the zfs send operation 10x.
>>>
>>> What do you think about it?
>
> You're right that we need to issue more I/Os in parallel -- see 6333409
> "traversal code should be able to issue multiple reads in parallel".

When do you think it will be available?

> However, it may be much more straightforward to just issue prefetches
> appropriately, rather than attempt to coordinate multiple threads. That
> said, feel free to experiment.

How can I prefetch data? Traverse the dataset in a second thread?

Correct me if I'm wrong: adding simple buffering could speed up the send
operation. Right now we call the vn_rdwr() function for every record.

What do you think about a smaller dmu_replay_record_t struct? Remove

    char drr_toname[MAXNAMELEN];

from the drr_begin struct, and for the DRR_BEGIN command read/write the
MAXNAMELEN bytes separately.
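Something along these lines -- a user-space analogue only, with made-up names;
in the kernel the flush would go through the existing vn_rdwr() call:

/*
 * Minimal sketch of the buffering idea, NOT actual dmu_sendbackup() code:
 * instead of one write call per replay record, records are copied into a
 * staging buffer and flushed in large chunks.
 */
#include <string.h>
#include <unistd.h>

#define	SENDBUF_SIZE	(1024 * 1024)	/* made-up staging buffer size */

static char sendbuf[SENDBUF_SIZE];
static size_t sendbuf_used;

static void
send_flush(int fd)
{
	if (sendbuf_used > 0) {
		/* In the kernel this would be the single vn_rdwr() call. */
		(void) write(fd, sendbuf, sendbuf_used);
		sendbuf_used = 0;
	}
}

/* Queue one record; only issue a real write when the buffer fills up. */
static void
send_record(int fd, const void *rec, size_t len)
{
	if (len >= SENDBUF_SIZE) {	/* oversized payload: write through */
		send_flush(fd);
		(void) write(fd, rec, len);
		return;
	}
	if (sendbuf_used + len > SENDBUF_SIZE)
		send_flush(fd);
	memcpy(sendbuf + sendbuf_used, rec, len);
	sendbuf_used += len;
}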
Łukasz wrote:
>> You're right that we need to issue more I/Os in parallel -- see 6333409
>> "traversal code should be able to issue multiple reads in parallel".
>
> When do you think it will be available?

Perhaps by the end of the calendar year, but perhaps longer. Maybe sooner if
you work on it :-)

>> However, it may be much more straightforward to just issue prefetches
>> appropriately, rather than attempt to coordinate multiple threads. That
>> said, feel free to experiment.
>
> How can I prefetch data? Traverse the dataset in a second thread?

No; see dmu_prefetch().

> Correct me if I'm wrong: adding simple buffering could speed up the send
> operation. Right now we call the vn_rdwr() function for every record.

Perhaps; try timing with "zfs send ... > /dev/null". However much faster that
is than sending it to your preferred location is the maximum amount of
performance to be gained.

> What do you think about a smaller dmu_replay_record_t struct? Remove
>
>     char drr_toname[MAXNAMELEN];
>
> from the drr_begin struct, and for the DRR_BEGIN command read/write the
> MAXNAMELEN bytes separately.

Yeah, that would be nice. But it would sure be nice to be able to still read
the old-style records too.

--matt
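A hypothetical fragment of what issuing a prefetch from the traverse callback
could look like -- the helper below is made up for illustration, and the
four-argument dmu_prefetch() prototype should be checked against dmu.h in
your tree:

/*
 * Hypothetical fragment, not the actual backup_cb() code: before blocking
 * on the current block, ask the DMU to start reading the next few blocks
 * of the object asynchronously.
 */
#include <sys/dmu.h>

static void
send_prefetch_ahead(objset_t *os, uint64_t object, uint64_t blkid,
    uint64_t blksz, uint64_t nblks)
{
	/* Kick off async reads for the blocks we will need next. */
	dmu_prefetch(os, object, (blkid + 1) * blksz, nblks * blksz);
}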
Łukasz K wrote:
> Hello Matthew,
>
> I have problems with pool fragmentation.
> http://www.opensolaris.org/jive/thread.jspa?threadID=34810
>
> Now I want to speed up zfs send, because our pool space maps are
> huge - after sending, the space maps will be smaller (from 1GB -> 50MB).
>
> As I understand it, there will not be anything like defragmentation,

We will be implementing defragmentation with the device removal feature,
perhaps by the end of the calendar year.

> so I need to live with this. But some changes could help:
> 1. Auto-tune the recordsize: when the pool is out of 128kB blocks, it
>    should use smaller ones.
> 2. We should be more careful with unloading space maps. I have enough
>    RAM to keep the metaslabs in memory.
>
> Can you help me with changing these algorithms?

See where metaslab_sync_done() calls space_map_unload().

--matt
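A hypothetical sketch of that kind of change -- not the actual metaslab.c
code, and the tunable name is made up -- would be to gate the unload behind
a switch:

/*
 * Hypothetical sketch only: keep space maps resident by skipping the
 * space_map_unload() that metaslab_sync_done() otherwise performs,
 * controlled by a made-up tunable.
 */
#include <sys/space_map.h>

int metaslab_keep_maps_loaded = 0;	/* hypothetical tunable */

static void
metaslab_maybe_unload(space_map_t *sm)
{
	/* Leave the map in core if the operator asked us to. */
	if (!metaslab_keep_maps_loaded)
		space_map_unload(sm);
}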