thr3ads.net - zfs discuss - [zfs-discuss] RFE: creating multiple clones in one zfs(1) call and one txg [Mar 2009]

Alec Muffett

2009-Mar-27 14:39 UTC

[zfs-discuss] RFE: creating multiple clones in one zfs(1) call and one txg

Hi,

The inability to create more than 1 clone at a time (ie: in separate  
TXGs) is something which has hampered me (and several projects on  
which I have worked) for some years, now.

Specifically I am looking at various forms of diskless grid/cloud  
environments where you create a "golden image", snapshot it, and then
clone that snapshot perhaps 1000 times for 1000 machines... poking the  
image slightly each time, and letting DHCP pick up the administrative  
slack of systems management.


To create 1000 clones in this fashion (for i in `range 1 1000` ; do
zfs clone [...] ; done) may take well over 1 hour, because 1000 TXGs
need to be set up and committed.

I would like to create 1000 clones in rather less than a couple of  
minutes.


I've kicked this idea around with Darren Moffat and he informs me that
it is painful to achieve the obvious solution:

	zfs clone tank/src@version tank/foo tank/bar tank/baz ...

...because to explicitly specify each (of multiple) clone names on the  
cmdline causes multiple calls to ioctl() and therefore multiple TXGs,  
and thus slowness.

Thus we hit on this proposal, for your consideration:



	zfs multiclone tank/fs@1 tank/<PATTERN> <BEGIN> <END> [STRIDE]

- implement limited but comprehensive snprintf semantics for PATTERN

- support: %d, %3d, %03d, FOO%d, FOO%dBAR, FOO%#08xBAR

- includes decimal, hex, octal

- BEGIN, END, STRIDE all decimal positive integers

- STRIDE optional, defaults to 1

- creation begins at BEGIN, increments by STRIDE, continues until END
   is exceeded


Examples:

- zfs multiclone tank/shark@1 tank/fish%02d 0 7 2

- creates: /tank/fish00 /tank/fish02 /tank/fish04 /tank/fish06

- zfs multiclone tank/gold-image tank/diskless/node%d.root 1 100

- ...is pretty obvious. (.../node1.root/ etc)
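
The PATTERN/BEGIN/END/STRIDE rules above can be approximated in userland with printf(1). What follows is a minimal sketch, assuming a POSIX shell; the function name expand_pattern is hypothetical, and it only generates the names. It does not call zfs and does nothing about the one-TXG requirement:

```shell
# expand_pattern PATTERN BEGIN END [STRIDE]
# Prints one name per line. PATTERN is used as a printf(1) format holding
# a single integer conversion (for example %d, %03d or %#08x).
# Hypothetical helper for illustration; it is not part of zfs(1).
expand_pattern() {
    ep_pattern=$1
    ep_i=$2
    ep_end=$3
    ep_stride=${4:-1}           # STRIDE is optional and defaults to 1
    while [ "$ep_i" -le "$ep_end" ]; do
        printf "${ep_pattern}\n" "$ep_i"
        ep_i=$((ep_i + ep_stride))
    done
}

expand_pattern 'tank/fish%02d' 0 7 2
```

The final line prints tank/fish00, tank/fish02, tank/fish04 and tank/fish06, one per line, matching the first example.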


What do you think?

	- alec

Mark J Musante

2009-Mar-27 14:49 UTC

On Fri, 27 Mar 2009, Alec Muffett wrote:
> The inability to create more than 1 clone at a time (ie: in separate 
> TXGs) is something which has hampered me (and several projects on which 
> I have worked) for some years, now.
Hi Alec,

Does CR 6475257 cover what you're looking for?


Regards,
markm

Alec Muffett

2009-Mar-27 14:53 UTC

I would like to apologise to those reading via the forums, because I  
used BNF anglebrackets and even though I sent a plaintext message, it  
lost my text as "HTML"...

	zfs multiclone tank/fs@1 tank/PATTERN BEGIN END [STRIDE]

	-a

Darren J Moffat

2009-Mar-27 14:54 UTC

Mark J Musante wrote:
> On Fri, 27 Mar 2009, Alec Muffett wrote:
> 
>> The inability to create more than 1 clone at a time (ie: in separate 
>> TXGs) is something which has hampered me (and several projects on 
>> which I have worked) for some years, now.
> 
> Hi Alec,
> 
> Does CR 6475257 cover what you're looking for?
It was the same Alec that logged 6475257 back in 2006.

--
Darren J Moffat

Richard Elling

2009-Mar-27 15:04 UTC

Alec Muffett wrote:
> I would like to apologise to those reading via the forums, because I
> used BNF anglebrackets and even though I sent a plaintext message, it
> lost my text as "HTML"...
>
>     zfs multiclone tank/fs@1 tank/PATTERN BEGIN END [STRIDE]
So much for an easy-to-use CLI :-O
How about feeding in a file containing names instead (qv fmthard -s)?
 -- richard

Alec Muffett

2009-Mar-27 15:25 UTC

> So much for an easy-to-use CLI :-O
> How about feeding in a file containing names instead (qv fmthard -s)?
Not terribly script-friendly; I suffered that sort of thing with  
zonecfg and zoneadm (create a controlfile and squirt it into another  
command) and deemed it a horrible hack.  They are still too broken for  
me to face using in their raw state except on special occasions...

Same reason I never use fgrep, it's just not the true Unix way.

If it *was* the true Unix way, you would never have written something  
like:

	rm `cat files-to-delete.txt`

...or anything else in backticks in your life; nor would you ever have  
used xargs... :-)

	-a

Darren J Moffat

2009-Mar-27 15:33 UTC

Alec Muffett wrote:
>> So much for an easy-to-use CLI :-O
>> How about feeding in a file containing names instead (qv fmthard -s)?
> 
> Not terribly script-friendly; I suffered that sort of thing with zonecfg 
> and zoneadm (create a controlfile and squirt it into another command) 
> and deemed it a horrible hack.  They are still too broken for me to face 
> using in their raw state except on special occasions...
Also it is unfriendly to the way the zfs ioctl code works today because 
that would mean passing every single filename down to the kernel over a 
single ioctl call.

The reason for the "pattern" based filenames is because:
	a) that is probably what is wanted most of the time anyway
	b) it is easy to pass from userland to kernel - you pass the
	   rules (after some userland sanity checking first) as is.

--
Darren J Moffat

Alec Muffett

2009-Mar-27 15:36 UTC

> The reason for the "pattern" based filenames is because:
> 	a) that is probably what is wanted most of the time anyway
> 	b) it is easy to pass from userland to kernel - you pass the
> 	   rules (after some userland sanity checking first) as is.



Just to quote what I wrote back in 2006 (ahem) which would *also* fit  
the "single ioctl()" model:

>> 2) zfs clone -C N <snapshot> <filesystemXXXXXX>
>>
>> - ie: following the inspiration of mktemp(3c), when invoked with "-C"
>>   (count) then N clones are created, numbered 0 thru (N-1), where the
>>   counter number for each one will replace XXXXXX in the filesystem
>>   pattern, zero-padded left if needed.
>>
>>   For example:
>>
>>       zfs clone -C 1000 pool/ xxxxx at xxxxx  pool/fishXXXXXX
>>
>>    ...yields clones named pool/fish000000 through pool/fish000999.

...but I like the control of having a snprintf() pattern with
START/STOP/STEP, more.  It brings out the BASIC programmer in me...
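
The mktemp-style numbering quoted above can be sketched in the same way. This assumes a POSIX shell and a template that ends in exactly six X characters; expand_counted is a hypothetical name, and "zfs clone -C" itself was only ever a proposal:

```shell
# expand_counted COUNT TEMPLATE
# TEMPLATE is assumed to end in XXXXXX. Prints COUNT names numbered
# 0 .. COUNT-1, with the counter zero-padded to six digits in place of
# the XXXXXX placeholder.
# Hypothetical helper for illustration; it is not part of zfs(1).
expand_counted() {
    ec_count=$1
    ec_prefix=${2%XXXXXX}       # strip the trailing XXXXXX placeholder
    ec_i=0
    while [ "$ec_i" -lt "$ec_count" ]; do
        printf '%s%06d\n' "$ec_prefix" "$ec_i"
        ec_i=$((ec_i + 1))
    done
}

expand_counted 3 pool/fishXXXXXX
```

The final line prints pool/fish000000, pool/fish000001 and pool/fish000002; a count of 1000 ends at pool/fish000999, as in the quoted example.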

	- alec

Chris Kirby

2009-Mar-27 15:46 UTC

On Mar 27, 2009, at 10:33 AM, Darren J Moffat wrote:
> 	a) that is probably what is wanted most of the time anyway
> 	b) it is easy to pass from userland to kernel - you pass the
> 	   rules (after some userland sanity checking first) as is.

But doesn't that also exclude the possibility of creating non-pattern
based clones in a single txg?

While I think that allowing multiple clones to be created in a single
txg is perfectly reasonable, we shouldn't need to artificially restrict
the clone namespace in order to achieve that.

-Chris

Darren J Moffat

2009-Mar-27 15:58 UTC

Chris Kirby wrote:
> On Mar 27, 2009, at 10:33 AM, Darren J Moffat wrote:
>>     a) that is probably what is wanted most of the time anyway
>>     b) it is easy to pass from userland to kernel - you pass the
>>        rules (after some userland sanity checking first) as is.
> 
> 
> But doesn't that also exclude the possibility of creating non-pattern
> based clones in a single txg?
Yes it does.
> While I think that allowing multiple clones to be created in a single txg
> is perfectly reasonable, we shouldn't need to artificially restrict the
> clone namespace in order to achieve that.
Agreed, but other than pattern based I can't at the moment think of a
nice way to pass all the names over the /dev/zfs ioctl call while
maintaining the fact it is pretty much all fixed size.

I'm not saying passing a list of names over the ioctl is impossible,
more it just doesn't feel right to me at the moment - but I'm happy to
be convinced otherwise.  That way the patterning part can be left to the
shell.
--
Darren J Moffat

Miles Nordin

2009-Mar-27 18:52 UTC

>>>>> "djm" == Darren J Moffat <darrenm at opensolaris.org> writes:

djm> I'm not saying passing a list of names over the ioctl is
djm> impossible, more it just doesn't feel right to me at the
djm> moment - but I'm happy to be convinced otherwise.

I'm not sure I want to convince you otherwise. but here are two
attempts for considering:

1. When we were talking about 'zfs list' scalability and 'zfs destroy
   <snapshot>' scalability with thousands of filesystems there was
   also some discussion of time burned up making thousands of ioctls
   to accomplish one administrative action. Maybe the ioctl
   packing/unpacking/copyin overhead is part of the problem. Or maybe
   there is actually work to be done in each of those thousand ioctls
   so that combining them will be of no benefit, but making the
   in-kernel work more efficient would probably be easier if it were
   coalesced into fewer ioctls rather than many. something that can
   delete, insert, query multiple rows per operation like SQL, albeit
   JUST multirow, without a parsed text grammar and without stuff like
   'UPDATE' for supporting multiple writers, might *not* be overkill.

2. How is stuff like snapshot -r implemented atomically? Could a more
   complicated ioctl interface make -r more elegant rather than less?

Maybe the ultimate question isn't ``should we pass an asston of
stuff in one ioctl''. The questions are more like:

* given this will probably not be the last atomic-change
  needed---snapshot -r needed for consistency, this needed for
  speed, and who-knows-what next?---which do you find less
  maintainable:

  (a) transactional ioctl interface, where you call ioctl
      BEGINTRANSACTION, ioctl DOSTUFF ioctl DOSTUFF ioctl DOSTUFF,
      ioctl COMMIT

  (b) big ioctl interface where you express everything you want
      done at once in one possibly-complicated large structured
      ioctl blob and return success/fail on the whole blob

  the (b) seems to be more in line with this nvlist hairy stuff
  infesting solaris everywhere so maybe that's better?

* what would you find simpler / better-respecting kernel-userland
  boundary?

  (a) passing instructions in some rather ugly interpreted bytecode
      language, ``0xC0 means RANGE opcode, arguments to follow in
      8-bit registers,'' full of .h macros and lots of structs in
      unions. kernel executes bytecode to expand full argument
      list, then dostuff. (your current favored proposal)

  (b) expand in userland in C or in bash, rather than in kernel
      crappy-switch()-based-bytecode-interpreter, simply pass the
      full argument list in the ioctl, copyin, dostuff.

I'm reminded of printers that kept advertising increasingly
complicated page-description languages because the parallel port
and LocalTalk were so slow that publishers wanted to express
their pages in as few bits as possible.

now, this is not the only reason said thing happened with
printers. The printer became a hardware dongle enforcing your
``authorized'' use of the fonts, which were encrypted in an
attempt to bind them to $complicatedlanguage---they never got
this far, but if it still existed would probably be enforcing
rules like ``you have to pay for the Professional version of the
font if you want to print more than two consecutive capital
letters. The Home font is only for normal home correspondence so
consecutive capitals will be downcased automatically.'' But the
former slow-interface-port reasons are how it was pitched, how
the architecture was justified.

People bought the argument. Printers grew hard drives to cache
fonts and reusable ``preamble'' libraries written in
$complicatedlanguage, printer CPUs and RAMs bigger than the
computers driving them, and hidden cost of $complicatedlanguage
almost exceeded that of the publishing package driving it.
Sometimes you could see a page on the screen, but it was too
``complicated'' to print---solution: buy a bigger printer!
WYSIWYG broke since publishing package had to reimplement
$complicatedlanguage on screen, badly. merchants of
$complicatedlanguage, who are now a hegemonic monopoly, sneakily
trickle-sold heavily-DRM'd brokeass versions of
$complicatedlanguage for linking with the publishing packages to
make WYSIWYG work again like it used to, and these blobs even
made it into Solaris. all because of a fucking parallel port.

Eventually it was deemed mistaken and the whole damn tower
collapsed. $complicatedlanguage in solaris bitrotted. We
invented faster interfaces and stopped overcharging for them
(USB, Ethernet), and extremely simple page description languages:
modern laser printers get a single JBIG image of the page,
pre-dithered, and do not even understand what a glyph is. I am
not even sure how they decompress the JBIG, if they even have
enough video RAM to buffer an entire page or if they uncompress
it while the mirror is spinning. awesome, awesome. we keep
$complicatedlanguage around as an open-source emulator for those
who still need it. OS designers stood up for themselves and
invented their own font formats and font storage pools, which
they now hegemonically enforce font formats onto the font vendors
rather than the other way around. wysiwyg works, no more font
DRM, install fonts 1x, every page is printable, and at rated
speed, regardless of ``complexity'', no more hard drives inside
printers. hallelujah.

so my analogous argument is for (b), not to make the
kernel/userland boundary into a new virtual parallel port just
because it feels right to squeeze it into a straw. (though i
just contradicted reason (1).) At least just to make any
tinylanguages invented at interfaces extremelytiny and expressive
so the invention of the language makes less code in the overall
system even after implementing its interpreter, not to invent
boorish EE-inspired ``packing'' languages based on opcodes that
make lots of code for less data.

Carson Gaspar

2009-Mar-27 22:26 UTC

Darren J Moffat wrote:
...
> Agreed, but other than pattern based I can't at the moment think of a
> nice way to pass all the names over the /dev/zfs ioctl call while
> maintaining the fact it is pretty much all fixed size.
>
> I'm not saying passing a list of names over the ioctl is impossible,
> more it just doesn't feel right to me at the moment - but I'm happy to
> be convinced otherwise.  That way the patterning part can be left to the
> shell.
OK, while I have played a developer upon occasion, I've never touched
kernel code. So feel free to tell me I'm on crack.

What is so difficult about passing a pointer to memory as an argument in 
the ioctl? The kernel certainly has easy access to user-space pages. And 
parsing a list of text strings is neither complicated, nor dangerous. 
And as long as you never touch the memory after returning from ioctl(), 
no memory allocation ownership issues.

In short, what am I missing here? This ioctl() limit seems much ado 
about nothing to me...

-- 
Carson

Jeff Bonwick

2009-Mar-29 22:19 UTC

I agree with Chris -- I'd much rather do something like:

	zfs clone snap1 clone1 snap2 clone2 snap3 clone3 ...

than introduce a pattern grammar.  Supporting multiple snap/clone pairs
on the command line allows you to do just about anything atomically.
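
With the pairs syntax, cloning one snapshot many times would mean repeating the snapshot name before each clone name, which the shell can generate. The following is a sketch only, assuming the multi-pair form of zfs clone proposed here (it is a proposal in this thread, not an existing option); clone_pairs, tank/gold@v1 and tank/node are hypothetical names, and the command is printed rather than run:

```shell
# clone_pairs SNAPSHOT PREFIX COUNT
# Prints "SNAPSHOT PREFIX1 SNAPSHOT PREFIX2 ..." on a single line, i.e. the
# snap/clone pairs for COUNT clones of one snapshot.
# Hypothetical helper for illustration; it is not part of zfs(1).
clone_pairs() {
    cp_snap=$1
    cp_prefix=$2
    cp_count=$3
    cp_i=1
    cp_args=
    while [ "$cp_i" -le "$cp_count" ]; do
        cp_args="$cp_args $cp_snap $cp_prefix$cp_i"
        cp_i=$((cp_i + 1))
    done
    printf '%s\n' "${cp_args# }"    # drop the leading separator space
}

# Dry run: show the command line instead of executing it.
# The substitution is left unquoted on purpose so it splits into words.
echo zfs clone $(clone_pairs tank/gold@v1 tank/node 3)
```

The dry run prints: zfs clone tank/gold@v1 tank/node1 tank/gold@v1 tank/node2 tank/gold@v1 tank/node3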

Jeff

On Fri, Mar 27, 2009 at 10:46:33AM -0500, Chris Kirby wrote:
> On Mar 27, 2009, at 10:33 AM, Darren J Moffat wrote:
> >	a) that is probably what is wanted most of the time anyway
> >	b) it is easy to pass from userland to kernel - you pass the
> >	   rules (after some userland sanity checking first) as is.
> 
> 
> But doesn't that also exclude the possibility of creating non-pattern
> based clones in a single txg?
> 
> While I think that allowing multiple clones to be created in a single
> txg is perfectly reasonable, we shouldn't need to artificially restrict
> the clone namespace in order to achieve that.
> 
> -Chris
> 

Alec Muffett

2009-Mar-29 22:41 UTC

On 29 Mar 2009, at 23:19, Jeff Bonwick wrote:
> I agree with Chris -- I'd much rather do something like:
>
> 	zfs clone snap1 clone1 snap2 clone2 snap3 clone3 ...
>
> than introduce a pattern grammar.  Supporting multiple snap/clone
> pairs on the command line allows you to do just about anything
> atomically.

Can you elucidate how this will help me take a single snap and clone  
it 1000 times, quickly and with minimum fuss?

	-a

Darren J Moffat

2009-Mar-30 08:57 UTC

Carson Gaspar wrote:
> Darren J Moffat wrote:
> ...
>> Agreed, but other than pattern based I can't at the moment think of a
>> nice way to pass all the names over the /dev/zfs ioctl call while
>> maintaining the fact it is pretty much all fixed size.
>>
>> I'm not saying passing a list of names over the ioctl is impossible,
>> more it just doesn't feel right to me at the moment - but I'm happy to
>> be convinced otherwise.  That way the patterning part can be left to
>> the shell.
> 
> OK, while I have played a developer upon occasion, I've never touched
> kernel code. So feel free to tell me I'm on crack.
> 
> What is so difficult about passing a pointer to memory as an argument in 
> the ioctl? The kernel certainly has easy access to user-space pages. And 
> parsing a list of text strings is neither complicated, nor dangerous. 
> And as long as you never touch the memory after returning from ioctl(), 
> no memory allocation ownership issues.
> 
> In short, what am I missing here? This ioctl() limit seems much ado 
> about nothing to me...
You aren't missing anything, it could certainly be done.  I was just
trying to see what was possible without too much change from how the
ioctl calls on /dev/zfs work today.  I was just being very conservative
with respect to change.

If Jeff (as he indicated in another email) is happy with a non-pattern
method and what that means for how this is passed over the ioctl then so 
am I.

--
Darren J Moffat
