With reference to Lori's blog posting [1] I'd like to throw out a few of
my thoughts on splitting up the namespace.

This is quite timely because only yesterday, when I was updating the ZFS
crypto document, I was thinking about this. I knew I needed ephemeral
key support for ZVOLs so we could swap on an encrypted ZVOL. However, I
chose not to make that option specific to ZVOLs but made it available to
all datasets. The rationale for this was having directories like
/var/tmp as separate encrypted datasets with an ephemeral key.

So yes, Lori, I completely agree /var should be a separate dataset; what's
more, I think we can identify certain points of the /var namespace that
should almost always be a separate dataset.

Other than /var/tmp, my short list for being separate ZFS datasets are:

/var/crash - because it can be big and we might want quotas.
/var/core  [ which we don't yet have by default but I'm considering
             submitting an ARC case for this. ] - as above.
/var/tm    Similar to the /var/log rationale.

There are obvious other places that would really benefit, but I think
having them as separate datasets really depends on what the machine is
doing. For example /var/apache if you really are a webserver, but then
why not go one better and split out cgi-bin and htdocs into separate
datasets too - that way you can set noexec on htdocs.

I think we have lots of options, but it might be nice to come up with a
short list of special/important directories that we should always
recommend be separate datasets - let's not hardcode that into the
installer though (heck, we still think /usr/openwin is special !)

One of the things I'm really interested in seeing is more appropriate
sharing with Zones, because we have more flexibility in the installer as
it becomes zone aware. What I'd love to see is that we completely
abandon the package based boundaries for Zones and instead use one based
only on the actual filesystem namespace, and use Zones to get the best
out of that.

A nitpick on the terminology: while I agree that some QoS things can be
set at the level of a dataset, there are others which are really only
available to the pool, though now with ditto blocks for data as well as
metadata that starts to blur a bit too.

[1] http://blogs.sun.com/lalt/entry/zfs_boot_issue_of_the

--
Darren J Moffat
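As a rough sketch of the kind of per-directory policy described above
(the pool name "rpool", the dataset names, and the quota size are all
assumptions chosen purely for illustration), the commands would look
something like:

  # a crash-dump area that can grow, capped with a quota
  zfs create -o mountpoint=/var/crash -o quota=2g rpool/var-crash

  # a web document root on which nothing should be executable
  zfs create -o mountpoint=/var/apache/htdocs -o exec=off rpool/htdocs

quota and exec are standard ZFS dataset properties; the layout above is
only an example of how the policy could be expressed.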
On Tue, 24 Apr 2007, Darren J Moffat wrote:

> There are obvious other places that would really benefit, but I think
> having them as separate datasets really depends on what the machine is
> doing. For example /var/apache if you really are a webserver, but then
> why not go one better and split out cgi-bin and htdocs into separate
> datasets too - that way you can set noexec on htdocs.

How specific do we want to get? I can see the benefit of splitting out
the various apache directories, but those decisions might be better made
by the appliance team. Creating a webserver would have different dataset
requirements from creating a NAS box, for example.

I believe we should stick to the most basic config for the default Solaris
installer. Certainly it should allow the admin to create whatever
datasets might be desired, but we should keep it simple for the default
case.

I've heard arguments for /tmp and /var/tmp. Your point about /var/crash
is a good one. /opt and /usr have also been given good reasons. That's
six already, including root.

Regards,
markm
On Tue, Apr 24, 2007 at 09:48:33AM -0400, Mark J Musante wrote:

> I believe we should stick to the most basic config for the default Solaris
> installer. Certainly it should allow the admin to create whatever
> datasets might be desired, but we should keep it simple for the default
> case.
>
> I've heard arguments for /tmp and /var/tmp. Your point about /var/crash
> is a good one. /opt and /usr have also been given good reasons. That's
> six already, including root.

I think, for the sake of argument, we should limit the required split-off
datasets to the ones that are only related to zfsboot, to make zfsboot
easier and more flexible (I look forward to raidz(2) boot ability), and
not over-think this. I certainly don't think there shouldn't be a set of
recommended datasets (/var/crash being a good example), but above and
beyond that it should be completely up to the administrator how far they
want to go.

Just my $.02. ;)

-brian
--
"Perl can be fast and elegant as much as J2EE can be fast and elegant.
In the hands of a skilled artisan, it can and does happen; it's just
that most of the shit out there is built by people who'd be better
suited to making sure that my burger is cooked thoroughly."
  -- Jonathan Patschke
Hello Darren,

Tuesday, April 24, 2007, 3:33:47 PM, you wrote:

DJM> With reference to Lori's blog posting[1] I'd like to throw out a few of
DJM> my thoughts on splitting up the namespace.
DJM>
DJM> This is quite timely because only yesterday when I was updating the ZFS
DJM> crypto document I was thinking about this. I knew I needed ephemeral
DJM> key support for ZVOLs so we could swap on an encrypted ZVOL. However I
DJM> chose not to make that option specific to ZVOLs but made it available to
DJM> all datasets. The rationale for this was having directories like
DJM> /var/tmp as separate encrypted datasets with an ephemeral key.
DJM>
DJM> So yes Lori I completely agree /var should be a separate dataset, what's
DJM> more I think we can identify certain points of the /var namespace that
DJM> should almost always be a separate dataset.
DJM>
DJM> Other than /var/tmp my short list for being separate ZFS datasets are:
DJM>
DJM> /var/crash - because it can be big and we might want quotas.

I agree - I've been doing this for some time (/ on UFS, rest of a disk
on zfs for zones and crash + core file systems with quota set).

DJM> /var/core [ which we don't yet have by default but I'm considering
DJM> submitting an ARC case for this. ] - as above.

Definitely - we're doing this in a jumpstart, but frankly it should
have been the default for years (even without zfs).

DJM> /var/tm Similar to the /var/log rationale.
DJM>
DJM> There are obvious other places that would really benefit but I think
DJM> having them as separate datasets really depends on what the machine is
DJM> doing. For example /var/apache if you really are a webserver, but then
DJM> why not go one better and split out cgi-bin and htdocs into separate
DJM> datasets too - that way you can set noexec on htdocs.
DJM>
DJM> I think we have lots of options but it might be nice to come up with a
DJM> short list of special/important directories that we should always
DJM> recommend be separate datasets - let's not hardcode that into the
DJM> installer though (heck we still think /usr/openwin is special !)

Definitely. We could scare people with a dozen or more file systems
mounted after a fresh install on their laptop.

However, some time ago there was a discussion here on 'zfs split|merge'
functionality. Is it going to happen? If it does, then maybe only a
minimum number of datasets should be created by default (/ /var /opt)
and later the admin can just 'zfs split root/var/log'?

While having lots of datasets is really nice, please do not overuse it,
at least not in the default configs, where it would probably introduce
more confusion for most users than do any good.

I would also consider disabling or changing the default config for autofs
so local users would go to /home, as most people expect by default, and
then also create /home as a separate file system.

So my short list is:

/
/var
/opt
/home

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
Hello Robert,

Tuesday, April 24, 2007, 4:59:31 PM, you wrote:

RM> Hello Darren,
RM> Tuesday, April 24, 2007, 3:33:47 PM, you wrote:

DJM>> With reference to Lori's blog posting[1] I'd like to throw out a few of
DJM>> my thoughts on splitting up the namespace.
DJM>> This is quite timely because only yesterday when I was updating the ZFS
DJM>> crypto document I was thinking about this. I knew I needed ephemeral
DJM>> key support for ZVOLs so we could swap on an encrypted ZVOL. However I
DJM>> chose not to make that option specific to ZVOLs but made it available to
DJM>> all datasets. The rationale for this was having directories like
DJM>> /var/tmp as separate encrypted datasets with an ephemeral key.
DJM>> So yes Lori I completely agree /var should be a separate dataset, what's
DJM>> more I think we can identify certain points of the /var namespace that
DJM>> should almost always be a separate dataset.
DJM>> Other than /var/tmp my short list for being separate ZFS datasets are:
DJM>> /var/crash - because it can be big and we might want quotas.

RM> I agree - I've been doing this for some time (/ on UFS, rest of a disk
RM> on zfs for zones and crash + core file systems with quota set).

DJM>> /var/core [ which we don't yet have by default but I'm considering
DJM>> submitting an ARC case for this. ] - as above.

RM> Definitely - we're doing this in a jumpstart, but frankly it should
RM> have been the default for years (even without zfs).

DJM>> /var/tm Similar to the /var/log rationale.
DJM>> There are obvious other places that would really benefit but I think
DJM>> having them as separate datasets really depends on what the machine is
DJM>> doing. For example /var/apache if you really are a webserver, but then
DJM>> why not go one better and split out cgi-bin and htdocs into separate
DJM>> datasets too - that way you can set noexec on htdocs.
DJM>> I think we have lots of options but it might be nice to come up with a
DJM>> short list of special/important directories that we should always
DJM>> recommend be separate datasets - let's not hardcode that into the
DJM>> installer though (heck we still think /usr/openwin is special !)

RM> Definitely. We could scare people with a dozen or more file systems
RM> mounted after a fresh install on their laptop.

RM> However, some time ago there was a discussion here on 'zfs split|merge'
RM> functionality. Is it going to happen? If it does, then maybe only a
RM> minimum number of datasets should be created by default (/ /var /opt)
RM> and later the admin can just 'zfs split root/var/log'?

RM> While having lots of datasets is really nice, please do not overuse it,
RM> at least not in the default configs, where it would probably introduce
RM> more confusion for most users than do any good.

RM> I would also consider disabling or changing the default config for autofs
RM> so local users would go to /home, as most people expect by default, and
RM> then also create /home as a separate file system.

RM> So my short list is:

RM> /
RM> /var
RM> /opt
RM> /home

/var/crash
/var/core

I think configuring Solaris by default to write crashdumps and cores to
the above locations should be considered; however, I would rather not
create separate datasets for them by default.

--
Best regards,
 Robert                          mailto:rmilkowski at task.gda.pl
                                 http://milek.blogspot.com
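For reference, the knobs for pointing crash dumps and cores at those
locations already exist in Solaris; a minimal sketch, with the paths
taken from the list above and the core pattern chosen only as an
example:

  # savecore writes crash dumps under the named directory
  dumpadm -s /var/crash

  # global core dumps go to /var/core instead of each process's cwd
  coreadm -g /var/core/core.%f.%p -e global

Whether the directories behind these settings are separate datasets or
not is then an independent decision, which is Robert's point.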
We're also updating the EIS bootdisk standard, and are considering
similar recommendations. File systems are still not free. They have
costs in complexity and maintenance, especially backup/restore. One of
the benefits of a single namespace is that it is relatively simple to
back up and restore quickly. However, I don't want to get sidetracked by
the state of backup/restore today. One benefit of multiple file systems
is that you can apply different policies, so if we stick to discussing
policies (ok, including backup/restore policies) then we should be able
to arrive at a consensus relatively easily :-)

Darren J Moffat wrote:

> With reference to Lori's blog posting[1] I'd like to throw out a few of
> my thoughts on splitting up the namespace.
>
> This is quite timely because only yesterday when I was updating the ZFS
> crypto document I was thinking about this. I knew I needed ephemeral
> key support for ZVOLs so we could swap on an encrypted ZVOL. However I
> chose not to make that option specific to ZVOLs but made it available to
> all datasets. The rationale for this was having directories like
> /var/tmp as separate encrypted datasets with an ephemeral key.

cool

> So yes Lori I completely agree /var should be a separate dataset, what's
> more I think we can identify certain points of the /var namespace that
> should almost always be a separate dataset.
>
> Other than /var/tmp my short list for being separate ZFS datasets are:
>
> /var/crash - because it can be big and we might want quotas.

savecore already has a (sort of) quota implementation. I think the
policy driving this is backup/restore, not quota. I'd rather not spend a
bunch of time or tape backing up old cores.

> /var/core [ which we don't yet have by default but I'm considering
> submitting an ARC case for this. ] - as above.

ditto

> /var/tm Similar to the /var/log rationale.

[assuming /var/tmp]
It is not clear to me how people use /var/tmp. In other words, I'm
pretty sure that most people don't know /var/tmp exists, and those that
do know use it differently than I do. Perhaps the policy driving this
should be quota. methinks we need a table...

As Robert points out, life becomes so much easier if split/merge
existed :-)
 -- richard
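Presumably the "(sort of) quota" referred to here is savecore's minfree
mechanism, which can be set through dumpadm; a sketch, with the
percentage chosen only as an example:

  # keep at least 10% of the savecore file system free;
  # savecore will not write a dump that would go below that
  dumpadm -m 10%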
I left a comment on Lori's blog to the effect that splitting the
namespace would complicate LU tools.

Perhaps we need a zfs clone -r to match zfs snapshot -r?

Nico
--
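To make the gap concrete: zfs snapshot -r exists today, but the clones
still have to be made one at a time. A sketch, with dataset names
invented purely for illustration:

  # one consistent, recursive snapshot of the boot environment
  zfs snapshot -r rpool/ROOT@newbe

  # ...but each dataset must then be cloned individually
  zfs clone rpool/ROOT@newbe      rpool/ROOT2
  zfs clone rpool/ROOT/var@newbe  rpool/ROOT2/var
  zfs clone rpool/ROOT/opt@newbe  rpool/ROOT2/opt

A hypothetical "zfs clone -r" would collapse the last three commands
into one, which is what would make split-up namespaces friendlier to LU.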
Richard Elling wrote:

>> /var/tm Similar to the /var/log rationale.
>
> [assuming /var/tmp]

I intended to type /var/fm, not /var/tm or /var/tmp. The FMA state data
is, I believe, something that you would want to share between all boot
environments on a given bit of hardware, right?

--
Darren J Moffat
On 04/24/07 17:30, Darren J Moffat wrote:

> Richard Elling wrote:
>
>>> /var/tm Similar to the /var/log rationale.
>>
>> [assuming /var/tmp]
>
> I intended to type /var/fm, not /var/tm or /var/tmp. The FMA state data
> is, I believe, something that you would want to share between all boot
> environments on a given bit of hardware, right?

Yes, under normal production circumstances that is what you'd want. I
guess under some test circumstances you may want different state for
different BEs.

I'd also like to have compression turned on by default for /var/fm. It
will cost nothing in terms of cpu time, since additions to that tree are
at a very low rate and only small chunks of data at a time; but the
small chunks can add up on a system suffering solid errors if the
ereports are not throttled in some way, and they're eminently
compressible. There are a couple of CRs logged for this somewhere.

Gavin
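If /var/fm were its own dataset, enabling this would be a one-line
property change (dataset name below is an assumption), and compressratio
would show what it buys:

  zfs set compression=on rpool/var/fm
  zfs get compressratio  rpool/var/fm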
On 4/24/07, Darren J Moffat <Darren.Moffat at sun.com> wrote:

> With reference to Lori's blog posting[1] I'd like to throw out a few of
> my thoughts on splitting up the namespace.

Just a plea with my sysadmin hat on - please don't go overboard and make
new filesystems just because we can. Each extra filesystem generates
more work for the administrator, if only for the effort to parse df
output (which is more than cluttered enough already).

In other words, let people have a system with just one filesystem.

> I think we have lots of options but it might be nice to come up with a
> short list of special/important directories that we should always
> recommend be separate datasets -

If there is such a list, explain *why*, so that admins can make
informed choices.

Or maybe even restructure the filesystem layout so that directories
with common properties could live under a common parent that could
be a separate filesystem, rather than creating separate filesystems
for each?

> let's not hardcode that into the
> installer though (heck we still think /usr/openwin is special !)

Ugh, yes!

> One of the things I'm really interested in seeing is more appropriate
> sharing with Zones because we have more flexibility in the installer as
> it becomes zone aware. What I'd love to see is that we completely
> abandon the package based boundaries for Zones and instead use one based
> only on the actual filesystem namespace and use Zones to get the best
> out of that.

Agreed, zones based on packaging cause too much pain all round.

--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Peter Tribble wrote:

> In other words, let people have a system with just one filesystem.

I'm fine with that.

>> I think we have lots of options but it might be nice to come up with a
>> short list of special/important directories that we should always
>> recommend be separate datasets -
>
> If there is such a list, explain *why*, so that admins can make
> informed choices.
>
> Or maybe even restructure the filesystem layout so that directories
> with common properties could live under a common parent that could
> be a separate filesystem, rather than creating separate filesystems
> for each?

Hmn, we have that already: /usr - mostly readonly executables and
support; /var - pretty much everything here needs to be written to and
can grow; /etc - can change, but that should be rare.

--
Darren J Moffat
On 4/26/07, Darren J Moffat <Darren.Moffat at sun.com> wrote:

> > Or maybe even restructure the filesystem layout so that directories
> > with common properties could live under a common parent that could
> > be a separate filesystem, rather than creating separate filesystems
> > for each?
>
> Hmn, we have that already: /usr - mostly readonly executables and
> support; /var - pretty much everything here needs to be written to and
> can grow; /etc - can change, but that should be rare.

I should have said, but I was actually thinking of the /var/crash and
/var/core case - where the requirement (for quotas, maybe) and the
essential function is the same. A little bit of restructuring and we
could have 1 dataset instead of 2.

--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Peter Tribble wrote:

> On 4/24/07, Darren J Moffat <Darren.Moffat at sun.com> wrote:
>> With reference to Lori's blog posting[1] I'd like to throw out a few of
>> my thoughts on splitting up the namespace.
>
> Just a plea with my sysadmin hat on - please don't go overboard
> and make new filesystems just because we can. Each extra
> filesystem generates more work for the administrator, if only
> for the effort to parse df output (which is more than cluttered enough
> already).

My first reaction to that is: yes, of course, extra file systems are extra
work. Don't require them, and don't even make them the default unless
they buy you a lot. But then I thought, no, let's challenge that a bit.

Why do administrators run 'df' commands? It's to find out how much space
is used or available in a single file system. That made sense when file
systems each had their own dedicated slice, but now it doesn't make that
much sense anymore. Unless you've assigned a quota to a zfs file system,
"space available" is meaningful more at the pool level. And if you DID
assign a quota to the file system, then you really did want that part of
the name space to be a separate, and separately manageable, file system.

With zfs, file systems are in many ways more like directories than what
we used to call file systems. They draw from pooled storage. They
have low overhead and are easy to create and destroy. File systems
are sort of like super-functional directories, with quality-of-service
control and cloning and snapshots. Many of the things that sysadmins
used to have to do with file systems just aren't necessary or even
meaningful anymore. And so maybe the additional work of managing
more file systems is actually a lot smaller than you might initially think.

In other words, think about ALL of the implications of using zfs,
not just some.

We've come up with a lot of good reasons for having multiple
file systems. So we know that there are benefits. We also know
that there are costs. But if we can figure out a way to keep the
costs low, the benefits might outweigh them.

> In other words, let people have a system with just one filesystem.

I think I can agree with this, but I'm not absolutely certain. On the
one hand, sure, more freedom is better. But I'm concerned that
our long-term install and upgrade strategies might be constrained
by having to support configurations that haven't been set up with
the granularity needed for some kinds of valuable storage management
features.

This conversation is great! I'm getting lots of good information
and I *really* want to figure out what's best, even if it challenges
some of my cherished notions.

Lori
> Peter Tribble wrote:
> > On 4/24/07, Darren J Moffat <Darren.Moffat at sun.com> wrote:
> >> With reference to Lori's blog posting[1] I'd like to throw out a few of
> >> my thoughts on splitting up the namespace.
> >
> > Just a plea with my sysadmin hat on - please don't go overboard
> > and make new filesystems just because we can. Each extra
> > filesystem generates more work for the administrator, if only
> > for the effort to parse df output (which is more than cluttered enough
> > already).
>
> My first reaction to that is: yes, of course, extra file systems are extra
> work. Don't require them, and don't even make them the default unless
> they buy you a lot. But then I thought, no, let's challenge that a bit.
>
> Why do administrators run 'df' commands? It's to find out how much space
> is used or available in a single file system. That made sense when file
> systems each had their own dedicated slice, but now it doesn't make that
> much sense anymore. Unless you've assigned a quota to a zfs file system,
> "space available" is meaningful more at the pool level. And if you DID
> assign a quota to the file system, then you really did want that part of
> the name space to be a separate, and separately manageable, file system.

I'd like to put my sysadmin hat on and add to this:

Yes, if you start adding quotas, etc. you'll have to start looking at
doing df's again, but this is actually easier with zfs (zfs list). Now I
can see, very easily, where my space is being allocated and start diving
in from there, instead of the multiple "du -ks * | sort -n" recursive
rampages I do on one big filesystem.

Also, if I start using zfs and some of the other features (read only,
for example), I can start locking down some of these filesystems (/usr
perhaps???) so I no longer need to worry about the space being allocated
in /usr. Or setting reservations and quotas on file systems, basically
eliminating them from my constant monitoring and free-space shuffle of
"where did my space go".

> With zfs, file systems are in many ways more like directories than what
> we used to call file systems. They draw from pooled storage. They
> have low overhead and are easy to create and destroy. File systems
> are sort of like super-functional directories, with quality-of-service
> control and cloning and snapshots. Many of the things that sysadmins
> used to have to do with file systems just aren't necessary or even
> meaningful anymore. And so maybe the additional work of managing
> more file systems is actually a lot smaller than you might initially think.

I believe so. Just having zfs boot on my system for a couple of days and
breaking out the major food groups, I can easily see where my space is
at - again, zfs list is much faster than du -ks and I don't have to be
root for it to be 100% accurate - my postgres data files aren't owned by
me ;)

Another thing (I've mentioned this to Lori off alias) is the possible
ability to compress some file systems - again, possibly /usr and /opt???

Breaking out the namespace provides the flexibility of separate file
systems, and snapping/cloning/administering those as needed, with the
benefits of a single root file system - one disk and not having to get
the partition space right. But there is the matter of balance - too much
would be overkill. Perhaps the split and merge RFEs would bridge that
gap to provide yet more flexibility?

> In other words, think about ALL of the implications of using zfs,
> not just some.
> We've come up with a lot of good reasons for having multiple
> file systems. So we know that there are benefits. We also know
> that there are costs. But if we can figure out a way to keep the
> costs low, the benefits might outweigh them.
>
> > In other words, let people have a system with just one filesystem.
>
> I think I can agree with this, but I'm not absolutely certain. On the
> one hand, sure, more freedom is better. But I'm concerned that
> our long-term install and upgrade strategies might be constrained
> by having to support configurations that haven't been set up with
> the granularity needed for some kinds of valuable storage management
> features.
>
> This conversation is great! I'm getting lots of good information
> and I *really* want to figure out what's best, even if it challenges
> some of my cherished notions.
>
> Lori

This message posted from opensolaris.org
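A sketch of the per-filesystem policies mentioned in the reply above
(locked-down /usr, compressed /usr and /opt, reservation and quota on
/var); the dataset names and sizes are invented purely for illustration:

  # lock down /usr and squeeze /usr and /opt
  zfs set readonly=on    rpool/usr
  zfs set compression=on rpool/usr
  zfs set compression=on rpool/opt

  # guarantee space for /var but cap how large it can grow
  zfs set reservation=1g rpool/var
  zfs set quota=4g       rpool/var

readonly, compression, reservation, and quota are standard dataset
properties; whether settings like these belong in a default install is
exactly what is being debated in this thread.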
On 4/24/07, Darren J Moffat <Darren.Moffat at sun.com> wrote:

> Other than /var/tmp my short list for being separate ZFS datasets are:
>
> /var/crash - because it can be big and we might want quotas.
> /var/core [ which we don't yet have by default but I'm considering
> submitting an ARC case for this. ] - as above.
> /var/tm Similar to the /var/log rationale.

How does this[1] play with live upgrade or a like technology?
Presumably a boot environment is created with "zfs snapshot -r". There
is very significant value in having the notion of "boot environment
data" versus app or user data on a server. By having this distinction,
it greatly takes away the significance of the question that I keep
getting from Sun folks: "how long are you OK between lucreate and
luactivate?"

One word of caution with "/var/core" is that it makes per-process core
dumps for a process with a cwd of /var impossible, assuming the
per-process core pattern is still "core". My approach to this is to
have a global core dump pattern of /var/cores/core-%... and have core
dump logging enabled. If /var/cores doesn't exist, I get a syslog
message saying that the core dump failed, which is usually all the
information I need (no, the sysadmin didn't kill your process, your
vendor's programmers did). In the relatively rare cases where I need to
capture a core, I can create /var/cores and then rename it once I have
good data.

[1] And the additional proliferation of file systems proposed in this
thread. Many file systems can be good, but too many become a headache.

> I think we have lots of options but it might be nice to come up with a
> short list of special/important directories that we should always
> recommend be separate datasets - let's not hardcode that into the
> installer though (heck we still think /usr/openwin is special !)

Most certainly. While I find separating file systems based upon
software management boundaries[2] useful, others feel there is more
benefit in different strategies[3]. The installer needs reasonable
defaults but should only enforce the creation of the set that is really
the minimum.

[2] /, /usr, /var, and /opt all belong together because everything there
is managed by pkgadd/pkgrm/patchadd/patchrm. Some random app that
installs in /opt/random-app through a custom installer (and as such is
likely administered by non-root, and is portable across boot
environments) gets its own file system. /var/tmp has no software in it
and can be abused to hurt the rest of the system - that's a good
candidate for another FS.

[3] Or are simply afraid to deviate from the advice they received in
1988 to have / and /usr as separate file systems.

> One of the things I'm really interested in seeing is more appropriate
> sharing with Zones because we have more flexibility in the installer as
> it becomes zone aware. What I'd love to see is that we completely
> abandon the package based boundaries for Zones and instead use one based
> only on the actual filesystem namespace and use Zones to get the best
> out of that.

I don't follow this. It seems to me that zones are very much based upon
file system name space and not on package boundaries. For example, a
package that has components in /etc and /usr (for better or for worse)
installs, uninstalls, and propagates into full and sparse zones
properly. I was actually quite impressed that this worked out so well.

What I would find really useful is something that allows me to create a
zone by cloning the global zone's file systems, then customizing as
required.
When I patch the server (global zone, propagate patches to local zones),
the global zone and the non-global zones should still be referencing the
same disk blocks for almost everything in /usr, /lib, etc. (not /etc :) ).
The best hope of this right now is some sort of de-duplication, which
seems not to be high on the list of coming features. This would give the
benefits of sparse zones (more efficient use of memory, etc.) without
the drawback of not being able to even create mount points for other
file systems.

Mike

--
Mike Gerdts
http://mgerdts.blogspot.com/
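For what it's worth, the core-pattern-plus-logging setup described two
paragraphs back maps onto coreadm roughly like this (the exact pattern
tokens are an assumption, since the original elides them):

  # send all global core dumps to a central location and log each attempt
  coreadm -g /var/cores/core.%f.%p -e global -e log

  # with /var/cores absent, the dump itself fails but the syslog entry
  # still records that the process dumped core, as described above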
> With zfs, file systems are in many ways more like directories than what
> we used to call file systems. They draw from pooled storage. They
> have low overhead and are easy to create and destroy. File systems
> are sort of like super-functional directories, with quality-of-service
> control and cloning and snapshots.

When you put it that way, I really look forward to an explorer.exe-style
file browser tree with pools at the top, maroon file systems underneath,
and yellow directories underneath those. I can see a time 5 years down
the road where ZFS file systems are actually called "superfolders"! :)

*mentally right-clicks on /pool/mydocuments and chooses "revert to
yesterday's snapshot"*

This message posted from opensolaris.org
On 4/26/07, Lori Alt <Lori.Alt at sun.com> wrote:

> Peter Tribble wrote:
> > On 4/24/07, Darren J Moffat <Darren.Moffat at sun.com> wrote:
> >> With reference to Lori's blog posting[1] I'd like to throw out a few of
> >> my thoughts on splitting up the namespace.
> >
> > Just a plea with my sysadmin hat on - please don't go overboard
> > and make new filesystems just because we can. Each extra
> > filesystem generates more work for the administrator, if only
> > for the effort to parse df output (which is more than cluttered enough
> > already).
>
> My first reaction to that is: yes, of course, extra file systems are extra
> work. Don't require them, and don't even make them the default unless
> they buy you a lot. But then I thought, no, let's challenge that a bit.
>
> Why do administrators run 'df' commands? It's to find out how much space
> is used or available in a single file system. That made sense when file
> systems each had their own dedicated slice, but now it doesn't make that
> much sense anymore. Unless you've assigned a quota to a zfs file system,
> "space available" is meaningful more at the pool level.

True, but it's actually quite hard to get at the moment. It's easy if
you have a single pool - it doesn't matter which line you look at.
But once you have 2 or more pools (and that's the way it would
work, I expect - a boot pool and 1 or more data pools) there's
an awful lot of output you may have to read. This isn't helped
by zpool and zfs giving different answers, with the one from zfs
being the one I want. The point is that every filesystem adds
additional output the administrator has to mentally filter. (For
one thing, you have to map a directory name to a containing
pool.)

> With zfs, file systems are in many ways more like directories than what
> we used to call file systems. They draw from pooled storage. They
> have low overhead and are easy to create and destroy. File systems
> are sort of like super-functional directories, with quality-of-service
> control and cloning and snapshots. Many of the things that sysadmins
> used to have to do with file systems just aren't necessary or even
> meaningful anymore. And so maybe the additional work of managing
> more file systems is actually a lot smaller than you might initially think.

Oh, I agree. The trouble is that sysadmins still have to work using
their traditional tools, including their brains, which are tooled up
for cases with a much lower filesystem count. What I don't see as
part of this are new tools (or enhancements to existing tools) that
make this easier to handle.

For example, backup tools are currently filesystem based.

Eventually, the tools will catch up. But my experience so far
is that while zfs is fantastic from the point of view of pooling,
once I've got large numbers of filesystems and snapshots
and clones thereof, and the odd zvol, it can be a devil of
a job to work out what's going on.

--
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
On Sat, 2007-04-28 at 17:48 +0100, Peter Tribble wrote:

> On 4/26/07, Lori Alt <Lori.Alt at sun.com> wrote:
> > Peter Tribble wrote:
> <snip>
> > Why do administrators run 'df' commands? It's to find out how much space
> > is used or available in a single file system. That made sense when file
> > systems each had their own dedicated slice, but now it doesn't make that
> > much sense anymore. Unless you've assigned a quota to a zfs file system,
> > "space available" is meaningful more at the pool level.
>
> True, but it's actually quite hard to get at the moment. It's easy if
> you have a single pool - it doesn't matter which line you look at.
> But once you have 2 or more pools (and that's the way it would
> work, I expect - a boot pool and 1 or more data pools) there's
> an awful lot of output you may have to read. This isn't helped
> by zpool and zfs giving different answers, with the one from zfs
> being the one I want. The point is that every filesystem adds
> additional output the administrator has to mentally filter. (For
> one thing, you have to map a directory name to a containing
> pool.)

It's actually quite easy, and easier than the other alternatives (ufs,
veritas, etc):

# zfs list -rH -o name,used,available,refer rootdg

And now it's set up to be parsed by a script (-H) since the output is
tabbed. The -r says to recursively display children of the parent, and
the -o with the specified fields says to display only the fields
specified.

(output from one of my systems)

blast(9):> zfs list -rH -o name,used,available,refer rootdg
rootdg                   4.39G   44.1G   32K
rootdg/nvx_wos_62        4.38G   44.1G   503M
rootdg/nvx_wos_62/opt    793M    44.1G   793M
rootdg/nvx_wos_62/usr    3.01G   44.1G   3.01G
rootdg/nvx_wos_62/var    113M    44.1G   113M
rootdg/swapvol           16K     44.1G   16K

Even though the mount point is set up as a legacy mount point, I know
where each of them is mounted due to the vol name.

And yes, this system has more than one pool:

blast(10):> zpool list
NAME     SIZE    USED    AVAIL   CAP   HEALTH   ALTROOT
lpool    17.8G   11.4G   6.32G   64%   ONLINE   -
rootdg   49.2G   4.39G   44.9G   8%    ONLINE   -

> > With zfs, file systems are in many ways more like directories than what
> > we used to call file systems. They draw from pooled storage. They
> > have low overhead and are easy to create and destroy. File systems
> > are sort of like super-functional directories, with quality-of-service
> > control and cloning and snapshots. Many of the things that sysadmins
> > used to have to do with file systems just aren't necessary or even
> > meaningful anymore. And so maybe the additional work of managing
> > more file systems is actually a lot smaller than you might initially think.
>
> Oh, I agree. The trouble is that sysadmins still have to work using
> their traditional tools, including their brains, which are tooled up
> for cases with a much lower filesystem count. What I don't see as
> part of this are new tools (or enhancements to existing tools) that
> make this easier to handle.

Not sure I agree with this. Many times you end up dealing with multiple
vxvols and file systems. Anything over 12 filesystems and you're in
overload (at least for me ;) and I used my monitoring and scripting
tools to filter that for me.

Many of the systems I admin'd were set up quite differently based on
use, functionality, and disk size.
Most of my tools were set up to take most of these into consideration,
and the fact that we ran almost every flavor of UNIX possible, using the
features of each OS as appropriate.

Most of the tools will still work with zfs (if using df, etc), but it
actually makes things easier once you have a monitoring issue - running
out of space, for example.

Most tools have high and low water marks, so when a file system gets too
full you get a warning. ZFS makes this much easier to admin, as you can
see which file system is being the hog and go directly to that file
system and hunt, instead of first finding the file system - hence the
debate about the all-in-one / slice versus breaking up into the major OS
file systems.

The benefit of an all-in-one / is that you didn't have to guess at how
much space you needed for each slice, so you could upgrade or add
optional software without needing to grow/shrink the OS. The drawback:
if you filled up the file system, you had to hunt for where it was
filling up - /dev, /usr, /var/tmp, /var, / ???

The benefit of multiple slices was that one fs didn't affect the others
if you filled it up, and you could find which was the problem fs very
easily; but if you estimated incorrectly, you had wasted disk space in
one slice and not enough in another.

ZFS gives you the benefit of both all-in-one and partitioned, as it
draws from a single pool of storage but also allows you to find which fs
is being the problem and lock it down with quotas and reservations.

> For example, backup tools are currently filesystem based.

And this changes the scenario how? I've actually been pondering this
for quite some time now. Why do we back up the root disk? With many of
the tools out now, it makes far more sense to do a flar/incremental
flars of the systems and/or create custom jumpstart profiles to rebuild
the system.

The typical scenario for losing the root file systems (catastrophic) is
to restore the OS, install the backup software on the fresh install,
then restore the OS via the backup software to the mirror disk. Why not
just restore the OS from a base flar and apply the incremental?
Application data is what you really care about, plus any specific config
changes to the OS itself; the rest is a fairly generic OS install w/
patches.

The other scenario is ufsdump/restore. In that case, it doesn't really
change the scenario any, as the scripts iterate across the file systems
you want to dump anyway (at least mine do).

> Eventually, the tools will catch up. But my experience so far
> is that while zfs is fantastic from the point of view of pooling,
> once I've got large numbers of filesystems and snapshots
> and clones thereof, and the odd zvol, it can be a devil of
> a job to work out what's going on.

No more difficult than doing ufs/vxfs snapshots and Quick I/O, etc. The
only thing that really changes is the specific command for each, and if
you're doing that, then you've already got the infrastructure for it set
up.

But that's just my viewpoint...

--
Mike Dotson
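As a sketch of the kind of water-mark check described above (the pool
name and the 90% threshold are examples, and this assumes a zfs get that
supports -p for raw numeric output):

  #!/bin/sh
  # Warn when any dataset that has a quota set is more than 90% full.
  zfs list -rH -o name rpool | while read fs; do
      used=`zfs get -H -p -o value used "$fs"`
      quota=`zfs get -H -p -o value quota "$fs"`
      echo "$fs $used $quota"
  done | awk '$3 > 0 && $2 / $3 > 0.9 {
      printf "WARNING: %s is at %.0f%% of its quota\n", $1, 100 * $2 / $3
  }'

Datasets with no quota show a quota of 0 in parseable output and are
simply skipped by the awk filter.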
On 4/28/07, Mike Dotson <Mike.Dotson at sun.com> wrote:

> And this changes the scenario how? I've actually been pondering this
> for quite some time now. Why do we back up the root disk? With many of
> the tools out now, it makes far more sense to do a flar/incremental
> flars of the systems and/or create custom jumpstart profiles to rebuild
> the system.

I would love to see flash archive content that is the result of "zfs
send". Incrementals are easy to do so long as you keep the initial
(pristine) snapshot around that matches up exactly with the flar that
was initially applied.

Mike

--
Mike Gerdts
http://mgerdts.blogspot.com/
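A sketch of the incremental scheme described above, for a single dataset
(the dataset, snapshot, and file names are invented for illustration):

  # pristine snapshot taken when the flash archive is first applied
  zfs snapshot rpool/ROOT/sol@pristine

  # later: new snapshot, then an incremental stream against the pristine one
  zfs snapshot rpool/ROOT/sol@backup1
  zfs send -i @pristine rpool/ROOT/sol@backup1 > /backup/sol.backup1.zfs

  # replay the increment onto a system rebuilt from the same base flar
  zfs receive -F rpool/ROOT/sol < /backup/sol.backup1.zfs

The -i stream only applies if the receiving side still has the matching
@pristine snapshot, which is exactly the constraint noted above.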
Mike Dotson wrote:

> On Sat, 2007-04-28 at 17:48 +0100, Peter Tribble wrote:
>
>> On 4/26/07, Lori Alt <Lori.Alt at sun.com> wrote:
>>> Peter Tribble wrote:
>> <snip>
>>> Why do administrators run 'df' commands? It's to find out how much space
>>> is used or available in a single file system. That made sense when file
>>> systems each had their own dedicated slice, but now it doesn't make that
>>> much sense anymore. Unless you've assigned a quota to a zfs file system,
>>> "space available" is meaningful more at the pool level.
>>
>> True, but it's actually quite hard to get at the moment. It's easy if
>> you have a single pool - it doesn't matter which line you look at.
>> But once you have 2 or more pools (and that's the way it would
>> work, I expect - a boot pool and 1 or more data pools) there's
>> an awful lot of output you may have to read. This isn't helped
>> by zpool and zfs giving different answers, with the one from zfs
>> being the one I want. The point is that every filesystem adds
>> additional output the administrator has to mentally filter. (For
>> one thing, you have to map a directory name to a containing
>> pool.)
>
> It's actually quite easy, and easier than the other alternatives (ufs,
> veritas, etc):
>
> # zfs list -rH -o name,used,available,refer rootdg
>
> And now it's set up to be parsed by a script (-H) since the output is
> tabbed. The -r says to recursively display children of the parent, and
> the -o with the specified fields says to display only the fields
> specified.
>
> (output from one of my systems)
>
> blast(9):> zfs list -rH -o name,used,available,refer rootdg
> rootdg                   4.39G   44.1G   32K
> rootdg/nvx_wos_62        4.38G   44.1G   503M
> rootdg/nvx_wos_62/opt    793M    44.1G   793M
> rootdg/nvx_wos_62/usr    3.01G   44.1G   3.01G
> rootdg/nvx_wos_62/var    113M    44.1G   113M
> rootdg/swapvol           16K     44.1G   16K
>
> Even though the mount point is set up as a legacy mount point, I know
> where each of them is mounted due to the vol name.
>
> And yes, this system has more than one pool:
>
> blast(10):> zpool list
> NAME     SIZE    USED    AVAIL   CAP   HEALTH   ALTROOT
> lpool    17.8G   11.4G   6.32G   64%   ONLINE   -
> rootdg   49.2G   4.39G   44.9G   8%    ONLINE   -
>
>>> With zfs, file systems are in many ways more like directories than what
>>> we used to call file systems. They draw from pooled storage. They
>>> have low overhead and are easy to create and destroy. File systems
>>> are sort of like super-functional directories, with quality-of-service
>>> control and cloning and snapshots. Many of the things that sysadmins
>>> used to have to do with file systems just aren't necessary or even
>>> meaningful anymore. And so maybe the additional work of managing
>>> more file systems is actually a lot smaller than you might initially think.
>>
>> Oh, I agree. The trouble is that sysadmins still have to work using
>> their traditional tools, including their brains, which are tooled up
>> for cases with a much lower filesystem count. What I don't see as
>> part of this are new tools (or enhancements to existing tools) that
>> make this easier to handle.
>
> Not sure I agree with this. Many times you end up dealing with multiple
> vxvols and file systems. Anything over 12 filesystems and you're in
> overload (at least for me ;) and I used my monitoring and scripting
> tools to filter that for me.
>
> Many of the systems I admin'd were set up quite differently based on
> use, functionality, and disk size.
> Most of my tools were set up to take most of these into consideration,
> and the fact that we ran almost every flavor of UNIX possible, using the
> features of each OS as appropriate.
>
> Most of the tools will still work with zfs (if using df, etc), but it
> actually makes things easier once you have a monitoring issue - running
> out of space, for example.
>
> Most tools have high and low water marks, so when a file system gets too
> full you get a warning. ZFS makes this much easier to admin, as you can
> see which file system is being the hog and go directly to that file
> system and hunt, instead of first finding the file system - hence the
> debate about the all-in-one / slice versus breaking up into the major OS
> file systems.
>
> The benefit of an all-in-one / is that you didn't have to guess at how
> much space you needed for each slice, so you could upgrade or add
> optional software without needing to grow/shrink the OS. The drawback:
> if you filled up the file system, you had to hunt for where it was
> filling up - /dev, /usr, /var/tmp, /var, / ???
>
> The benefit of multiple slices was that one fs didn't affect the others
> if you filled it up, and you could find which was the problem fs very
> easily; but if you estimated incorrectly, you had wasted disk space in
> one slice and not enough in another.
>
> ZFS gives you the benefit of both all-in-one and partitioned, as it
> draws from a single pool of storage but also allows you to find which fs
> is being the problem and lock it down with quotas and reservations.
>
>> For example, backup tools are currently filesystem based.
>
> And this changes the scenario how? I've actually been pondering this
> for quite some time now. Why do we back up the root disk? With many of
> the tools out now, it makes far more sense to do a flar/incremental
> flars of the systems and/or create custom jumpstart profiles to rebuild
> the system.

Usually because "that's the way we've always done it" or "our operations
are such that changing is cost prohibitive" or ....