Paul B. Henson
2009-Aug-28 01:20 UTC
[zfs-discuss] live upgrade with lots of zfs filesystems
Well, so I'm getting ready to install the first set of patches on my x4500 since we deployed into production, and have run into an unexpected snag. I already knew that with about 5-6k file systems the reboot cycle was going to be over an hour (not happy about it, but known and planned for). However, I went to create a new boot environment to install the patches into, and so far that's been running for about an hour and a half :(, which was not expected or planned for.

First, the ludefine script spent about 20 minutes iterating through all of my zfs file systems, then something named lupi_bebasic ran for over an hour, then it mounted all of my zfs filesystems under /.alt.tmp.b-nAe.mnt, and now it looks like it is unmounting all of them.

I hadn't noticed this before on my test system (which has only a handful of filesystems), but evidently when I get to the point of using lumount to mount the boot environment for patching, it's going to again mount all of my zfs file systems under the alternative root, and then need to unmount them all again after I'm done patching, which will probably add another hour or two.

I don't think I'm going to make my downtime window :(, and will probably need to reschedule the patching. I never considered I might have to start the patch process six hours before the window.

I poked around a bit, but have not come across any way to exclude zfs filesystems that are not part of the boot OS pool from the copy and mount process. I'm really hoping I'm just being stupid and missing something blindingly obvious.

Given a boot pool named ospool and a data pool named export, is there any way to make live upgrade completely ignore the data pool? There is no need for my 6k user file systems to be mounted in the alternative environment during patching. I only want the file systems in ospool copied, processed, and mounted.

<fingers crossed>

Thanks...

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
Paul

You need to exclude all the file systems that are not the "OS". My S10 virtual machine is not booted, but from memory you can put all the "excluded" file systems in a file and use -f. You used to have to do this if there was a DVD in the drive, otherwise /cdrom got copied to the new boot environment. I know this because I logged an RFE when Live Upgrade first appeared, and it was put into state Deferred, as the workaround is to just exclude it. I think it did get fixed in a later release, however.

trevor
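In practice, the suggestion above amounts to something like the following sketch (the BE name and exclude-file path are illustrative, and as the next reply notes, -f is not supported when the source BE is on a ZFS file system):

    # /var/tmp/lu_exclude -- one path per line; a directory excludes
    # everything beneath it, e.g.:
    #   /export
    #
    # Create the new BE, excluding the listed paths from the copy
    # (applies to UFS-rooted source BEs):
    lucreate -n patch-20090817 -f /var/tmp/lu_exclude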
Paul B. Henson
2009-Aug-28 02:25 UTC
[zfs-discuss] live upgrade with lots of zfs filesystems
On Thu, 27 Aug 2009, Trevor Pretty wrote:

> My S10 Virtual machine is not booted but you can put all the "excluded"
> file systems in a file and use -f from memory.

Unfortunately, I wasn't that stupid. I saw the -f option, but it's not applicable to ZFS root:

     -f exclude_list_file

         Use the contents of exclude_list_file to exclude specific files
         (including directories) from the newly created BE.
         exclude_list_file contains a list of files and directories, one
         per line. If a line item is a file, only that file is excluded;
         if a directory, that directory and all files beneath that
         directory, including subdirectories, are excluded.

         This option is not supported when the source BE is on a ZFS
         file system.

After it finished unmounting everything from the alternative root, it seems to have spawned *another* lupi_bebasic process, which has eaten up 62 minutes of CPU time so far. Evidently it's doing a lot of string comparisons (per truss):

    /1@1:  <- libc:strcmp() = 0
    /1@1:  -> libc:strcmp(0x86fceec, 0xfefa1218)
    /1@1:  <- libc:strcmp() = 0
    /1@1:  -> libc:strcmp(0x86fd534, 0xfefa1218)
    /1@1:  <- libc:strcmp() = 0
    /1@1:  -> libc:strcmp(0x86fdccc, 0xfefa1218)
    /1@1:  <- libc:strcmp() = 0
    /1@1:  -> libc:strcmp(0x86fdcfc, 0xfefa1218)
    /1@1:  <- libc:strcmp() = 0
    /1@1:  -> libc:strcmp(0x86fec84, 0xfefa1218)
    /1@1:  <- libc:strcmp() = 0
    /1@1:  -> libc:strcmp(0x86fecb4, 0xfefa1218)
    /1@1:  <- libc:strcmp() = 0

The first one finished in a bit over an hour; hopefully this one's about done too and there's not any more stuff to do.

Thanks...

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
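A trace like the one above comes from truss's user-level function tracing; attaching to the running process with something like the following should reproduce it (the pgrep pattern is an assumption about the process name as shown in ps):

    # Attach to the running lupi_bebasic process and trace its libc strcmp
    # calls; -t '!all' suppresses system-call output so only the function
    # entry/return lines remain.
    truss -t '!all' -u libc:strcmp -p `pgrep lupi_bebasic`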
Paul B. Henson
2009-Aug-28 05:59 UTC
[zfs-discuss] live upgrade with lots of zfs filesystems
On Thu, 27 Aug 2009, Paul B. Henson wrote:

> However, I went to create a new boot environment to install the patches
> into, and so far that's been running for about an hour and a half :(,
> which was not expected or planned for.
[...]
> I don't think I'm going to make my downtime window :(, and will probably
> need to reschedule the patching. I never considered I might have to start
> the patch process six hours before the window.

Well, so far lucreate took 3.5 hours, lumount took 1.5 hours, applying the patches took all of 10 minutes, luumount took about 20 minutes, and luactivate has been running for about 45 minutes. I'm assuming it will probably take at least the 1.5 hours of the lumount (particularly considering it appears to be running a lumount process under the hood), if not the 3.5 hours of lucreate. Add in the 1-1.5 hours to reboot, and, well, so much for patches this maintenance window.

The lupi_bebasic process seems to be the time killer here. Not sure what it's doing, but it spent 75 minutes running strcmp. Pretty much nothing but strcmp. 75 CPU minutes running strcmp???? I took a look for the source, but I guess that component's not part of opensolaris, or at least I couldn't find it.

Hopefully I can figure out how to make this perform a little more acceptably before our next maintenance window.

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
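Laid out as commands, the patch cycle described above looks roughly like this. The BE name is taken from later in the thread; the mount point and the exact patch-application command are not given, so they are shown as assumptions (patchadd -R against the mounted alternate root is one common approach):

    lucreate -n patch-20090817              # ~3.5 hours with ~6k zfs filesystems
    lumount patch-20090817 /.alt.patch      # ~1.5 hours
    patchadd -R /.alt.patch <patch-id>      # repeated per patch; ~10 minutes total
    luumount patch-20090817                 # ~20 minutes
    luactivate patch-20090817               # 45+ minutes and still running
    init 6                                  # reboot itself expected to take 1-1.5 hours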
Casper.Dik at Sun.COM
2009-Aug-28 08:32 UTC
[zfs-discuss] live upgrade with lots of zfs filesystems
>Well, so far lucreate took 3.5 hours, lumount took 1.5 hours, applying the
>patches took all of 10 minutes, luumount took about 20 minutes, and
>luactivate has been running for about 45 minutes. I'm assuming it will
>probably take at least the 1.5 hours of the lumount (particularly
>considering it appears to be running a lumount process under the hood) if
>not the 3.5 hours of lucreate. Add in the 1-1.5 hours to reboot, and, well,
>so much for patches this maintenance window.
>
>The lupi_bebasic process seems to be the time killer here. Not sure what
>it's doing, but it spent 75 minutes running strcmp. Pretty much nothing but
>strcmp. 75 CPU minutes running strcmp???? I took a look for the source but
>I guess that component's not a part of opensolaris, or at least I couldn't
>find it.
>
>Hopefully I can figure out how to make this perform a little more
>acceptably before our next maintenance window.

Do you have a lot of entries in /etc/mnttab, including nfs filesystems mounted from "server1,server2:/path"?

And you're using lucreate for a ZFS root? It should be "quick"; we are changing a number of things in Solaris 10 update 8 and we hope it will be faster.

Casper
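Both of these questions can be answered with a quick look at the mount table; something along these lines works, since field 3 of /etc/mnttab is the filesystem type:

    # Count mounted zfs filesystems
    awk '$3 == "zfs"' /etc/mnttab | wc -l

    # Look for NFS mounts of the replicated "server1,server2:/path" form
    awk '$3 == "nfs" && $1 ~ /,/' /etc/mnttab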
On Thu, Aug 27, 2009 at 10:59:16PM -0700, Paul B. Henson wrote:

> On Thu, 27 Aug 2009, Paul B. Henson wrote:
>
> > However, I went to create a new boot environment to install the patches
> > into, and so far that's been running for about an hour and a half :(,
> > which was not expected or planned for.
> [...]
> > I don't think I'm going to make my downtime window :(, and will probably
> > need to reschedule the patching. I never considered I might have to start
> > the patch process six hours before the window.
>
> Well, so far lucreate took 3.5 hours, lumount took 1.5 hours, applying the
> patches took all of 10 minutes, luumount took about 20 minutes, and
> luactivate has been running for about 45 minutes. I'm assuming it will

Have a look at http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.10.patch or http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.11.patch ...

So first install the most recent LU patches and then one of the above. Since I'm still on vacation (for ~8 weeks), I haven't checked whether there are new LU patches out there and whether the patches still match (usually they do). If not, adjusting the files manually shouldn't be a problem ;-)

There are also versions for pre snv_b107 and pre 121430-36,121431-37: see http://iws.cs.uni-magdeburg.de/~elkner/

More info: http://iws.cs.uni-magdeburg.de/~elkner/luc/lutrouble.html#luslow

Have fun,
jel.

--
Otto-von-Guericke University       http://www.cs.uni-magdeburg.de/
Department of Computer Science     Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany           Tel: +49 391 67 12768
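A rough sketch of fetching and applying one of these diffs to the Live Upgrade scripts (which live under /usr/lib/lu). The download path, the -p level, and applying from / are assumptions; check the diff headers and the lutrouble page linked above for the actual instructions:

    # Assumption: the diff uses paths relative to / -- verify with --dry-run
    # before modifying anything.
    cd /var/tmp
    /usr/sfw/bin/wget http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.10.patch
    cd /
    gpatch --dry-run -p0 < /var/tmp/lu-5.10.patch   # sanity check only
    gpatch -p0 < /var/tmp/lu-5.10.patch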
Paul B. Henson
2009-Aug-29 04:56 UTC
[zfs-discuss] live upgrade with lots of zfs filesystems
On Fri, 28 Aug 2009 Casper.Dik at Sun.COM wrote:

> >luactivate has been running for about 45 minutes. I'm assuming it will
> >probably take at least the 1.5 hours of the lumount (particularly
> >considering it appears to be running a lumount process under the hood) if
> >not the 3.5 hours of lucreate.

Eeeek, the luactivate command ended up taking about *7 hours* to complete. And I'm not sure it was even successful; output excerpts at the end of this message.

> Do you have a lot of files in /etc/mnttab, including nfs filesystems
> mounted from "server1,server2:/path"?

There's only one nfs filesystem in vfstab, which is always mounted; user home directories are automounted and would be in mnttab if accessed, but during the lu process no users were on the box. On the other hand, there are a *lot* of zfs filesystems in mnttab:

    # grep zfs /etc/mnttab | wc -l
    8145

> And you're using lucreate for a ZFS root? It should be "quick"; we are
> changing a number of things in Solaris 10 update 8 and we hope it will be
> faster.

lucreate on a system with *only* an os root pool is blazing (the magic of clones). The problem occurs when my data pool (with 6k odd filesystems) is also there. The live upgrade process is analyzing all 6k of those filesystems, mounting them all in the alternate root, unmounting them all, and who knows what else. This is totally wasted effort; those filesystems have nothing to do with the OS or patching, and I'm really hoping they can just be completely ignored.

So, after 7 hours, here is the last bit of output from luactivate. Other than taking forever and a day, all of the output up to this point seemed normal. The BE s10u6 is neither the currently active BE nor the one being made active, but these errors have me concerned something _bad_ might happen if I reboot :(. Any thoughts?

    Modifying boot archive service
    Propagating findroot GRUB for menu conversion.
    ERROR: Read-only file system: cannot create mount point </.alt.s10u6/export/group/ceis>
    ERROR: failed to create mount point </.alt.s10u6/export/group/ceis> for file system <export/group/ceis>
    ERROR: unmounting partially mounted boot environment file systems
    ERROR: No such file or directory: error unmounting <ospool/ROOT/s10u6>
    ERROR: umount: warning: ospool/ROOT/s10u6 not in mnttab
    umount: ospool/ROOT/s10u6 no such file or directory
    ERROR: cannot unmount <ospool/ROOT/s10u6>
    ERROR: cannot mount boot environment by name <s10u6>
    ERROR: Failed to mount BE <s10u6>.
    ERROR: Failed to mount BE <s10u6>.
    Cannot propagate file </etc/lu/installgrub.findroot> to BE
    File propagation was incomplete
    ERROR: Failed to propagate installgrub
    ERROR: Could not propagate GRUB that supports the findroot command.
    Activation of boot environment <patch-20090817> successful.

According to lustatus everything is good, but <shiver>... These boxes have only been in full production about a month; it would not be good for them to die during the first scheduled patches.

    # lustatus
    Boot Environment           Is       Active Active    Can    Copy
    Name                       Complete Now    On Reboot Delete Status
    -------------------------- -------- ------ --------- ------ ----------
    s10u6                      yes      no     no        yes    -
    s10u6-20090413             yes      yes    no        no     -
    patch-20090817             yes      no     yes       no     -

Thanks...

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
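For background on "the magic of clones": on a ZFS root, lucreate builds the new BE as a snapshot and clone of the current root dataset rather than copying files, which is why an OS-pool-only system is fast. Conceptually it boils down to something like the following (dataset names follow this thread; lucreate chooses its own snapshot names and property settings):

    # Constant-time snapshot plus clone of the active root dataset --
    # no file-by-file copy is involved.
    zfs snapshot ospool/ROOT/s10u6-20090413@patch-20090817
    zfs clone ospool/ROOT/s10u6-20090413@patch-20090817 ospool/ROOT/patch-20090817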
Paul B. Henson
2009-Aug-29 05:16 UTC
[zfs-discuss] live upgrade with lots of zfs filesystems
On Fri, 28 Aug 2009, Jens Elkner wrote:

> More info:
> http://iws.cs.uni-magdeburg.de/~elkner/luc/lutrouble.html#luslow

******sweet******!!!!!!

This is *exactly* the functionality I was looking for. Thanks much!!!!

Any Sun people have any idea if Sun has any similar functionality planned for live upgrade? Live upgrade without this capability is basically useless on a system with lots of zfs filesystems.

Jens, thanks again, this is perfect.

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768