I want to express my thanks. My gratitude. I am not easily impressed
by technology anymore, but ZFS impressed me this morning.
Sometime late last night a primary server of mine had a critical
fault. One of the PCI cards in a V480 was the cause, and for whatever
reason it destroyed the DC-DC power converters that powered the
primary internal disks. It also dropped the whole machine and 12
zones.
I feared the worst and made the call for service at about midnight
last night. A Sun service tech said he could be there in two hours
or so, but he asked me to check this and check that. The people at
the datacenter were happy to tell me there was a wrench light on,
but other than that they knew nothing.
This machine, like all critical systems I have, uses mirrored disks
in zpools with multiple fibre links to the arrays. I dreaded what
would happen when we tried to boot this box after all the dust was
blown out and the hardware swapped.
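For the curious, the layout is nothing exotic: mirrored pairs split
across two controllers plus a hot spare. As a rough sketch only,
using the device names from the status output further down and not
the exact command from when the pool was built, it amounts to:
# zpool create fibre0 \
    mirror c2t16d0 c5t0d0 \
    mirror c5t1d0 c2t17d0 \
    mirror c5t2d0 c2t18d0 \
    mirror c2t20d0 c5t4d0 \
    mirror c2t21d0 c5t6d0 \
    spare c2t22d0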
Early this morning ... I watched the detailed diags run and finally
saw a nice clean ok prompt.
<*>
Hardware Power On
@(#)OBP 4.22.34 2007/07/23 13:01 Sun Fire 4XX
System is initializing with diag-switch? overrides.
Online: CPU0 CPU1 CPU2 CPU3*
Validating JTAG integrity...Done
.
.
.
CPU0: System POST Completed
Pass/Fail Status = 0000.0000.0000.0000
ESB Overall Status = ffff.ffff.ffff.ffff
<*>
POST Reset
.
.
.
{3} ok show-post-results
System POST Results
Component: Results
CPU/Memory: Passed
IO-Bridge8: Passed
IO-Bridge9: Passed
GPTwo Slots: Passed
Onboard FCAL: Passed
Onboard Net1: Passed
Onboard Net0: Passed
Onboard IDE: Passed
PCI Slots: Passed
BBC0: Passed
RIO: Passed
USB: Passed
RSC: Passed
POST Message: POST PASS
{3} ok boot -s
Eventually I saw my login prompt. There were no warnings about data
corruption. No data loss. No noise at all, in fact. :-O
# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
fibre0                  680G    654G   25.8G    96%  ONLINE     -
z0                     40.2G    103K   40.2G     0%  ONLINE     -
# zpool status fibre0
  pool: fibre0
 state: ONLINE
status: The pool is formatted using an older on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        fibre0       ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c2t16d0  ONLINE       0     0     0
            c5t0d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c5t1d0   ONLINE       0     0     0
            c2t17d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c5t2d0   ONLINE       0     0     0
            c2t18d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c2t20d0  ONLINE       0     0     0
            c5t4d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c2t21d0  ONLINE       0     0     0
            c5t6d0   ONLINE       0     0     0
        spares
          c2t22d0    AVAIL

errors: No known data errors
#
Not one error. No message about resilver this or inode that.
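The status also says "scrub: none requested". Just to be paranoid
the next step is a scrub so ZFS walks every block and verifies the
checksums, and some day I should take the hint about the older
on-disk format. Nothing fancy, just the standard commands:
# zpool scrub fibre0
# zpool status -v fibre0
# zpool upgrade fibre0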
Everything booted flawlessly and I was able to see all my zones:
# bin/lz
-----------------------------------------------------------------------
 NAME      ID  STATUS      PATH           HOSTNAME    BRAND      IP
-----------------------------------------------------------------------
 z_001     4   running     /zone/z_001    pluto       solaris8   excl
 z_002     -   installed   /zone/z_002    ldap01      native     shared
 z_003     -   installed   /zone/z_003    openfor     solaris9   shared
 z_004     6   running     /zone/z_004    gaspra      native     shared
 z_005     5   running     /zone/z_005    ibisprd     native     shared
 z_006     7   running     /zone/z_006    io          native     shared
 z_007     1   running     /zone/z_007    nis         native     shared
 z_008     3   running     /zone/z_008    callistoz   native     shared
 z_009     2   running     /zone/z_009    loginz      native     shared
 z_010     -   installed   /zone/z_010    venus       solaris8   shared
 z_011     -   installed   /zone/z_011    adbs        solaris9   shared
 z_012     -   installed   /zone/z_012    auroraux    native     shared
 z_013     8   running     /zone/z_013    osiris      native     excl
 z_014     -   installed   /zone/z_014    jira        native     shared
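For the record, bin/lz is just a little local wrapper script; the
plain Solaris way to get more or less the same listing is:
# zoneadm list -cv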
People love to complain. I see it all the time.
I downloaded this OS for free and I run it in production.
I have support and I am fine with paying for support contracts.
But someone somewhere needs to buy the ZFS guys some keg(s) of
whatever beer they want. Or maybe new Porsche Cayman S toys.
That would be gratitude as something more than just words.
Thank you.
--
Dennis Clarke
PS: the one funny thing is that I had to get a few things swapped
out, and I guess that reset the system clock. It now reports:
# uptime
8:19pm up 3483 day(s), 19:07, 1 user, load average: 0.24, 0.21, 0.18
I don't think that is accurate :-)