thr3ads.net - zfs discuss - [zfs-discuss] ZFS Problems under vmware [May 2008]

If this information is useful, please help other people find it:
Share via:

Paul B. Henson

2008-May-12 23:20 UTC

[zfs-discuss] ZFS Problems under vmware

I have a test bed S10U5 system running under vmware ESX that has a weird
problem.

I have a single virtual disk, with some slices allocated as UFS filesystem
for the operating system, and s7 as a ZFS pool.

Whenever I reboot, the pool fails to open:

May  8 17:32:30 niblet fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-CS,
TYPE: Fault, VER: 1, SEVERITY: Major
May  8 17:32:30 niblet EVENT-TIME: Thu May  8 17:32:30 PDT 2008
May  8 17:32:30 niblet PLATFORM: VMware Virtual Platform, CSN: VMware-50 35 75
0b a3 b3 e5 d4-38 3f 00 7a 10 c0 e2 d7, HOSTNAME: niblet
May  8 17:32:30 niblet SOURCE: zfs-diagnosis, REV: 1.0
May  8 17:32:30 niblet EVENT-ID: f163d843-694d-4659-81e8-aa15bb72e2e0
May  8 17:32:30 niblet DESC: A ZFS pool failed to open.  Refer to
http://sun.com/msg/ZFS-8000-CS for more information.
May  8 17:32:30 niblet AUTO-RESPONSE: No automated response will occur.
May  8 17:32:30 niblet IMPACT: The pool data is unavailable
May  8 17:32:30 niblet REC-ACTION: Run ''zpool status -x'' and
either attach the missing device or
May  8 17:32:30 niblet      restore from backup.


According to ''zpool status'', the device could not be opened:

root at niblet ~ # zpool status
  pool: ospool
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using ''zpool
online''.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        ospool      UNAVAIL      0     0     0  insufficient replicas
          c1t0d0s7  UNAVAIL      0     0     0  cannot open


However, according to format, the device is perfectly accessible, and
format even indicates that slice 7 is an active pool:

root at niblet ~ # format
Searching for disks...done

AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <DEFAULT cyl 4092 alt 2 hd 128 sec 32>
          /pci at 0,0/pci1000,30 at 10/sd at 0,0
Specify disk (enter its number): 0
selecting c1t0d0
[disk formatted]
Warning: Current Disk has mounted partitions.
/dev/dsk/c1t0d0s0 is currently mounted on /. Please see umount(1M).
/dev/dsk/c1t0d0s1 is currently used by swap. Please see swap(1M).
/dev/dsk/c1t0d0s3 is currently mounted on /usr. Please see umount(1M).
/dev/dsk/c1t0d0s4 is currently mounted on /var. Please see umount(1M).
/dev/dsk/c1t0d0s5 is currently mounted on /opt. Please see umount(1M).
/dev/dsk/c1t0d0s6 is currently mounted on /home. Please see umount(1M).
/dev/dsk/c1t0d0s7 is part of active ZFS pool ospool. Please see zpool(1M).


Trying to import it does not find it:

root at niblet ~ # zpool import
no pools available to import


Exporting it works fine:

root at niblet ~ # zpool export ospool


But then the import indicates that the pool may still be in use:

root at niblet ~ # zpool import ospool
cannot import ''ospool'': pool may be in use from other system


Adding the -f flag imports successfully:

root at niblet ~ # zpool import -f ospool

root at niblet ~ # zpool status
  pool: ospool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        ospool      ONLINE       0     0     0
          c1t0d0s7  ONLINE       0     0     0

errors: No known data errors


And then everything works perfectly fine, until I reboot again, at which
point the cycle repeats.

I have a similar test bed running on actual x4100 hardware that doesn''t
exhibit this problem.

Any idea what''s going on here?


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768

Gabriele Bulfon

2008-May-28 15:32 UTC

head link

[zfs-discuss] ZFS Problems under vmware

Hello, I''m having the same exact situation on one VM, and not on
another VM on the same infrastructure.
The only difference is that on the failing VM I initially created the pool with
a name and then changed the mountpoint to another name.
Did you found a solution to the issue?
Should I consider to get back to UFS on this infrastructure?
Thanx a lot
Gabriele.
 
 
This message posted from opensolaris.org

Anthony Worrall

2008-Jun-16 15:00 UTC

head link

[zfs-discuss] ZFS Problems under vmware

I am seeing the same problem using a seperate virtual disk for the pool.
This is happening with Solaris 10 U3, U4 and U5


SCSI reservations is know to be an issue with clustered solaris
http://blogs.sun.com/SC/entry/clustering_solaris_guests_that_run

I wonder if this is the same problem. Maybe we have to use Raw Device Mapping
(RDM) to get zfs to work under vmware.

Anthony Worrall
 
 
This message posted from opensolaris.org

Anthony Worrall

2008-Jun-16 17:08 UTC

head link

[zfs-discuss] ZFS Problems under vmware

Added an vdev using rdm and that seems to be stable over reboots

however the pools based on a virtual disk now also seems to be stable after
doing an export and import -f
 
 
This message posted from opensolaris.org

Dave Bechtel

2008-Jun-16 17:14 UTC

head link

[zfs-discuss] ZFS Problems under vmware

Hi - I''m interested in your solution as my current ZFS/vmware
experiment is stalled.
I have a 6-disk SCSI rack ( 6 @ 9GB/ea ) attached as Raw disks to the VM
(Workstation 6), and have been getting ZFS pool corruption on reboot.  Vmware is
allowing the Solaris guest to write a disklabel that is (1) cylinder over the
physical # on the disk.

Could you please post more detailed info? TIA
 
 
This message posted from opensolaris.org

Anthony Worrall

2008-Jun-17 09:08 UTC

head link

[zfs-discuss] ZFS Problems under vmware

Raw Device Mapping is a feature of ESX 2.5 and above which allows a guest OS to
have access to a LUN on fibre or ISCSI SAN.

See http://www.vmware.com/pdf/esx25_rawdevicemapping.pdf for more details.

You may be able to do something similar with the raw disks under workstation
see http://www.vmware.com/support/reference/linux/osonpartition_linux.html


Since I added the RDM to one of my guest OSes all of them them have started
working using virtual disks after running
<code>
#zpool export tank
#zpool import -f tank
</code>

Maybe adding the RDM changed some behavoiur of ESX or mabe I just got lucky
 
 
This message posted from opensolaris.org

zfs discuss - May 2008 - ZFS Problems under vmware

[zfs-discuss] ZFS Problems under vmware

[zfs-discuss] ZFS Problems under vmware

[zfs-discuss] ZFS Problems under vmware

[zfs-discuss] ZFS Problems under vmware

[zfs-discuss] ZFS Problems under vmware

[zfs-discuss] ZFS Problems under vmware