Hello all,
I have a situation where zpool status shows no known data errors, but all
processes on a specific filesystem are hung. This has happened twice before
since we installed OpenSolaris 2009.06 (snv_111b). For instance, there are two
filesystems in this pool; "zfs get all" on one filesystem returns without
issue, but when run on the other filesystem it hangs. A "df -h" hangs as
well, etc.
This filesystem has several different operations running on it:
1) It receives an incremental snapshot every 30 minutes, continuously.
2) Every night a clone is made from one of the received snapshots, then a
filesystem backup is taken on that clone (the backup is a directory traversal).
Once the backup completes, the clone is destroyed.
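For clarity, the nightly cycle described above looks roughly like the sketch below. The pool name is from this post, but the filesystem, snapshot, and backup-path names are hypothetical, and the commands are only echoed as a dry run rather than executed:

```shell
#!/bin/sh
# Sketch of the nightly backup cycle; all names except the pool are hypothetical.
POOL=zdisk1
FS="$POOL/data"                # hypothetical receiving filesystem
SNAP="$FS@recv-latest"         # hypothetical: most recent received snapshot
CLONE="$POOL/backup-clone"     # hypothetical clone name

# Echo the commands instead of running them (dry run):
echo "zfs clone $SNAP $CLONE"                 # clone the received snapshot
echo "tar cf /backup/nightly.tar /$CLONE"     # directory-traversal backup of the clone
echo "zfs destroy $CLONE"                     # destroy the clone once the backup completes
```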
We tried to upgrade to the latest build but ran into the current
"checksum" issue in build snv_122, so we rolled back.
# uname -a
SunOS lahar2 5.11 snv_111b i86pc i386 i86pc
# zpool status zdisk1
  pool: zdisk1
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        zdisk1      ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0
        spares
          c7t6d0    AVAIL

errors: No known data errors
The filesystem is currently in this "hung" state; are there
any commands I can run to help debug the issue?
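For reference, a few standard (Open)Solaris commands can show where the hung processes and kernel threads are stuck. They are echoed here as a dry run, since the interesting output only exists on the affected host; the PID is hypothetical and should be replaced with that of a hung process such as the stuck df:

```shell
#!/bin/sh
# Commands commonly used to debug a hung filesystem on OpenSolaris.
# Echoed as a dry run; 1234 is a hypothetical PID of a hung process.
PID=1234

echo "pstack $PID"                        # user-level stack of the hung process
echo "truss -p $PID"                      # which syscall the process is blocked in
echo 'echo "::threadlist -v" | mdb -k'    # kernel thread stacks, to see where I/O is stuck
echo "fsstat zfs 1"                       # per-second ZFS activity, to check if I/O is flowing
```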
TIA
--
This message posted from opensolaris.org
Marc Emmerson
2009-Sep-14 14:59 UTC
[zfs-discuss] zpool status OK but zfs filesystem seems hung
Hi there, I wonder if your issue is related to mine; see the thread here:
http://opensolaris.org/jive/thread.jspa?threadID=112777&tstart=0
It only manifested after I upgraded to snv_121, although booting back to
snv_118 did not fix it.
Thanks for the reply, but this seems to be a bit different. A couple of things
I failed to mention:
1) This is a secondary pool, not the root pool.
2) The snapshots are trimmed to keep only 80 or so.
The system boots and runs fine. It's just an issue for this secondary pool and
filesystem. It seems to be directly related to I/O-intensive operations, as the
(full) backup seems to trigger it; I have never seen it happen with incremental
backups... Thanks.