I've seen threads like this around this ZFS forum, so forgive me if I'm covering old ground.

I currently have a ZFS configuration where I have individual drives presented to my OpenSolaris machine and I'm using ZFS to do a RAIDZ-1 on the drives. I have several filesystems and volumes on this storage pool. When I do a "zfs destroy" on a volume (and possibly a filesystem, though I haven't tried that yet), I run into two issues.

The first is that the destroy command takes several hours to complete - for example, destroying a 10 GB volume on Friday took 5 hours. The second is that, while this command is running, all I/O on the storage pool appears to be halted, or at least paused. There are a few symptoms of this. First, NFS clients accessing volumes on this server just hang and do not respond to commands - some clients hang indefinitely while others time out and mark the volume as stale. iSCSI clients most often time out and disconnect from the iSCSI volume, which is bad for my clients that are booting over those iSCSI volumes.

I'm using the latest OpenSolaris dev build (132) and I have my storage pools and volumes upgraded to the latest available versions. I am using deduplication on my ZFS volumes, set at the highest volume level, so I'm not sure if this has an impact. Can anyone provide any hints as to whether this is a bug or expected behavior, what's causing it, and how I can solve or work around it?

Thanks,
Nick

--
This message posted from opensolaris.org
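For context, the configuration described above boils down to commands along these lines; the pool name, device names, and volume name are illustrative, not the actual ones:

  # build a raidz1 pool from individual drives and enable dedup at the
  # top level so it is inherited by everything beneath it
  zpool create tank raidz1 c7t0d0 c7t1d0 c7t2d0 c7t3d0
  zfs set dedup=on tank

  # create a 10 GB zvol (e.g. for iSCSI) ... and later destroy it
  zfs create -V 10g tank/vol01
  zfs destroy tank/vol01      # this is the step that takes hours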
On Mon, 15 Feb 2010, Nick wrote:

> I'm using the latest OpenSolaris dev build (132) and I have my
> storage pools and volumes upgraded to the latest available versions.
> I am using deduplication on my ZFS volumes, set at the highest
> volume level, so I'm not sure if this has an impact. Can anyone
> provide any hints as to whether this is a bug or expected behavior,
> what's causing it, and how I can solve or work around it?

There is no doubt that it is both a bug and expected behavior and is related to deduplication being enabled. Others here have reported similar problems.

The problem seems to be due to insufficient caching in the zfs ARC due to not enough RAM or L2ARC not being installed. Some people achieved rapid success after installing a SSD as a L2ARC device. Other people have reported success after moving their pool to a system with a lot more RAM installed. Others have relied on patience. A few have given up and considered their pool totally lost.

Bob

--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
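If anyone wants to try the L2ARC route Bob mentions, it is a one-line change on a live pool; a minimal sketch, assuming the SSD shows up as c8t0d0 (the device name is only an example):

  # add the SSD to the existing pool as an L2ARC cache device
  zpool add tank cache c8t0d0

  # watch ARC/L2ARC usage afterwards to see whether the dedup table is
  # actually staying cached
  kstat -m zfs -n arcstats | grep size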
> There is no doubt that it is both a bug and expected behavior and is
> related to deduplication being enabled.

Is it expected because it's a bug, or is it a bug that is not going to be fixed and so I should expect it? Is there a bug/defect I can keep an eye on in one of the OpenSolaris bug/defect interfaces that will help me figure out what's going on with it and when a solution is expected?

> Others here have reported similar problems. The problem seems to be
> due to insufficient caching in the zfs ARC due to not enough RAM or
> L2ARC not being installed. Some people achieved rapid success after
> installing a SSD as a L2ARC device. Other people have reported
> success after moving their pool to a system with a lot more RAM
> installed. Others have relied on patience. A few have given up and
> considered their pool totally lost.

I have 8 GB of RAM on my system, which I consider to be a fairly decent amount for a storage server - maybe I'm naive about that, though. 8 GB should leave a fair amount of RAM available for caching, so I would think a 10 GB volume could be worked through fairly quickly. Also, there isn't much besides ZFS and COMSTAR running on this box, so there isn't really anything else using the RAM. I've already considered buying some SSDs for the L2ARC, so maybe I'll pursue that path.

I was certainly patient with it - I didn't reboot the box because I could see slow progress on the destroy. However, the other guys in my group, whose systems hang off of this ZFS storage and who had to wait 5 hours for it to respond to their requests, were not quite so understanding or patient.

This is a pretty big roadblock, IMHO, to this being a workable storage solution. I certainly understand that I'm running the dev releases, so it is under development and I should expect bugs - this one just seems pretty significant, as though I would need to schedule maintenance windows just to do volume management.

Thanks!
-Nick

--
This message posted from opensolaris.org
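One way to sanity-check how much of that 8 GB is actually reaching the ARC, as opposed to being used elsewhere, is with the standard kernel debugger dcmds (run as root; nothing here is pool-specific):

  # overall kernel memory breakdown, including the "ZFS File Data" bucket
  echo ::memstat | mdb -k

  # current and target ARC sizes
  echo ::arc | mdb -k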
On 02/15/10 10:26, Nick wrote:

>> There is no doubt that it is both a bug and expected behavior and is
>> related to deduplication being enabled.
>
> Is it expected because it's a bug, or is it a bug that is not going to
> be fixed and so I should expect it? Is there a bug/defect I can keep an
> eye on in one of the OpenSolaris bug/defect interfaces that will help
> me figure out what's going on with it and when a solution is expected?

See:

6922161 zio_ddt_free is single threaded with performance impact
6924824 destroying a dedup-enabled dataset bricks system

Both issues stem from the fact that free operations used to be in-memory only but with dedup enabled can result in synchronous I/O to disks in syncing context.

- Eric

--
Eric Schrock, Fishworks            http://blogs.sun.com/eschrock
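Given that explanation, the practical question is whether the dedup table (DDT) fits in RAM. A rough way to check, assuming a pool named "tank" (and the often-quoted ~320 bytes per in-core DDT entry is only an approximation):

  # print dedup table statistics, including the total number of entries
  zdb -DD tank

  # ballpark RAM needed to keep the whole DDT cached:
  #   total entries x ~320 bytes per in-core entry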
Thanks!

--
This message posted from opensolaris.org
One other question - I'm seeing the same sort of behavior when I try to do something like "zfs set sharenfs=off storage/fs" - is there a reason that turning off NFS sharing should halt I/O?

--
This message posted from opensolaris.org