thr3ads.net - zfs discuss - [zfs-discuss] Re: system unresponsive after issuing a zpool attach [Aug 2006]

Jeff Bonwick

2006-Aug-17 06:38 UTC

[zfs-discuss] Re: system unresponsive after issuing a zpool attach

> And it started replacement/resilvering... after a few minutes the system became unavailable. Reboot only gives me a few minutes, then resilvering makes the system
> unresponsive.
> Is there any workaround or patch for this problem?

Argh, sorry -- the problem is that we don't do aggressive enough
scrub/resilver throttling. The effect is most pronounced on 32-bit
or low-memory systems. We're working on it.

One thing you might try is reducing txg_time to 1 second (the default
is 5 seconds) by saying this: "echo txg_time/W1 | mdb -kw".

Let me describe what's happening, and why this may help.

When we kick off a scrub (same code path as resilver, so I'll use
the term generically), we traverse the entire block tree looking
for blocks that need scrubbing. The tree traversal itself is
single-threaded, but the work it generates is not -- each time
we find a block that needs scrubbing, we schedule an async I/O
to do it. As you've discovered, we can generate work faster than
the I/O subsystem can process it. To avoid overloading the disks,
we throttle I/O downstream, but we don't (yet) have an upstream
throttle. If we discover blocks really fast, we can end up
scheduling lots of I/O -- and sitting on lots of memory -- before
the downstream throttle kicks in.
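
To make the missing upstream throttle concrete, here is a rough back-of-the-envelope sketch in Python. It is not ZFS code: the function names, the rates, and the assumption that every queued I/O holds one 128 KiB buffer are all hypothetical, chosen only to show how the backlog grows when blocks are discovered faster than the disks complete them.

```python
# Toy model only: nothing here is taken from the ZFS source.
# Assumption: the traversal discovers blocks at a steady rate and the
# disks complete scrub I/Os at a lower steady rate.

def backlog_blocks(seconds, discover_rate, service_rate):
    """Scrub I/Os scheduled but not yet completed after `seconds`."""
    return max(0, (discover_rate - service_rate) * seconds)


def backlog_bytes(seconds, discover_rate, service_rate, block_size=128 * 1024):
    """Memory held by the backlog, assuming one buffer per queued I/O."""
    return backlog_blocks(seconds, discover_rate, service_rate) * block_size


if __name__ == "__main__":
    # Hypothetical rates: 20000 blocks/s discovered, 5000 blocks/s completed.
    for seconds in (1, 5):
        blocks = backlog_blocks(seconds, 20000, 5000)
        mib = backlog_bytes(seconds, 20000, 5000) // (1024 * 1024)
        print(f"{seconds}s: {blocks} queued I/Os, about {mib} MiB")
```

With nothing holding the traversal back, the backlog in this model grows linearly for as long as discovery outpaces the disks, which is why a low-memory machine would feel it first.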

The reason this relates to txg_time is that every time we sync a
transaction group, we suspend the scrub thread and wait for all
pending scrub I/Os to complete. This ensures that we won't
asynchronously scrub a block that was freed and reallocated
in a future txg; when coupled with the COW nature of ZFS,
this allows us to run scrubs entirely independent of all
filesystem-level structure (e.g. directories) and locking rules.
This little trick makes the scrubbing algorithms *much* simpler.

The key point is that each spa_sync() throttles the scrub to zero.
By lowering txg_time from 5 to 1, you're cutting down the maximum
number of pending scrub I/Os by roughly 5x. The unresponsiveness
you're seeing is a threshold effect; I'm hoping that by running
spa_sync() more often, we can get you below that threshold.
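
The "roughly 5x" figure can be checked with a small simulation. This is a simplified sketch, not the real scheduler: `peak_pending`, the per-tick rates, and the instantaneous drain at each sync are assumptions made for illustration. The only behavior taken from the description above is that every sync brings the number of pending scrub I/Os back to zero.

```python
# Toy simulation: pending scrub I/Os accumulate between syncs and are
# drained to zero at every sync. All rates are hypothetical.

def peak_pending(txg_time_s, discover_per_tick, service_per_tick,
                 ticks_per_s=10, total_s=60):
    """Return the largest number of pending scrub I/Os seen during the run."""
    pending = 0
    peak = 0
    ticks_per_txg = txg_time_s * ticks_per_s
    for tick in range(1, total_s * ticks_per_s + 1):
        pending += discover_per_tick                # traversal schedules new I/Os
        pending -= min(pending, service_per_tick)   # disks complete some of them
        peak = max(peak, pending)
        if tick % ticks_per_txg == 0:
            # Sync point: the scrub is suspended until all pending I/Os
            # finish. Modeled here as an instantaneous drain.
            pending = 0
    return peak


if __name__ == "__main__":
    slow = peak_pending(5, 2000, 500)   # default txg_time of 5 seconds
    fast = peak_pending(1, 2000, 500)   # txg_time reduced to 1 second
    print(slow, fast, slow / fast)
```

In this model the peak scales linearly with the sync interval, so going from 5 seconds to 1 second cuts it by a factor of 5. A real system would only approximate this, since the drain itself takes time and the discovery rate is not constant.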

Please let me know if this works for you.

Jeff
