Hi list,

My ZFS write performance is poor and I need your help.

I created a zpool with 2 raidz1 vdevs. When the space was about to run out,
I added another 2 raidz1 vdevs to extend the zpool. After some days the
zpool was almost full, so I removed some old data. But now, as shown below,
the first 2 raidz1 vdevs are about 78% used and the last 2 raidz1 vdevs are
about 93% used.

I have this line in /etc/system:

set zfs:metaslab_df_free_pct=4

so the performance degradation should only happen when vdev usage goes
above 90%.

All my files are small, about 150KB each.

Now the questions are:
1. Should I balance the data between the vdevs by copying the data and then
removing the data that lives on the last 2 vdevs?
2. Is there any method to automatically re-balance the data? Or is there
any better solution to this problem?

root at nas-01:~# zpool iostat -v
                                           capacity     operations    bandwidth
pool                                     used  avail   read  write   read  write
--------------------------------------  -----  -----  -----  -----  -----  -----
datapool                                21.3T  3.93T     26     96  81.4K  2.81M
  raidz1                                4.93T  1.39T      8     28  25.7K   708K
    c3t6002219000854867000003B2490FB009d0    -      -      3     10   216K   119K
    c3t6002219000854867000003B4490FB063d0    -      -      3     10   214K   119K
    c3t60022190008528890000055F4CB79C10d0    -      -      3     10   214K   119K
    c3t6002219000854867000003B8490FB0FFd0    -      -      3     10   215K   119K
    c3t6002219000854867000003BA490FB14Fd0    -      -      3     10   215K   119K
    c3t60022190008528890000041C490FAFA0d0    -      -      3     10   215K   119K
    c3t6002219000854867000003C0490FB27Dd0    -      -      3     10   214K   119K
  raidz1                                4.64T  1.67T      8     32  24.6K   581K
    c3t6002219000854867000003C2490FB2BFd0    -      -      3     10   224K  98.2K
    c3t60022190008528890000041F490FAFD0d0    -      -      3     10   222K  98.2K
    c3t600221900085288900000428490FB0D8d0    -      -      3     10   222K  98.2K
    c3t600221900085288900000422490FB02Cd0    -      -      3     10   223K  98.3K
    c3t600221900085288900000425490FB07Cd0    -      -      3     10   223K  98.3K
    c3t600221900085288900000434490FB24Ed0    -      -      3     10   223K  98.3K
    c3t60022190008528890000043949100968d0    -      -      3     10   224K  98.2K
  raidz1                                5.88T   447G      5     17  16.0K  67.7K
    c3t60022190008528890000056B4CB79D66d0    -      -      3     12   215K  12.2K
    c3t6002219000854867000004B94CB79F91d0    -      -      3     12   216K  12.2K
    c3t6002219000854867000004BB4CB79FE1d0    -      -      3     12   214K  12.2K
    c3t6002219000854867000004BD4CB7A035d0    -      -      3     12   215K  12.2K
    c3t6002219000854867000004BF4CB7A0ABd0    -      -      3     12   216K  12.2K
    c3t60022190008528890000055C4CB79BB8d0    -      -      3     12   214K  12.2K
    c3t6002219000854867000004C14CB7A0FDd0    -      -      3     12   215K  12.2K
  raidz1                                5.88T   441G      4      1  14.9K  12.4K
    c3t60022190008528890000042B490FB124d0    -      -      1      1   131K  2.33K
    c3t6002219000854867000004C54CB7A199d0    -      -      1      1   132K  2.33K
    c3t6002219000854867000004C74CB7A1D5d0    -      -      1      1   130K  2.33K
    c3t6002219000852889000005594CB79B64d0    -      -      1      1   133K  2.33K
    c3t6002219000852889000005624CB79C86d0    -      -      1      1   132K  2.34K
    c3t6002219000852889000005654CB79CCCd0    -      -      1      1   131K  2.34K
    c3t6002219000852889000005684CB79D1Ed0    -      -      1      1   132K  2.33K
  c3t6B8AC6F0000F8376000005864DC9E9F1d0      0   928G      0     16    289  1.47M
--------------------------------------  -----  -----  -----  -----  -----  -----
root at nas-01:~#
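For reference, here is a rough way to check the per-vdev fill levels and to
confirm the tunable took effect (a sketch only: the nawk assumes the exact
column layout shown above and only looks at the raidz1 summary rows, and the
mdb line assumes the metaslab_df_free_pct symbol is unambiguous to mdb -k):

    # Confirm the /etc/system tunable took effect on the running kernel:
    echo 'metaslab_df_free_pct/D' | mdb -k

    # Rough per-vdev fill percentages from the zpool iostat -v columns:
    zpool iostat -v datapool | nawk '
      function bytes(s,  n, u) {            # convert 4.93T / 447G etc. to bytes
        n = s + 0; u = substr(s, length(s), 1)
        if (u == "T") n *= 1024^4
        else if (u == "G") n *= 1024^3
        else if (u == "M") n *= 1024^2
        else if (u == "K") n *= 1024
        return n
      }
      $1 == "raidz1" {                      # vdev summary rows: name, used, avail, ...
        used = bytes($2); avail = bytes($3)
        printf("raidz1 vdev %d: %.0f%% full\n", ++i, 100 * used / (used + avail))
      }'

On this pool that should confirm the roughly 78% / 93% split described above.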
Edward Ned Harvey
2011-Nov-09 14:05 UTC
[zfs-discuss] Data distribution not even between vdevs
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Ding Honghui
>
> But now, as shown below, the first 2 raidz1 vdevs are about 78% used and
> the last 2 raidz1 vdevs are about 93% used.

In this case, when you write, it should be writing to the first two vdevs,
not the last two. So the fact that the last two are over 93% full should be
irrelevant in terms of write performance.

> All my files are small, about 150KB each.

That's too bad. Raidz performs well with large sequential data, and
performs poorly with small random files.

> Now the questions are:
> 1. Should I balance the data between the vdevs by copying the data and
> then removing the data that lives on the last 2 vdevs?

If you want to. But most people wouldn't bother, especially since you're
talking about 78% versus 93%. It's difficult to balance it so *precisely*
as to get them both around 85%.

> 2. Is there any method to automatically re-balance the data?

There is no automatic way to do it.

> Or is there any better solution to this problem?

I would recommend, if possible, re-creating your pool as a bunch of mirrors
instead of raidz. It will perform better, but it will cost hardware. Also,
if you have compressible data, then enabling compression gains both
performance and available disk space.
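Concretely, the copy-and-remove approach might look something like this (a
sketch only: the dataset name datapool/data is made up, rsync could just as
well be cp -pr or zfs send | zfs recv, and it assumes you have room for two
copies at once and don't mind losing snapshots of the old dataset):

    # Hypothetical dataset name; rewritten blocks will mostly land on the
    # emptier vdevs because the allocator favours them.
    zfs create datapool/data_new
    rsync -a /datapool/data/ /datapool/data_new/
    # Verify the copy, then swap the datasets:
    zfs destroy -r datapool/data
    zfs rename datapool/data_new datapool/data

    # The compression suggestion (affects newly written data only):
    zfs set compression=on datapool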
Gregg Wonderly
2011-Nov-09 15:09 UTC
[zfs-discuss] Data distribution not even between vdevs
On 11/9/2011 8:05 AM, Edward Ned Harvey wrote:
>> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
>> bounces at opensolaris.org] On Behalf Of Ding Honghui
>>
>> But now, as shown below, the first 2 raidz1 vdevs are about 78% used and
>> the last 2 raidz1 vdevs are about 93% used.
>
> In this case, when you write, it should be writing to the first two vdevs,
> not the last two. So the fact that the last two are over 93% full should
> be irrelevant in terms of write performance.
>
>> All my files are small, about 150KB each.
>
> That's too bad. Raidz performs well with large sequential data, and
> performs poorly with small random files.
>
>> Now the questions are:
>> 1. Should I balance the data between the vdevs by copying the data and
>> then removing the data that lives on the last 2 vdevs?
>
> If you want to. But most people wouldn't bother, especially since you're
> talking about 78% versus 93%. It's difficult to balance it so *precisely*
> as to get them both around 85%.
>
>> 2. Is there any method to automatically re-balance the data?
>
> There is no automatic way to do it.

For me, this is a key issue. If there was an automatic rebalancing
mechanism, that same mechanism would work perfectly to allow pools to have
disk sets removed. It would provide the needed basic mechanism of just
moving stuff around to eliminate the use of a particular part of the pool
that you wanted to remove.

Gregg Wonderly
Edward Ned Harvey
2011-Nov-10 13:31 UTC
[zfs-discuss] Data distribution not even between vdevs
> From: Gregg Wonderly [mailto:greggwon at gmail.com]
>
> > There is no automatic way to do it.
>
> For me, this is a key issue. If there was an automatic rebalancing
> mechanism, that same mechanism would work perfectly to allow pools to
> have disk sets removed. It would provide the needed basic mechanism of
> just moving stuff around to eliminate the use of a particular part of
> the pool that you wanted to remove.

Search this list for bp_rewrite. There are many features that depend on it:
rebalance, defrag, vdev removal, toggling compression or dedup for existing
data, etc. It has long been requested by many people, but apparently it is
fundamentally difficult to do, or something.