Hello Robert,
Friday, April 13, 2007, 2:07:11 AM, you wrote:
RM> Hello Enda,
RM> Thursday, April 12, 2007, 2:36:39 PM, you wrote:
EOCSMSI>> Robert Milkowski wrote:>>> Hello Enda,
>>>
>>> Wednesday, April 11, 2007, 4:21:35 PM, you wrote:
>>>
>>> EOCSMSI> Robert Milkowski wrote:
>>>>> Hello zfs-discuss,
>>>>>
>>>>> In order to get IDR126199-01 I need to install 120473-05
first.
>>>>> I can get 120473-07 but everything more than -05 is
marked as
>>>>> incompatible with IDR126199-01 so I do not want to force
it.
>>>>>
>>>>> Local Sun''s support has problems with getting
120473-05 also so I''m
>>>>> stuck for now and I would really like to get that IDR
running.
>>>>>
>>>>> Can someone help?
>>>>>
>>>
>>> EOCSMSI> Hi
>>> EOCSMSI> This patch will be on SunSolve possibly later today,
tomorrow at latest
>>> EOCSMSI> I suspect as it has only justed being pushed out from
testing.
>>> EOCSMSI> I have sent the patch in another mail for now.
>>>
>>> Thank you patch - it worked (installed) along with IDR properly.
>>>
>>> However it seems like the problem is not solved by IDR :(
>>>
EOCSMSI>> Hi Robert
EOCSMSI>> So this IDR has two bugs as fixed
EOCSMSI>> 6458218 assertion failed: ss == NULL
EOCSMSI>> 6495013 Loops and recursion in metaslab_ff_alloc can kill
performance,
EOCSMSI>> even on a pool with lots of free data
EOCSMSI>> I have add''ed the IDR''s requestors as they can
comment, which one of the
EOCSMSI>> above fixes was not solved via this IDR in your testing.
RM> I''m talking about 6495013 Loops and recursion in
metaslab_ff_alloc can
RM> kill performance.
RM> I''ve got to test again with dtrace during peak hours if
recursion
RM> still happens (on that scale) or maybe it''s not. The problem is
that
RM> during peak hours I still (again) can see performance problems. At
RM> first glance it looks like this time CPU is probably NOT a problem, so
RM> it would seem as 6495013 is indeed fixed and the problem is somewhere
RM> else. As I send|recv all file systems to another server and than back
RM> to the same server (after re-creating pool) before applying IDR it
RM> solved problem with a performance for some time. Not it again is
RM> getting worse day by day, and the IDR seems not to help. So perhaps
RM> there is a problem with zfs data fragmentation - that''s why zfs
RM> sedn|recv helped and 6495013 made it only worse.
RM> ??
RM> Do you have any specific dtrace script in mind to test if 6495013 is
RM> solved or should I use the same as in the case?
bash-3.00# uname -a
SunOS nfs-10-1.srv 5.10 Generic_125100-04 sun4u sparc SUNW,Sun-Fire-V440
bash-3.00# showrev -p|grep IDR
Patch: IDR126199-01 Obsoletes: Requires: 120473-05 Incompatibles: 120473-06
Packages: SUNWzfskr
bash-3.00#
Right now this is during off hours and while I see no problems on
clients it seems much slower then similar configs on UFS
(zil_disable=1). There''s no much writes.
I''ll send later output during peak hours.
For example:
bash-3.00# ptime sync
real 1:29.199
user 0.001
sys 0.032
bash-3.00#
bash-3.00# dtrace -n profile-97''{@[stack()]=count();}'' -n
END''{trunc(@,10);printa(@);}''
dtrace: description ''profile-97'' matched 1 probe
dtrace: description ''END'' matched 1 probe
^C
CPU ID FUNCTION:NAME
3 2 :END
unix`i_ddi_splhigh+0x14
unix`disp_getwork+0x38
unix`idle+0xd4
unix`thread_start+0x4
116
unix`disp_getwork+0x1a8
unix`idle+0xd4
unix`thread_start+0x4
117
unix`idle+0x12c
unix`thread_start+0x4
124
unix`i_ddi_splx
unix`disp_getwork+0x160
unix`idle+0xd4
unix`thread_start+0x4
127
unix`disp_getwork+0x10c
unix`idle+0xd4
unix`thread_start+0x4
128
zfs`space_map_seg_compare+0x4
genunix`avl_find+0x2c
zfs`space_map_add+0x12c
zfs`space_map_load+0x218
zfs`metaslab_activate+0x3c
zfs`metaslab_group_alloc+0x1d4
zfs`metaslab_alloc_dva+0x114
zfs`metaslab_alloc+0x2c
zfs`zio_dva_allocate+0x4c
zfs`zio_write_compress+0x1ec
zfs`arc_write+0xe4
zfs`dbuf_sync+0x6c0
zfs`dnode_sync+0x35c
zfs`dmu_objset_sync_dnodes+0x6c
zfs`dmu_objset_sync+0x54
zfs`dsl_dataset_sync+0xc
zfs`dsl_pool_sync+0x64
zfs`spa_sync+0x1b0
zfs`txg_sync_thread+0x134
unix`thread_start+0x4
128
unix`disp_getwork+0x90
unix`idle+0xd4
unix`thread_start+0x4
132
unix`disp_getwork+0x7c
unix`idle+0xd4
unix`thread_start+0x4
139
zfs`space_map_seg_compare
genunix`avl_find+0x2c
zfs`space_map_add+0x12c
zfs`space_map_load+0x218
zfs`metaslab_activate+0x3c
zfs`metaslab_group_alloc+0x1d4
zfs`metaslab_alloc_dva+0x114
zfs`metaslab_alloc+0x2c
zfs`zio_dva_allocate+0x4c
zfs`zio_write_compress+0x1ec
zfs`arc_write+0xe4
zfs`dbuf_sync+0x6c0
zfs`dnode_sync+0x35c
zfs`dmu_objset_sync_dnodes+0x6c
zfs`dmu_objset_sync+0x54
zfs`dsl_dataset_sync+0xc
zfs`dsl_pool_sync+0x64
zfs`spa_sync+0x1b0
zfs`txg_sync_thread+0x134
unix`thread_start+0x4
140
unix`i_ddi_splx+0x1c
unix`disp_getwork+0x160
unix`idle+0xd4
unix`thread_start+0x4
318
bash-3.00#
bash-3.00# ./avl_impact_on_zfs.d
^C
Loops in metaslab_ff_alloc
value ------------- Distribution ------------- count
-1 | 0
0 |@@@@@@@@@@@@@@@@@@@@@@@ 2249
1 |@@@@@ 485
2 |@@@ 334
4 |@@ 241
8 |@@ 170
16 |@ 125
32 |@ 76
64 |@ 61
128 | 36
256 | 19
512 | 15
1024 | 8
2048 | 8
4096 | 5
8192 | 6
16384 | 9
32768 | 3
65536 | 5
131072 | 5
262144 | 6
524288 | 0
Time per avl_walk in ns
value ------------- Distribution ------------- count
-1 | 0
0 | 138
1 | 0
2 | 0
4 | 0
8 | 0
16 | 0
32 | 0
64 |@@@@@@@@@@@@@@@@ 1523563
128 |@@@@@ 527718
256 |@@@@@@@@@ 860982
512 |@@@@@@@ 696676
1024 |@@@ 254440
2048 | 25446
4096 | 643
8192 | 603
16384 | 1368
32768 | 292
65536 | 23
131072 | 0
Loops in avl_find
value ------------- Distribution ------------- count
-1 | 0
0 | 86
1 | 188
2 | 1100
4 | 2178
8 |@ 5261
16 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 288941
32 | 0
Time per avl_find in ns
value ------------- Distribution ------------- count
64 | 0
128 | 27
256 | 104
512 | 591
1024 |@ 5094
2048 |@@@@@@@@@@@@@@@@@@@ 142584
4096 |@@@@@@@@@@@@@@@@@@@@ 146899
8192 | 1277
16384 | 685
32768 | 480
65536 | 12
131072 | 0
262144 | 0
524288 | 1
1048576 | 0
Time per avl_insert in ns
value ------------- Distribution ------------- count
32 | 0
64 |@@@@@@@@@@@@@ 727720
128 |@@@@@@@@@@@@@@@@@ 906701
256 |@@@@@@@@@@ 556764
512 | 1213
1024 | 122
2048 | 122
4096 | 26
8192 | 40
16384 | 257
32768 | 73
65536 | 5
131072 | 0
Loops in metaslab_ff_alloc
value ------------- Distribution ------------- count
-1 | 0
0 |@@@@@@@@@@@@@@@@@@@@@@@ 2249
1 |@@@@@ 485
2 |@@@ 334
4 |@@ 241
8 |@@ 170
16 |@ 125
32 |@ 76
64 |@ 61
128 | 36
256 | 19
512 | 15
1024 | 8
2048 | 8
4096 | 5
8192 | 6
16384 | 9
32768 | 3
65536 | 5
131072 | 5
262144 | 6
524288 | 0
Time per avl_walk in ns
value ------------- Distribution ------------- count
-1 | 0
0 | 138
1 | 0
2 | 0
4 | 0
8 | 0
16 | 0
32 | 0
64 |@@@@@@@@@@@@@@@@ 1523563
128 |@@@@@ 527718
256 |@@@@@@@@@ 860982
512 |@@@@@@@ 696676
1024 |@@@ 254440
2048 | 25446
4096 | 643
8192 | 603
16384 | 1368
32768 | 292
65536 | 23
131072 | 0
Time spent in avl_find : 1284633600 ns
Time spent in avl_insert : 445301000 ns
Time spent in avl_walk : 1620582500 ns
Elapsed time : 65060492600 ns
#### avl script
bash-3.00# cat avl_impact_on_zfs.d
#!/usr/sbin/dtrace -s
#pragma D option quiet
BEGIN
{
self->in_metaslab = 0;
self->in_space_map_add = 0;
self->in_space_map_remove = 0;
Start = timestamp;
TotalTimeIn_avl_walk = 0;
TotalTimeIn_avl_find = 0;
TotalTimeIn_avl_insert = 0;
}
fbt::metaslab_ff_alloc:entry
/self->in_metaslab == 0/
{
self->in_metaslab = 1;
self->loopcount = 0;
self->metaslab_alloc_start = timestamp;
}
fbt::space_map_add:entry
/self->in_space_map_add == 0/
{
self->in_space_map_add = 1;
}
fbt::avl_walk:entry
/self->in_metaslab/
{
self->loopcount++;
self->avl_walk_start_time = vtimestamp;
}
fbt::avl_walk:return
/self->in_metaslab/
{
@avl_walk_time["Time per avl_walk in ns"] =
quantize(vtimestamp - self->avl_walk_start_time);
TotalTimeIn_avl_walk += vtimestamp - self->avl_walk_start_time;
}
fbt::avl_insert:entry
/self->in_space_map_remove/
{
self->avl_insert_start_time = vtimestamp;
}
fbt::avl_insert:return
/self->in_space_map_remove/
{
@avl_insert_time["Time per avl_insert in ns"] =
quantize(vtimestamp - self->avl_insert_start_time);
TotalTimeIn_avl_insert += vtimestamp - self->avl_insert_start_time;
}
fbt::avl_find:entry
/self->in_space_map_add/
{
self->loops_in_avl_find = 0;
self->avl_find_start_time = vtimestamp;
}
fbt::space_map_seg_compare:entry
/self->in_space_map_add/
{
self->loops_in_avl_find++;
}
fbt::avl_find:return
/self->in_space_map_add/
{
@avl_find_loops["Loops in avl_find"] =
quantize(self->loops_in_avl_find);
@avl_find_time["Time per avl_find in ns"] =
quantize(vtimestamp - self->avl_find_start_time);
TotalTimeIn_avl_find += vtimestamp - self->avl_find_start_time;
}
fbt::space_map_add:return
/self->in_space_map_add/
{
self->in_space_map_add = 0;
}
fbt::metaslab_ff_alloc:return
/self->in_metaslab/
{
self->in_metaslab = 0;
@loops["Loops in metaslab_ff_alloc"] =
quantize(self->loopcount);
self->loopcount = 0;
}
fbt::space_map_remove:entry
/self->in_space_map_remove == 0/
{
self->in_space_map_remove = 1;
}
fbt::space_map_remove:return
/self->in_space_map_remove/
{
self->in_space_map_remove = 0;
}
END {
printa(@loops);
printa(@avl_find_loops);
printa(@avl_find_time);
printa(@avl_insert_time);
printa(@avl_walk_time);
printf ("\tTime spent in avl_find\t\t: %d ns\n",
TotalTimeIn_avl_find);
printf ("\tTime spent in avl_insert\t: %d ns\n",
TotalTimeIn_avl_insert);
printf ("\tTime spent in avl_walk\t\t: %d ns\n",
TotalTimeIn_avl_walk);
printf ("\tElapsed time\t\t\t: %d ns\n",timestamp - Start);
}
--
Best regards,
Robert mailto:rmilkowski at task.gda.pl
http://milek.blogspot.com