I am not sure if this is a ZFS issue, a Niagara issue, or something else.
Does someone know why the commands below have the latency shown?

*1) do an ls of a directory. 6.9 seconds total; truss only shows .07 seconds.*

itsm-mpk-2% timex truss -fdD -o truss.out ls | wc

real        6.94
user        0.01
sys         0.05
     639     639   37247
itsm-mpk-2% tail truss.out
27449:   0.0714  0.0002 fstat64(1, 0xFFBFDBF8)                          = 0
27449:   0.0720  0.0006 write(1, " C . 2 0 0 6 0 6 3 0 2 2".., 5120)    = 5120
27449:   0.0726  0.0006 write(1, " g z\n C . 2 0 0 6 0 7 1".., 5120)    = 5120
27449:   0.0733  0.0007 write(1, " n . C O M . t a r . g z".., 5120)    = 5120
27449:   0.0739  0.0006 write(1, " . t a r . g z\n C . 2 0".., 5120)    = 5120
27449:   0.0745  0.0006 write(1, " 1 0 9 . s r 1 - e h d b".., 5120)    = 5120
27449:   0.0752  0.0007 write(1, " 2 - 0 1 . i s t a n b u".., 5120)    = 5120
27449:   0.0758  0.0006 write(1, " p e 0 6 - 0 5 . s u n r".., 5120)    = 5120
27449:   0.0763  0.0005 write(1, " c s y d 0 7 - 0 2 . S M".., 1407)    = 1407
27449:   0.0766  0.0003 _exit(0)

*2) do a pwd. 4.7 seconds total; truss only shows .18 seconds.*

itsm-mpk-2% timex truss -fdD -o truss.out pwd
/upload/canary/incoming

real        4.75
user        0.02
sys         0.06
itsm-mpk-2% !tail
tail truss.out
19422:   0.1802  0.0003 close(3)                                        = 0
19422:   0.1820  0.0018 munmap(0xFF3A0000, 8192)                        = 0
19422:   0.1824  0.0004 mmap(0x00010000, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFF3A0000
19422:   0.1830  0.0006 getcontext(0xFFBFF788)
19422:   0.1832  0.0002 getrlimit(RLIMIT_STACK, 0xFFBFF768)             = 0
19422:   0.1835  0.0003 getpid()                                        = 19422 [18587]
19422:   0.1838  0.0003 setustack(0xFF3A2088)
19422:   0.1843  0.0005 getcwd("/upload/canary/incoming", 1025)         = 0
19422:   0.1849  0.0006 write(1, " / u p l o a d / c a n a".., 24)      = 24
19422:   0.1854  0.0005 _exit(0)

*Try it on a normal Sun Ray server: 0.21 secs instead of 4.71 secs.*

sr1-ubrm-20% timex truss -fdD -o truss.out pwd
/home/smeighan

real        0.21
user        0.01
sys         0.02

*3) do an uptime. 4.7 seconds total; truss only shows .68 seconds.*

timex truss -fdD -o truss.out uptime
  9:49am  up 11 day(s), 16:38,  4 users,  load average: 1.96, 2.39, 2.11

real        4.70
user        0.01
sys         0.04
itsm-mpk-2% !tail
tail truss.out
23966:   0.6823  0.0002 mmap(0x00000000, 8192, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFF380000
23966:   0.6826  0.0003 sysinfo(SI_ISALIST, "", 1)                      = 115
23966:   0.6829  0.0003 sysinfo(SI_ISALIST, "sparcv9 sparcv8plus sparcv8 sparcv8-fsmuld sparcv7 sparc sparcv9+vis sparcv9+vis2 sparcv8plus+vis sparcv8plus+vis2", 115) = 115
23966:   0.6834  0.0005 open("/proc/self/auxv", O_RDONLY)               = 3
23966:   0.6837  0.0003 fstat(3, 0xFFBFFB00)                            = 0
23966:   0.6839  0.0002 read(3, "\0\007D8FFBFFFD8\0\007DE".., 152)      = 152
23966:   0.6842  0.0003 close(3)                                        = 0
23966:   0.6844  0.0002 access("/usr/bin/sparcv9/uptime", X_OK)         = 0
23966:   0.6847  0.0003 execve("/usr/bin/sparcv9/uptime", 0xFFBFFCB4, 0xFFBFFCBC)
23966:  *** cannot trace across exec() of /usr/bin/sparcv9/uptime ***

It acts like the command is not getting scheduled right away.
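If it really is scheduling, I suppose the sched provider ought to show the
run-queue wait directly. Here is a minimal sketch, adapted from the
enqueue/dequeue example in the Solaris Dynamic Tracing Guide (I have not
tried it here yet; note that until the exec() the child is still named
after the shell, so this only catches post-exec scheduling):

#!/usr/sbin/dtrace -qs

/* time from run-queue insertion to removal for processes named "pwd" */
sched:::enqueue
/args[1]->pr_fname == "pwd"/
{
        wait[args[0]->pr_lwpid, args[1]->pr_pid] = timestamp;
}

sched:::dequeue
/wait[args[0]->pr_lwpid, args[1]->pr_pid]/
{
        @["run-queue wait (ns)"] =
            quantize(timestamp - wait[args[0]->pr_lwpid, args[1]->pr_pid]);
        wait[args[0]->pr_lwpid, args[1]->pr_pid] = 0;
}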
We don't have a lot of load on this box:

itsm-mpk-2% vmstat 1 11
 kthr      memory            page            disk          faults      cpu
 r b w   swap   free  re   mf  pi po fr de sr s1 s2 -- --   in    sy   cs us sy id
 0 0 0 1298392 660592 338 2708 54  2  2  0  5 408  0  0  0 1754  7387 2113  2  6 91
 0 0 0  430712 390312  25  446  0  0  0  0  0 440  0  0  0 2037  3087 2369  1  2 97
 0 0 0  430528 390136 203 1345  0  0  0  0  0 419  0  0  0 1998  5147 2383  1  3 96
 0 0 0  428024 387584 111 1594  0  0  0  0  0 392  0  0  0 1846  4102 2279  0  3 96
 0 0 0  428104 387488 152 2284  0  0  0  0  0 391  0  0  0 2149  6195 2718  2  4 94
 0 0 0  428048 390240 109 1697  0  0  0  0  0 366  0  0  0 1956  4711 2477  1  3 96
 0 0 0  426560 385792 242 2519 32  0  0  0  0 337  0  0  0 1896  5116 2326  1  5 94
 1 0 0  425648 384992 136 1603  0  0  0  0  0 327  0  0  0 1540  3444 1805  1  5 94
 1 0 0  426336 385584  25  349  0  0  0  0  0 328  0  0  0 1356  1107 1527  0  4 96
 1 0 0  425952 385192  77  711  0  0  0  0  0 322  0  0  0 1414  1785 1598  0  5 95
 1 0 0  425464 384488 249 2263  0  0  0  0  0 347  0  0  0 1855  4636 2264  0  6 94

When some of our batch processes run, we still do not use much of the box:

itsm-mpk-2% vmstat 1 11
 kthr      memory            page            disk          faults      cpu
 r b w   swap    free   re    mf pi po fr de sr s1 s2 -- --   in    sy   cs us sy id
 0 0 0 1299456  660600  339  2707 54  2  2  0  5 408  0  0  0 1753  7388 2113  2  6 91
 0 0 0 1086392 1046240  153  2501  0  0  0  0  0 445  0  0  0 1770 10175 2080  5  4 91
 0 0 0 1074800 1034608  230  3342  0  0  0  0  0 413  0  0  0 1833  6621 2243  1  4 95
 0 0 0 1066920 1029512  474  6529  0  0  0  0  0 493  0  0  0 1931 25615 2372 10 10 80
 0 0 0 1021792  987048  771 13479  0  0  0  0  0 481  0  0  0 2342 37421 3094 29 20 51
 0 0 0  991712  964256 1083 20993  0  0  0  0  0 461  0  0  0 2601 39540 3904  5 30 65
 0 0 0  985880  953208 1125 22201  8  0  0  0  0 430  0  0  0 2695 49500 3996 12 35 54
 0 0 0  971920  936512 1295 26029  0  0  0  0  0 435  0  0  0 2753 48871 4311 10 38 52
 0 0 0  946088  923368 1117 24737  0  0  0  0  0 440  0  0  0 3835 46394 6793  6 36 59
 0 0 0  935304  914856 1029 22261  0  0  0  0  0 418  0  0  0 3836 45741 6734  5 34 61
 0 0 0  940704  902776 1145 25996  0  0  0  0  0 427  0  0  0 3724 46909 6476  6 42 51

The box is a T2000 (32 x 1000 MHz hardware threads, 8 GB of memory)
running Solaris 10 with ZFS. We do have root on this box, so we can run
DTrace. I have included the output from dexplorer.

thanks
sean

PS: I have copied the internal Niagara alias as a BCC since I am including
an external email alias.

(Attachment: de_itsm-mpk-2_200607101001.tar.gz, 38381 bytes - dexplorer output.)
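I can also watch per-thread microstates while this happens. If I
understand prstat(1M) right, the LAT column in microstate mode is the
percentage of time a thread spends waiting for a CPU, so something like
this (untried so far) should say whether anything is sitting on a run
queue:

itsm-mpk-2% prstat -mL 1     # -m: microstate columns, -L: one line per LWP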
Michael Schuster - Sun Microsystems
2006-Jul-11 08:13 UTC
[zfs-discuss] long time to schedule commands
Sean Meighan wrote:
> I am not sure if this is a ZFS issue, a Niagara issue, or something
> else. Does someone know why the commands below have the latency shown?
>
> *1) do an ls of a directory. 6.9 seconds total; truss only shows .07
> seconds.*

[...]

this may be an issue with your $PATH. Do you see the same behaviour if
you use absolute paths for the commands?

HTH
Michael
--
Michael Schuster                        (+49 89) 46008-2974 / x62974
visit the online support center:        http://www.sun.com/osc/

Recursion, n.: see 'Recursion'
Michael Schuster - Sun Microsystems wrote:
> Sean Meighan wrote:
>> I am not sure if this is a ZFS issue, a Niagara issue, or something
>> else. Does someone know why the commands below have the latency shown?
>>
>> *1) do an ls of a directory. 6.9 seconds total; truss only shows .07
>> seconds.*
>
> [...]
>
> this may be an issue with your $PATH. Do you see the same behaviour if
> you use absolute paths for the commands?

It could also be an interaction with your shell. ksh, for example, will
write to your $HISTFILE before the exec. If it takes a long time to
update your $HISTFILE, then it will appear as if the command itself is
slow. truss -d on the shell will show this conclusively.
 -- richard
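For example, from another window (a rough sketch - substitute the pid of
the suspect shell, then run "pwd" in the traced shell and look for large
time deltas around any $HISTFILE writes):

% truss -fd -o /tmp/shell.out -p <shell-pid>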
Richard Elling wrote:
> Michael Schuster - Sun Microsystems wrote:
>> Sean Meighan wrote:
>>> I am not sure if this is a ZFS issue, a Niagara issue, or something
>>> else. Does someone know why the commands below have the latency shown?
>>>
>>> *1) do an ls of a directory. 6.9 seconds total; truss only shows .07
>>> seconds.*
>>
>> [...]
>>
>> this may be an issue with your $PATH. Do you see the same behaviour if
>> you use absolute paths for the commands?
>
> It could also be an interaction with your shell. ksh, for example, will
> write to your $HISTFILE before the exec. If it takes a long time to
> update your $HISTFILE, then it will appear as if the command itself is
> slow. truss -d on the shell will show this conclusively.

ksh doesn't just write to $HISTFILE - ksh fsync()s it, and thus one may
be tripping over:

        6413510 zfs: writing to ZFS filesystem slows down fsync() on
        other files in the same FS

Certainly, I initially found this issue as a result of sluggishness at
the ksh prompt.

Dana
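A quick way to check for slow fsync()s here - a minimal sketch, assuming
fsync(3C) enters the kernel as the fdsync syscall, as it does on
Solaris 10:

#!/usr/sbin/dtrace -qs

/* distribution of fsync()/fdsync() latency, keyed by calling process */
syscall::fdsync:entry
{
        self->ts = timestamp;
}

syscall::fdsync:return
/self->ts/
{
        @["fsync latency (ns)", execname] = quantize(timestamp - self->ts);
        self->ts = 0;
}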
I made sure the PATH is clean, and I also fully qualified the paths. The
time varies from 0.5 seconds to 15 seconds. If I just do a "timex pwd",
it always seems to be fast. We are using csh.

itsm-mpk-2% env
HOME=/app/canary
PATH=/usr/bin:/usr/local/bin:/usr/sbin
LOGNAME=canary
HZ=100
TERM=xterm
TZ=US/Pacific
SHELL=/bin/csh
MAIL=/var/mail/canary
DISPLAY=sr1-ubrm-20:55.0
PWD=/app/canary/data/incoming
USER=canary
JAVA_HOME=/usr/jdk/instances/jdk1.5.0
TOOLS=/app/tools
LD_LIBRARY_PATH=
VIM=/app/tools/vim70/share/vim/vim70
CVSROOT=:pserver:canary@cvs.sfbay:/export/cvs/cvsroot
EDITOR=vi
ENV=/app/canary/.kshrc
LD_LIBRARY_PATH=
PATH=/usr/bin:/usr/local/bin:/usr/ccs/bin:/usr/sbin:/usr/openwin/bin

itsm-mpk-2% /bin/timex /bin/truss -fdD -o truss.out pwd
/upload/canary/incoming

real       13.57
user        0.01
sys         0.03

itsm-mpk-2% tail truss.out
26078:   0.0541  0.0001 close(3)                                        = 0
26078:   0.0553  0.0012 munmap(0xFF3A0000, 8192)                        = 0
26078:   0.0556  0.0003 mmap(0x00010000, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFF3A0000
26078:   0.0560  0.0004 getcontext(0xFFBFF7F8)
26078:   0.0562  0.0002 getrlimit(RLIMIT_STACK, 0xFFBFF7D8)             = 0
26078:   0.0563  0.0001 getpid()                                        = 26078 [25982]
26078:   0.0565  0.0002 setustack(0xFF3A2088)
26078:   0.0568  0.0003 getcwd("/upload/canary/incoming", 1025)         = 0
26078:   0.0573  0.0005 write(1, " / u p l o a d / c a n a".., 24)      = 24
26078:   0.0576  0.0003 _exit(0)

Michael Schuster - Sun Microsystems wrote:
> Sean Meighan wrote:
>> I am not sure if this is a ZFS issue, a Niagara issue, or something
>> else. Does someone know why the commands below have the latency shown?
>>
>> *1) do an ls of a directory. 6.9 seconds total; truss only shows .07
>> seconds.*
>
> [...]
>
> this may be an issue with your $PATH. Do you see the same behaviour if
> you use absolute paths for the commands?
>
> HTH
> Michael

--
Sean Meighan, Mgr ITSM Engineering, Sun Microsystems, Inc.
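Since truss shows almost nothing happening inside pwd itself, maybe the
gap is before the exec? A proc-provider sketch (untried - "csh" is
assumed as the parent shell) that times the shell's fork-to-exec window:

#!/usr/sbin/dtrace -qs

/* how long does a child of csh take to get from fork() to exec()? */
proc:::create
/execname == "csh"/
{
        birth[args[0]->pr_pid] = timestamp;
}

proc:::exec-success
/birth[pid]/
{
        printf("%d: fork -> exec(%s) took %d us\n",
            pid, curpsinfo->pr_fname, (timestamp - birth[pid]) / 1000);
        birth[pid] = 0;
}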
Sean Meighan wrote:
> I made sure the PATH is clean, and I also fully qualified the paths.
> The time varies from 0.5 seconds to 15 seconds. If I just do a
> "timex pwd", it always seems to be fast. We are using csh.

Here's a simple dscript to figure out how long each syscall is taking:

#!/usr/sbin/dtrace -FCs

syscall:::entry
/execname == "pwd"/
{
        @count[probefunc] = count();
        self->time = timestamp;
}

syscall:::return
/self->time/
{
        @avg[probefunc] = avg(timestamp - self->time);
        @sum[probefunc] = sum(timestamp - self->time);
        self->time = 0;
}

END
{
        printf("\nCOUNT\n");
        printa(@count);
        printf("\n\n\n\nAVG TIME\n");
        printa(@avg);
        printf("\n\n\n\nSUM TIME\n");
        printa(@sum);
}

Then you can run this like:

fsh-hake# /usr/bin/time dtrace -Zs dscript.d -c '/usr/bin/pwd'

Make sure to path out 'pwd' (as opposed to using your shell's built-in
command). If it isn't 'pwd', then you can see if 'time' is taking a long
time too....

eric