thr3ads.net - dtrace discuss - [dtrace-discuss] dtrace wired issue [Jan 2008]

If this information is useful, please help other people find it:
Share via:

Aubrey Li

2008-Jan-14 12:29 UTC

[dtrace-discuss] dtrace wired issue

Every first time to run dtrace command after the system boot up,
It takes a very long time to get response.
But the second time is OK, as follows:

# time dtrace -l > /dev/null

real    4m8.011s
user    0m0.116s
sys     0m2.420s
# time dtrace -l > /dev/null

real    0m0.632s
user    0m0.075s
sys     0m0.553s

Any clue?

Thanks,
-Aubrey

Sean McGrath - Sun Microsystems Ireland

2008-Jan-14 12:52 UTC

head link

[dtrace-discuss] dtrace wired issue

Aubrey Li stated:
< Every first time to run dtrace command after the system boot up,
< It takes a very long time to get response.
< But the second time is OK, as follows:
< 
< # time dtrace -l > /dev/null
< 
< real    4m8.011s
< user    0m0.116s
< sys     0m2.420s

  This first time is probably when the kernel is loading the dtrace modules.
  Though still seems slow, 4 minutes.
    What kind of system (cpu speed etc) is the machine ?

< # time dtrace -l > /dev/null
< 
< real    0m0.632s
< user    0m0.075s
< sys     0m0.553s
< 

 And the second time, the dtrace modules are already loaded so less time.

< Any clue?
< 
< Thanks,
< -Aubrey

Regards,
-- 
Sean.
.

Aubrey Li

2008-Jan-14 13:16 UTC

head link

[dtrace-discuss] dtrace wired issue

On Jan 14, 2008 8:52 PM, Sean McGrath - Sun Microsystems Ireland
<sean.mcgrath at sun.com> wrote:> Aubrey Li stated:
> < Every first time to run dtrace command after the system boot up,
> < It takes a very long time to get response.
> < But the second time is OK, as follows:
> <
> < # time dtrace -l > /dev/null
> <
> < real    4m8.011s
> < user    0m0.116s
> < sys     0m2.420s
>
>   This first time is probably when the kernel is loading the dtrace
modules.
>   Though still seems slow, 4 minutes.
>     What kind of system (cpu speed etc) is the machine ?
# psrinfo -vp
The physical processor has 2 virtual processors (0 1)
  x86 (GenuineIntel 10674 family 6 model 23 step 4 clock 2400 MHz)
        Intel(r) CPU                  @ 2.40GHz

So, I failed to understand the modules loading needs 4 minutes.
>
> < # time dtrace -l > /dev/null
> <
> < real    0m0.632s
> < user    0m0.075s
> < sys     0m0.553s
> <
>
>  And the second time, the dtrace modules are already loaded so less time.
>
This makes sense to me.
> < Any clue?
> <
> < Thanks,
> < -Aubrey
>
> Regards,
> --
> Sean.
> .
>
Thanks,
-Aubrey

James C. McPherson

2008-Jan-14 13:22 UTC

head link

[dtrace-discuss] dtrace wired issue

Aubrey Li wrote:> On Jan 14, 2008 8:52 PM, Sean McGrath - Sun Microsystems Ireland
> <sean.mcgrath at sun.com> wrote:
>> Aubrey Li stated:
>> < Every first time to run dtrace command after the system boot up,
>> < It takes a very long time to get response.
>> < But the second time is OK, as follows:
>> <
>> < # time dtrace -l > /dev/null
>> <
>> < real    4m8.011s
>> < user    0m0.116s
>> < sys     0m2.420s
>>
>>   This first time is probably when the kernel is loading the dtrace
modules.
>>   Though still seems slow, 4 minutes.
>>     What kind of system (cpu speed etc) is the machine ?
> 
> # psrinfo -vp
> The physical processor has 2 virtual processors (0 1)
>   x86 (GenuineIntel 10674 family 6 model 23 step 4 clock 2400 MHz)
>         Intel(r) CPU                  @ 2.40GHz
> 
> So, I failed to understand the modules loading needs 4 minutes.

If you run "dtrace -l" with no args, *every* single loadable
module on the system will be loaded, interrogated by dtrace
and then unloaded if possible.

All those attach()es and detach()es need time, as does the
probe collation.



James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp	http://www.jmcp.homeunix.com/blog

Aubrey Li

2008-Jan-14 13:26 UTC

head link

[dtrace-discuss] dtrace wired issue

On Jan 14, 2008 9:22 PM, James C. McPherson <James.McPherson at sun.com>
wrote:> Aubrey Li wrote:
> > On Jan 14, 2008 8:52 PM, Sean McGrath - Sun Microsystems Ireland
> > <sean.mcgrath at sun.com> wrote:
> >> Aubrey Li stated:
> >> < Every first time to run dtrace command after the system boot
up,
> >> < It takes a very long time to get response.
> >> < But the second time is OK, as follows:
> >> <
> >> < # time dtrace -l > /dev/null
> >> <
> >> < real    4m8.011s
> >> < user    0m0.116s
> >> < sys     0m2.420s
> >>
> >>   This first time is probably when the kernel is loading the
dtrace modules.
> >>   Though still seems slow, 4 minutes.
> >>     What kind of system (cpu speed etc) is the machine ?
> >
> > # psrinfo -vp
> > The physical processor has 2 virtual processors (0 1)
> >   x86 (GenuineIntel 10674 family 6 model 23 step 4 clock 2400 MHz)
> >         Intel(r) CPU                  @ 2.40GHz
> >
> > So, I failed to understand the modules loading needs 4 minutes.
>
>
> If you run "dtrace -l" with no args, *every* single loadable
> module on the system will be loaded, interrogated by dtrace
> and then unloaded if possible.
>
> All those attach()es and detach()es need time, as does the
> probe collation.
>So may I ask, how long "dtrace -l" get response on your system?
And how fast the cpu speed on your system?

4 minutes, it is absolutely acceptable for me.

Thanks,
-Aubrey

Aubrey Li

2008-Jan-14 13:28 UTC

head link

[dtrace-discuss] dtrace wired issue

On Jan 14, 2008 9:26 PM, Aubrey Li <aubreylee at gmail.com>
wrote:>
> On Jan 14, 2008 9:22 PM, James C. McPherson <James.McPherson at
sun.com> wrote:
> > Aubrey Li wrote:
> > > On Jan 14, 2008 8:52 PM, Sean McGrath - Sun Microsystems Ireland
> > > <sean.mcgrath at sun.com> wrote:
> > >> Aubrey Li stated:
> > >> < Every first time to run dtrace command after the system
boot up,
> > >> < It takes a very long time to get response.
> > >> < But the second time is OK, as follows:
> > >> <
> > >> < # time dtrace -l > /dev/null
> > >> <
> > >> < real    4m8.011s
> > >> < user    0m0.116s
> > >> < sys     0m2.420s
> > >>
> > >>   This first time is probably when the kernel is loading the
dtrace modules.
> > >>   Though still seems slow, 4 minutes.
> > >>     What kind of system (cpu speed etc) is the machine ?
> > >
> > > # psrinfo -vp
> > > The physical processor has 2 virtual processors (0 1)
> > >   x86 (GenuineIntel 10674 family 6 model 23 step 4 clock 2400
MHz)
> > >         Intel(r) CPU                  @ 2.40GHz
> > >
> > > So, I failed to understand the modules loading needs 4 minutes.
> >
> >
> > If you run "dtrace -l" with no args, *every* single loadable
> > module on the system will be loaded, interrogated by dtrace
> > and then unloaded if possible.
> >
> > All those attach()es and detach()es need time, as does the
> > probe collation.
> >
> So may I ask, how long "dtrace -l" get response on your system?
> And how fast the cpu speed on your system?
>
> 4 minutes, it is absolutely acceptable for me.
>err, sorry, I mean unacceptable, ;-)

-Aubrey

max at bruningsystems.com

2008-Jan-14 13:33 UTC

head link

[dtrace-discuss] dtrace wired issue

Hi James,

James C. McPherson wrote:> Aubrey Li wrote:
>   
>> On Jan 14, 2008 8:52 PM, Sean McGrath - Sun Microsystems Ireland
>> <sean.mcgrath at sun.com> wrote:
>>     
>>> Aubrey Li stated:
>>> < Every first time to run dtrace command after the system boot
up,
>>> < It takes a very long time to get response.
>>> < But the second time is OK, as follows:
>>> <
>>> < # time dtrace -l > /dev/null
>>> <
>>> < real    4m8.011s
>>> < user    0m0.116s
>>> < sys     0m2.420s
>>>
>>>   This first time is probably when the kernel is loading the dtrace
modules.
>>>   Though still seems slow, 4 minutes.
>>>     What kind of system (cpu speed etc) is the machine ?
>>>       
>> # psrinfo -vp
>> The physical processor has 2 virtual processors (0 1)
>>   x86 (GenuineIntel 10674 family 6 model 23 step 4 clock 2400 MHz)
>>         Intel(r) CPU                  @ 2.40GHz
>>
>> So, I failed to understand the modules loading needs 4 minutes.
>>     
>
>
> If you run "dtrace -l" with no args, *every* single loadable
> module on the system will be loaded, interrogated by dtrace
> and then unloaded if possible.
>   Are you sure of this?  Why should it load everything?  (And then unload 
them if they''re not being used).
# dtrace -l | wc -l
   88939
# modload dedump
# dtrace -l | wc -l
   88963

I get different counts.  I think dtrace only looks at what is currently 
on the system.  But 4 minutes???

max
> All those attach()es and detach()es need time, as does the
> probe collation.
>
>
>
> James C. McPherson
> --
> Senior Kernel Software Engineer, Solaris
> Sun Microsystems
> http://blogs.sun.com/jmcp	http://www.jmcp.homeunix.com/blog
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss at opensolaris.org
>
>

James C. McPherson

2008-Jan-14 13:38 UTC

head link

[dtrace-discuss] dtrace wired issue

Aubrey Li wrote:> On Jan 14, 2008 9:26 PM, Aubrey Li <aubreylee at gmail.com> wrote:
>> On Jan 14, 2008 9:22 PM, James C. McPherson <James.McPherson at
sun.com> wrote:
....>>> If you run "dtrace -l" with no args, *every* single
loadable
>>> module on the system will be loaded, interrogated by dtrace
>>> and then unloaded if possible.
>>>
>>> All those attach()es and detach()es need time, as does the
>>> probe collation.
>>>
>> So may I ask, how long "dtrace -l" get response on your
system?
>> And how fast the cpu speed on your system?
>>
>> 4 minutes, it is absolutely acceptable for me.
>>
> err, sorry, I mean unacceptable, ;-)
(figured that :)

*** On a u20m2

$ uname -a
SunOS farnarkle 5.11 snv_77 i86pc i386 i86pc
$ su root -c ''time dtrace -l > /dev/null''
Password:

real        2.2
user        0.1
sys         1.0
$ psrinfo -v
Status of virtual processor 0 as of: 01/14/2008 23:32:17
   on-line since 01/08/2008 09:11:53.
   The i386 processor operates at 2200 MHz,
         and has an i387 compatible floating point processor.
Status of virtual processor 1 as of: 01/14/2008 23:32:17
   on-line since 01/08/2008 09:11:56.
   The i386 processor operates at 2200 MHz,
         and has an i387 compatible floating point processor.



*** on a u20

# time dtrace -l > /dev/null

real    0m0.219s
user    0m0.063s
sys     0m0.089s
# psrinfo -v
Status of virtual processor 0 as of: 01/14/2008 23:33:12
   on-line since 01/14/2008 22:43:46.
   The i386 processor operates at 2412 MHz,
         and has an i387 compatible floating point processor.
Status of virtual processor 1 as of: 01/14/2008 23:33:12
   on-line since 01/14/2008 22:43:50.
   The i386 processor operates at 2412 MHz,
         and has an i387 compatible floating point processor.

# uname -a
SunOS crashburn 5.11 snv_80 i86pc i386 i86pc





*** on a u60

# time dtrace -l > /dev/null

real        4.4
user        0.9
sys         3.2
# psrinfo -v
Status of virtual processor 0 as of: 01/14/2008 23:47:33
   on-line since 09/14/2007 21:44:39.
   The sparcv9 processor operates at 296 MHz,
         and has a sparcv9 floating point processor.
Status of virtual processor 2 as of: 01/14/2008 23:47:33
   on-line since 09/14/2007 21:44:41.
   The sparcv9 processor operates at 296 MHz,
         and has a sparcv9 floating point processor.

# uname -a
SunOS synergise 5.11 snv_71 sun4u sparc SUNW,Ultra-60




You seem to be convinced that cpu is the limiting
factor. Why?



James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp	http://www.jmcp.homeunix.com/blog

Aubrey Li

2008-Jan-14 13:43 UTC

head link

[dtrace-discuss] dtrace wired issue

On Jan 14, 2008 9:38 PM, James C. McPherson <James.McPherson at sun.com>
wrote:> Aubrey Li wrote:
> > On Jan 14, 2008 9:26 PM, Aubrey Li <aubreylee at gmail.com>
wrote:
> >> On Jan 14, 2008 9:22 PM, James C. McPherson <James.McPherson at
sun.com> wrote:
> ....
> >>> If you run "dtrace -l" with no args, *every* single
loadable
> >>> module on the system will be loaded, interrogated by dtrace
> >>> and then unloaded if possible.
> >>>
> >>> All those attach()es and detach()es need time, as does the
> >>> probe collation.
> >>>
> >> So may I ask, how long "dtrace -l" get response on your
system?
> >> And how fast the cpu speed on your system?
> >>
> >> 4 minutes, it is absolutely acceptable for me.
> >>
> > err, sorry, I mean unacceptable, ;-)
>
> (figured that :)
>
> *** On a u20m2
>
> $ uname -a
> SunOS farnarkle 5.11 snv_77 i86pc i386 i86pc
> $ su root -c ''time dtrace -l > /dev/null''
> Password:
>
> real        2.2
> user        0.1
> sys         1.0
> $ psrinfo -v
> Status of virtual processor 0 as of: 01/14/2008 23:32:17
>    on-line since 01/08/2008 09:11:53.
>    The i386 processor operates at 2200 MHz,
>          and has an i387 compatible floating point processor.
> Status of virtual processor 1 as of: 01/14/2008 23:32:17
>    on-line since 01/08/2008 09:11:56.
>    The i386 processor operates at 2200 MHz,
>          and has an i387 compatible floating point processor.
>
>
>
> *** on a u20
>
> # time dtrace -l > /dev/null
>
> real    0m0.219s
> user    0m0.063s
> sys     0m0.089s
> # psrinfo -v
> Status of virtual processor 0 as of: 01/14/2008 23:33:12
>    on-line since 01/14/2008 22:43:46.
>    The i386 processor operates at 2412 MHz,
>          and has an i387 compatible floating point processor.
> Status of virtual processor 1 as of: 01/14/2008 23:33:12
>    on-line since 01/14/2008 22:43:50.
>    The i386 processor operates at 2412 MHz,
>          and has an i387 compatible floating point processor.
>
> # uname -a
> SunOS crashburn 5.11 snv_80 i86pc i386 i86pc
>
>
>
>
>
> *** on a u60
>
> # time dtrace -l > /dev/null
>
> real        4.4
> user        0.9
> sys         3.2
> # psrinfo -v
> Status of virtual processor 0 as of: 01/14/2008 23:47:33
>    on-line since 09/14/2007 21:44:39.
>    The sparcv9 processor operates at 296 MHz,
>          and has a sparcv9 floating point processor.
> Status of virtual processor 2 as of: 01/14/2008 23:47:33
>    on-line since 09/14/2007 21:44:41.
>    The sparcv9 processor operates at 296 MHz,
>          and has a sparcv9 floating point processor.
>
> # uname -a
> SunOS synergise 5.11 snv_71 sun4u sparc SUNW,Ultra-60
>
>
>
>
> You seem to be convinced that cpu is the limiting
> factor. Why?
No, I want to say, if your cpu is not faster than mine,
but dtrace response is faster than mine, then your explanation
----snip----
All those attach()es and detach()es need time, as does the
probe collation.
----snip----

is not reasonable.

So, there must be something wrong on my system,
but I can''t figure it out so far.

Thanks,
-Aubrey

Bryan Cantrill

2008-Jan-14 14:46 UTC

head link

[dtrace-discuss] dtrace wired issue

On Mon, Jan 14, 2008 at 09:16:34PM +0800, Aubrey Li
wrote:> On Jan 14, 2008 8:52 PM, Sean McGrath - Sun Microsystems Ireland
> <sean.mcgrath at sun.com> wrote:
> > Aubrey Li stated:
> > < Every first time to run dtrace command after the system boot up,
> > < It takes a very long time to get response.
> > < But the second time is OK, as follows:
> > <
> > < # time dtrace -l > /dev/null
> > <
> > < real    4m8.011s
> > < user    0m0.116s
> > < sys     0m2.420s
> >
> >   This first time is probably when the kernel is loading the dtrace
modules.
> >   Though still seems slow, 4 minutes.
> >     What kind of system (cpu speed etc) is the machine ?
> 
> # psrinfo -vp
> The physical processor has 2 virtual processors (0 1)
>   x86 (GenuineIntel 10674 family 6 model 23 step 4 clock 2400 MHz)
>         Intel(r) CPU                  @ 2.40GHz
> 
> So, I failed to understand the modules loading needs 4 minutes.
Yes, this is definitely fishy.  Is this a highly memory constrained system?
If you "modunload -i 0" enough times to get dtrace(7D) unloaded (that
is, "dtrace" doesn''t appear in modinfo), does it again take 4
minutes?
As you can imagine, it''s a little tough to investigate this problem
because
we can''t use DTrace to do it! ;)
> > < # time dtrace -l > /dev/null
> > <
> > < real    0m0.632s
> > < user    0m0.075s
> > < sys     0m0.553s
And 600+ milliseconds is still a long time.  How many probes are we talking
about here?

	- Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Sun Microsystems FishWorks.       http://blogs.sun.com/bmc

Adam Leventhal

2008-Jan-14 16:44 UTC

head link

[dtrace-discuss] dtrace wired issue

On Mon, Jan 14, 2008 at 12:52:55PM +0000, Sean McGrath - Sun Microsystems
Ireland wrote:> Aubrey Li stated:
> < Every first time to run dtrace command after the system boot up,
> < It takes a very long time to get response.
> < But the second time is OK, as follows:
> < 
> < # time dtrace -l > /dev/null
> < 
> < real    4m8.011s
> < user    0m0.116s
> < sys     0m2.420s
> 
>   This first time is probably when the kernel is loading the dtrace
modules.
>   Though still seems slow, 4 minutes.
>     What kind of system (cpu speed etc) is the machine ?
The first time DTrace is loaded it needs to uncompress the CTF (type)
information for all of the currently loaded kernel modules which can take
quite a while. Four minutes does seem a bit excessive. Are you seeing any
swapping?

Adam

-- 
Adam Leventhal, FishWorks                        http://blogs.sun.com/ahl

Adam Leventhal

2008-Jan-14 16:55 UTC

head link

[dtrace-discuss] dtrace wired issue

On Mon, Jan 14, 2008 at 06:46:49AM -0800, Bryan Cantrill
wrote:> > So, I failed to understand the modules loading needs 4 minutes.
> 
> Yes, this is definitely fishy.  Is this a highly memory constrained system?
> If you "modunload -i 0" enough times to get dtrace(7D) unloaded
(that
> is, "dtrace" doesn''t appear in modinfo), does it again
take 4 minutes?
> As you can imagine, it''s a little tough to investigate this
problem because
> we can''t use DTrace to do it! ;)
Kicking the dtrace kernel module out is a bit trickier than back in the day
due to this bug:

  6282201 USDT probes can prevent dtrace(7D) from being unloaded

In order to unload dtrace, you''ll need to make sure there are no
processes
with USDT probes. This is done as part of one of the DTrace test suite
tests:

--8<-- usr/src/cmd/dtrace/test/tst/common/predicates/tst.predcache.ksh
--8<--
28 unload()
29 {
30 	#
31 	# Get the list of services whose processes have USDT probes.  Ideally
32 	# it would be possible to unload the fasttrap provider while USDT
33 	# probes exist -- once that fix is integrated, this hack can go away
34 	# We create two lists -- one of regular SMF services and one of legacy
35 	# services -- since each must be enabled and disabled using a specific
36 	# mechanism.
37 	#
38 	pids=$(dtrace -l | \
39 	    perl -ne ''print "$1\n" if
(/^\s*\S+\s+\S*\D(\d+)\s+/);'' | \
40 	    sort | uniq | tr ''\n'' '','')
41 
42 	ctids=$(ps -p $pids -o ctid | tail +2 | sort | uniq)
43 	svcs44 	lrcs45 
46 	for ct in $ctids
47 	do
48 		line=$(svcs -o fmri,ctid | grep " $ct\$")
49 		svc=$(echo $line | cut -d'' '' -f1)
50 
51 		if [[ $(svcs -Ho STA $svc) == "LRC" ]]; then
52 			lrc=$(svcs -Ho SVC $svc | tr _ ''?'')
53 			lrcs="$lrcs $lrc"
54 		else
55 			svcs="$svcs $svc"
56 	fi
57 	done
58 
59 	for svc in $svcs
60 	do
61 		svcadm disable -ts $svc
62 	done
63 
64 	for lrc in $lrcs
65 	do
66 		#
67 		# Does it seem a little paternalistic that lsvcrun requires
68 		# this environment variable to be set? I''d say so...
69 		#
70 		SMF_RESTARTER=svc:/system/svc/restarter:default \
71 		    /lib/svc/bin/lsvcrun $lrc stop
72 	done
73 
74 	modunload -i 0
75 	modunload -i 0
76 	modunload -i 0
77 	modinfo | grep dtrace
78 	success=$?
79 
80 	for svc in $svcs
81 	do
82 		svcadm enable -ts $svc
83 	done
84 
85 	for lrc in $lrcs
86 	do
87 		SMF_RESTARTER=svc:/system/svc/restarter:default \
88 		    /lib/svc/bin/lsvcrun $lrc start
89 	done
90 
91 	if [ ! $success ]; then
92 		echo $tst: could not unload dtrace
93 		exit 1
94 	fi
95 }
--8<-- usr/src/cmd/dtrace/test/tst/common/predicates/tst.predcache.ksh
--8<--

Adam

-- 
Adam Leventhal, FishWorks                        http://blogs.sun.com/ahl

max at bruningsystems.com

2008-Jan-14 22:12 UTC

head link

[dtrace-discuss] dtrace wired issue

Hi James,
James C. McPherson wrote:> Aubrey Li wrote:
>   
>> On Jan 14, 2008 8:52 PM, Sean McGrath - Sun Microsystems Ireland
>> <sean.mcgrath at sun.com> wrote:
>>     
>>> Aubrey Li stated:
>>> < Every first time to run dtrace command after the system boot
up,
>>> < It takes a very long time to get response.
>>> < But the second time is OK, as follows:
>>> <
>>> < # time dtrace -l > /dev/null
>>> <
>>> < real    4m8.011s
>>> < user    0m0.116s
>>> < sys     0m2.420s
>>>
>>>   This first time is probably when the kernel is loading the dtrace
modules.
>>>   Though still seems slow, 4 minutes.
>>>     What kind of system (cpu speed etc) is the machine ?
>>>       
>> # psrinfo -vp
>> The physical processor has 2 virtual processors (0 1)
>>   x86 (GenuineIntel 10674 family 6 model 23 step 4 clock 2400 MHz)
>>         Intel(r) CPU                  @ 2.40GHz
>>
>> So, I failed to understand the modules loading needs 4 minutes.
>>     
>
>
> If you run "dtrace -l" with no args, *every* single loadable
> module on the system will be loaded, interrogated by dtrace
> and then unloaded if possible.
>   You mean every kernel module currently loaded will be interrogated.  Not
every kernel module on the system.  So _init/_info/_fini/attach/detach 
should not enter into this.
max
> All those attach()es and detach()es need time, as does the
> probe collation.
>
>
>
> James C. McPherson
> --
> Senior Kernel Software Engineer, Solaris
> Sun Microsystems
> http://blogs.sun.com/jmcp	http://www.jmcp.homeunix.com/blog
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss at opensolaris.org
>
>

Aubrey Li

2008-Jan-15 02:13 UTC

head link

[dtrace-discuss] dtrace wired issue

On Jan 14, 2008 10:46 PM, Bryan Cantrill <bmc at eng.sun.com>
wrote:>
> On Mon, Jan 14, 2008 at 09:16:34PM +0800, Aubrey Li wrote:
> > On Jan 14, 2008 8:52 PM, Sean McGrath - Sun Microsystems Ireland
> > <sean.mcgrath at sun.com> wrote:
> > > Aubrey Li stated:
> > > < Every first time to run dtrace command after the system boot
up,
> > > < It takes a very long time to get response.
> > > < But the second time is OK, as follows:
> > > <
> > > < # time dtrace -l > /dev/null
> > > <
> > > < real    4m8.011s
> > > < user    0m0.116s
> > > < sys     0m2.420s
> > >
> > >   This first time is probably when the kernel is loading the
dtrace modules.
> > >   Though still seems slow, 4 minutes.
> > >     What kind of system (cpu speed etc) is the machine ?
> >
> > # psrinfo -vp
> > The physical processor has 2 virtual processors (0 1)
> >   x86 (GenuineIntel 10674 family 6 model 23 step 4 clock 2400 MHz)
> >         Intel(r) CPU                  @ 2.40GHz
> >
> > So, I failed to understand the modules loading needs 4 minutes.
>
> Yes, this is definitely fishy.  Is this a highly memory constrained system?
# prtconf -vp | grep Mem
Memory size: 2023 Megabytes
> If you "modunload -i 0" enough times to get dtrace(7D) unloaded
(that
> is, "dtrace" doesn''t appear in modinfo), does it again
take 4 minutes?
> As you can imagine, it''s a little tough to investigate this
problem because
> we can''t use DTrace to do it! ;)
This doesn''t work on my side. modinfo always shows me
"dtrace".
>
> > > < # time dtrace -l > /dev/null
> > > <
> > > < real    0m0.632s
> > > < user    0m0.075s
> > > < sys     0m0.553s
>
> And 600+ milliseconds is still a long time.  How many probes are we talking
> about here?
# dtrace -l | wc -l
   52604

-Aubrey

Aubrey Li

2008-Jan-15 02:15 UTC

head link

[dtrace-discuss] dtrace wired issue

On Jan 15, 2008 12:44 AM, Adam Leventhal <ahl at eng.sun.com>
wrote:> On Mon, Jan 14, 2008 at 12:52:55PM +0000, Sean McGrath - Sun Microsystems
Ireland wrote:
> > Aubrey Li stated:
> > < Every first time to run dtrace command after the system boot up,
> > < It takes a very long time to get response.
> > < But the second time is OK, as follows:
> > <
> > < # time dtrace -l > /dev/null
> > <
> > < real    4m8.011s
> > < user    0m0.116s
> > < sys     0m2.420s
> >
> >   This first time is probably when the kernel is loading the dtrace
modules.
> >   Though still seems slow, 4 minutes.
> >     What kind of system (cpu speed etc) is the machine ?
>
> The first time DTrace is loaded it needs to uncompress the CTF (type)
> information for all of the currently loaded kernel modules which can take
> quite a while. Four minutes does seem a bit excessive. Are you seeing any
> swapping?
How to check the swapping?

-Aubrey

Aubrey Li

2008-Jan-15 02:39 UTC

head link

[dtrace-discuss] dtrace wired issue

Sorry for the delay(time difference). Now I got more details.
# truss dtrace -l
----snip----
mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANON, 4294967295, 0) = 0xFFFFFD7FFF180000
mmap(0x00010000, 24576, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFFFFFD7FFF170000
munmap(0xFFFFFD7FFF380000, 32768)               = 0
getcontext(0xFFFFFD7FFFDFE9D0)
mmap(0x00010000, 65536, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANON|MAP_ALIGN, 4294967295, 0) = 0xFFFFFD7FFF150000
getrlimit(RLIMIT_STACK, 0xFFFFFD7FFFDFED30)     = 0
getpid()                                        = 923 [922]
lwp_private(0, 0, 0xFFFFFD7FFF150200)           = 0x00000000
setustack(0xFFFFFD7FFF1502A8)
sysconfig(_CONFIG_PAGESIZE)                     = 4096
sigfillset(0xFFFFFD7FFF006880)                  = 0
brk(0x0041B580)                                 = 0
brk(0x0041F580)                                 = 0
getrlimit(RLIMIT_NOFILE, 0xFFFFFD7FFFDFEE80)    = 0
setrlimit(RLIMIT_NOFILE, 0xFFFFFD7FFFDFEE80)    = 0
openat(AT_FDCWD, "/dev/dtrace/provider",
O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
fcntl(3, F_SETFD, 0x00000001)                   = 0
fstat(3, 0xFFFFFD7FFFDFE8E0)                    = 0
getdents(3, 0xFFFFFD7FFF174000, 8192) (sleeping...)
-----------here, dtrace sleep almost 4 minutes--------
...and continue...


Any thoughts?

Thanks,
-Aubrey

max at bruningsystems.com

2008-Jan-15 06:45 UTC

head link

[dtrace-discuss] dtrace wired issue

Hi Aubrey,

Aubrey Li wrote:> Sorry for the delay(time difference). Now I got more details.
> # truss dtrace -l
> ----snip----
> mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC,
> MAP_PRIVATE|MAP_ANON, 4294967295, 0) = 0xFFFFFD7FFF180000
> mmap(0x00010000, 24576, PROT_READ|PROT_WRITE|PROT_EXEC,
> MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFFFFFD7FFF170000
> munmap(0xFFFFFD7FFF380000, 32768)               = 0
> getcontext(0xFFFFFD7FFFDFE9D0)
> mmap(0x00010000, 65536, PROT_READ|PROT_WRITE|PROT_EXEC,
> MAP_PRIVATE|MAP_ANON|MAP_ALIGN, 4294967295, 0) = 0xFFFFFD7FFF150000
> getrlimit(RLIMIT_STACK, 0xFFFFFD7FFFDFED30)     = 0
> getpid()                                        = 923 [922]
> lwp_private(0, 0, 0xFFFFFD7FFF150200)           = 0x00000000
> setustack(0xFFFFFD7FFF1502A8)
> sysconfig(_CONFIG_PAGESIZE)                     = 4096
> sigfillset(0xFFFFFD7FFF006880)                  = 0
> brk(0x0041B580)                                 = 0
> brk(0x0041F580)                                 = 0
> getrlimit(RLIMIT_NOFILE, 0xFFFFFD7FFFDFEE80)    = 0
> setrlimit(RLIMIT_NOFILE, 0xFFFFFD7FFFDFEE80)    = 0
> openat(AT_FDCWD, "/dev/dtrace/provider",
O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
> fcntl(3, F_SETFD, 0x00000001)                   = 0
> fstat(3, 0xFFFFFD7FFFDFE8E0)                    = 0
> getdents(3, 0xFFFFFD7FFF174000, 8192) (sleeping...)
> -----------here, dtrace sleep almost 4 minutes--------
> ...and continue...
>
>   How long does an ls -l /dev/dtrace/provider take?  I doubt you are 
swapping with 2G of memory.
You can run vmstat -S 2  to see if there is any swapping.  I''ll look a 
bit more...
By the way, are you running on Indiana (just curious...).

max
> Any thoughts?
>
> Thanks,
> -Aubrey
>
>

max at bruningsystems.com

2008-Jan-15 07:06 UTC

head link

[dtrace-discuss] dtrace wired issue

Hi Aubrey,
Aubrey Li wrote:> Sorry for the delay(time difference). Now I got more details.
> # truss dtrace -l
> ----snip----
> mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC,
> MAP_PRIVATE|MAP_ANON, 4294967295, 0) = 0xFFFFFD7FFF180000
> mmap(0x00010000, 24576, PROT_READ|PROT_WRITE|PROT_EXEC,
> MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFFFFFD7FFF170000
> munmap(0xFFFFFD7FFF380000, 32768)               = 0
> getcontext(0xFFFFFD7FFFDFE9D0)
> mmap(0x00010000, 65536, PROT_READ|PROT_WRITE|PROT_EXEC,
> MAP_PRIVATE|MAP_ANON|MAP_ALIGN, 4294967295, 0) = 0xFFFFFD7FFF150000
> getrlimit(RLIMIT_STACK, 0xFFFFFD7FFFDFED30)     = 0
> getpid()                                        = 923 [922]
> lwp_private(0, 0, 0xFFFFFD7FFF150200)           = 0x00000000
> setustack(0xFFFFFD7FFF1502A8)
> sysconfig(_CONFIG_PAGESIZE)                     = 4096
> sigfillset(0xFFFFFD7FFF006880)                  = 0
> brk(0x0041B580)                                 = 0
> brk(0x0041F580)                                 = 0
> getrlimit(RLIMIT_NOFILE, 0xFFFFFD7FFFDFEE80)    = 0
> setrlimit(RLIMIT_NOFILE, 0xFFFFFD7FFFDFEE80)    = 0
> openat(AT_FDCWD, "/dev/dtrace/provider",
O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
> fcntl(3, F_SETFD, 0x00000001)                   = 0
> fstat(3, 0xFFFFFD7FFFDFE8E0)                    = 0
> getdents(3, 0xFFFFFD7FFF174000, 8192) (sleeping...)
> -----------here, dtrace sleep almost 4 minutes--------
> ...and continue...
>
>
>   So, it''s probably something obvious that I''m not seeing... 
What happens
if you try
tracing all calls made via getdents for dtrace itself, and do this 
instead of dtrace -l for the first
time you run dtrace?

max
> Any thoughts?
>
> Thanks,
> -Aubrey
>
>

Aubrey Li

2008-Jan-15 08:10 UTC

head link

[dtrace-discuss] dtrace wired issue

On Jan 15, 2008 2:45 PM, max at bruningsystems.com <max at
bruningsystems.com> wrote:> Hi Aubrey,
>> -------snip----------
> > setrlimit(RLIMIT_NOFILE, 0xFFFFFD7FFFDFEE80)    = 0
> > openat(AT_FDCWD, "/dev/dtrace/provider",
O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
> > fcntl(3, F_SETFD, 0x00000001)                   = 0
> > fstat(3, 0xFFFFFD7FFFDFE8E0)                    = 0
> > getdents(3, 0xFFFFFD7FFF174000, 8192) (sleeping...)
> > -----------here, dtrace sleep almost 4 minutes--------
> > ...and continue...
> >
> >
> How long does an ls -l /dev/dtrace/provider take?
Yes, it costs almost 4 minutes.
> I doubt you are swapping with 2G of memory.
> You can run vmstat -S 2  to see if there is any swapping.  I''ll
look a
> bit more...
# vmstat -S 2
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  si  so pi po fr de sr cd -- -- --   in   sy   cs us sy id
 0 0 0 1601116 1352560 0  0 717 2 53  0 2502 71 0 0  0  718 7222 2813  4  3 93
 0 0 0 1516012 1236776 0  0  0  0  0  0  0  0  0  0  0  388  269  347  0  0 99
 0 0 0 1516012 1236808 0  0  0  0  0  0  0  0  0  0  0  394  262  354  0  0 100
 0 0 0 1516012 1236808 0  0  0  0  0  0  0  0  0  0  0  392  255  356  0  0 100
 0 0 0 1516016 1236812 0  0  0  0  0  0  0 23  0  0  0  425  380  396  1  0 99
 0 0 0 1516016 1236812 0  0  0  0  0  0  0  0  0  0  0  389  294  332  0  0 100
 0 0 0 1516016 1236812 0  0  0  0  0  0  0  0  0  0  0  382  466  444  0  0 99
 0 0 0 1516016 1236812 0  0  0  0  0  0  0  0  0  0  0  394  273  346  0  0 100
 0 0 0 1516016 1236812 0  0  0  0  0  0  0  0  0  0  0  387  250  328  0  0 99
 0 0 0 1516016 1236812 0  0  0  0  0  0  0  0  0  0  0  384  264  339  0  0 100
> By the way, are you running on Indiana (just curious...).
>
No, I''m using nevada.

[aubrey-monv@~]cat /etc/release
                   Solaris Express Community Edition snv_77 X86
           Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
                        Use is subject to license terms.
                           Assembled 06 November 2007

Thanks,
-Aubrey

max at bruningsystems.com

2008-Jan-15 08:43 UTC

head link

[dtrace-discuss] dtrace wired issue

Aubrey Li wrote:> On Jan 15, 2008 2:45 PM, max at bruningsystems.com <max at
bruningsystems.com> wrote:
>   
>> Hi Aubrey,
>>     
>>> -------snip----------
>>> setrlimit(RLIMIT_NOFILE, 0xFFFFFD7FFFDFEE80)    = 0
>>> openat(AT_FDCWD, "/dev/dtrace/provider",
O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
>>> fcntl(3, F_SETFD, 0x00000001)                   = 0
>>> fstat(3, 0xFFFFFD7FFFDFE8E0)                    = 0
>>> getdents(3, 0xFFFFFD7FFF174000, 8192) (sleeping...)
>>> -----------here, dtrace sleep almost 4 minutes--------
>>> ...and continue...
>>>
>>>
>>>       
>> How long does an ls -l /dev/dtrace/provider take?
>>     
>
> Yes, it costs almost 4 minutes.
>
>   So, what does ls -l /dev/dtrace/provider show?
max
>> I doubt you are swapping with 2G of memory.
>> You can run vmstat -S 2  to see if there is any swapping. 
I''ll look a
>> bit more...
>>     
>
> # vmstat -S 2
>  kthr      memory            page            disk          faults      cpu
>  r b w   swap  free  si  so pi po fr de sr cd -- -- --   in   sy   cs us sy
id
>  0 0 0 1601116 1352560 0  0 717 2 53  0 2502 71 0 0  0  718 7222 2813  4  3
93
>  0 0 0 1516012 1236776 0  0  0  0  0  0  0  0  0  0  0  388  269  347  0  0
99
>  0 0 0 1516012 1236808 0  0  0  0  0  0  0  0  0  0  0  394  262  354  0  0
100
>  0 0 0 1516012 1236808 0  0  0  0  0  0  0  0  0  0  0  392  255  356  0  0
100
>  0 0 0 1516016 1236812 0  0  0  0  0  0  0 23  0  0  0  425  380  396  1  0
99
>  0 0 0 1516016 1236812 0  0  0  0  0  0  0  0  0  0  0  389  294  332  0  0
100
>  0 0 0 1516016 1236812 0  0  0  0  0  0  0  0  0  0  0  382  466  444  0  0
99
>  0 0 0 1516016 1236812 0  0  0  0  0  0  0  0  0  0  0  394  273  346  0  0
100
>  0 0 0 1516016 1236812 0  0  0  0  0  0  0  0  0  0  0  387  250  328  0  0
99
>  0 0 0 1516016 1236812 0  0  0  0  0  0  0  0  0  0  0  384  264  339  0  0
100
>
>   
>> By the way, are you running on Indiana (just curious...).
>>
>>     
>
> No, I''m using nevada.
>
> [aubrey-monv@~]cat /etc/release
>                    Solaris Express Community Edition snv_77 X86
>            Copyright 2007 Sun Microsystems, Inc.  All Rights Reserved.
>                         Use is subject to license terms.
>                            Assembled 06 November 2007
>
> Thanks,
> -Aubrey
>
>

Aubrey Li

2008-Jan-15 09:07 UTC

head link

[dtrace-discuss] dtrace wired issue

On Jan 15, 2008 4:43 PM, max at bruningsystems.com <max at
bruningsystems.com> wrote:> Aubrey Li wrote:
> > On Jan 15, 2008 2:45 PM, max at bruningsystems.com <max at
bruningsystems.com> wrote:
> >
> >> Hi Aubrey,
> >>
> >>> -------snip----------
> >>> setrlimit(RLIMIT_NOFILE, 0xFFFFFD7FFFDFEE80)    = 0
> >>> openat(AT_FDCWD, "/dev/dtrace/provider",
O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
> >>> fcntl(3, F_SETFD, 0x00000001)                   = 0
> >>> fstat(3, 0xFFFFFD7FFFDFE8E0)                    = 0
> >>> getdents(3, 0xFFFFFD7FFF174000, 8192) (sleeping...)
> >>> -----------here, dtrace sleep almost 4 minutes--------
> >>> ...and continue...
> >>>
> >>>
> >>>
> >> How long does an ls -l /dev/dtrace/provider take?
> >>
> >
> > Yes, it costs almost 4 minutes.
> >
> >
> So, what does ls -l /dev/dtrace/provider show?
> max
>
>
# ls -l /dev/dtrace/provider
total 14
lrwxrwxrwx   1 root     root          43 Sep  3 01:21 fasttrap ->
../../../devices/pseudo/fasttrap at 0:fasttrap
lrwxrwxrwx   1 root     root          33 Sep  3 01:21 fbt ->
../../../devices/pseudo/fbt at 0:fbt
lrwxrwxrwx   1 root     root          43 Sep  3 01:21 lockstat ->
../../../devices/pseudo/lockstat at 0:lockstat
lrwxrwxrwx   1 root     root          49 Sep  3 01:21 lx_systrace ->
../../../devices/pseudo/lx_systrace at 0:lx_systrace
lrwxrwxrwx   1 root     root          41 Sep  3 01:21 profile ->
../../../devices/pseudo/profile at 0:profile
lrwxrwxrwx   1 root     root          33 Sep  3 01:21 sdt ->
../../../devices/pseudo/sdt at 0:sdt
lrwxrwxrwx   1 root     root          43 Sep  3 01:21 systrace ->
../../../devices/pseudo/systrace at 0:systrace

err, now I got something interesting.
It looks like this wired issue is caused by the ata driver bug
which I reported on the opensolaris-bug mailing list.

When I switch the system to the console mode(X is disabled), I got
#dtrace -l
--------snip--------
scsi: WARNING: /pci at 0,0/pci-ide at 1f,2/ide at 1 (ata1):
       timeout: abort request, target=0 lun=0
scsi: WARNING: /pci at 0,0/pci-ide at 1f,2/ide at 1 (ata1):
       timeout: abort device, target=0 lun=0
scsi: WARNING: /pci at 0,0/pci-ide at 1f,2/ide at 1 (ata1):
       timeout: reset target, target=0 lun=0
scsi: WARNING: /pci at 0,0/pci-ide at 1f,2/ide at 1 (ata1):
       timeout: reset bus, target=0 lun=0
ID   PROVIDER            MODULE                          FUNCTION NAME
    1     dtrace                                                     BEGIN
    2     dtrace                                                     END
    3     dtrace                                                     ERROR
    4   lockstat           genunix                       mutex_enter
adaptive-acquire
    5   lockstat           genunix                       mutex_enter
adaptive-block
    6   lockstat           genunix                       mutex_enter
adaptive-spin
    7   lockstat           genunix                        mutex_exit
adaptive-release
    8   lockstat           genunix                     mutex_destroy
adaptive-release
    9   lockstat           genunix                    mutex_tryenter
adaptive-acquire
 ----snip----

-Aubrey

dtrace discuss - Jan 2008 - dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue

[dtrace-discuss] dtrace wired issue