Justin Lloyd
2008-May-21 17:50 UTC
[dtrace-discuss] Slow pkginstalls due to long door_calls to nscd
Hi all, I am installing a zone onto two different V445s running S10U4 and the zones are taking hours to install (about 1000 packages), that is, the problem is identical on both systems. A bit of trussing and dtracing has shown that the pkginstalls being run by the zoneadm install are making door_call calls to nscd that are taking very long, so far observed to be 5 to 40 seconds, but always in times divisible by 5 seconds (5, 10, 15, etc.), leading to the thought that it is one or more name service timeouts. Eventually, once the door_calls finish, the actual package install is very quick. There may be a simple misconfiguration on both systems, but I can''t figure out what that might be. Additionally, I don''t know why the pkginstalls are making door_calls to begin with, and I''m not sure how to figure that out. So that''s one my main question: *** How can I find out exactly what the pkginstalls are asking nscd to resolve? *** Hopefully that will lead me (and the Sun engineer with whom I have a case open) to an understanding of what is leading to the presumed timeouts. FWIW, here''s a partial truss with timestamps of an existing pkginstall process. Once it hit the ioctl(3,...) it flew through the package installation. Note the times around the door_calls: 13, 10 and 15 seconds (the first isn''t a multiple of 5 since I started the truss during the call). Base time stamp: 1210181159.9172 [ Wed May 7 17:25:59 GMT 2008 ] door_call(4, 0xFFBFD030) (sleeping...) 1210472.1302 door_call(4, 0xFFBFD030) = 0 1210472.1306 open("/dev/udp6", O_RDONLY) = 5 1210472.1310 ioctl(5, SIOCGLIFNUM, 0xFFBFD2DC) = 0 1210472.1312 close(5) = 0 1210472.1312 getuid() = 0 [0] 1210472.1313 getuid() = 0 [0] 1210472.1314 door_info(4, 0xFFBFCF88) = 0 door_call(4, 0xFFBFD030) (sleeping...) 1210482.1503 door_call(4, 0xFFBFD030) = 0 1210482.1505 getuid() = 0 [0] 1210482.1505 getuid() = 0 [0] 1210482.1506 door_info(4, 0xFFBFCF88) = 0 door_call(4, 0xFFBFD030) (sleeping...) 1210497.1802 door_call(4, 0xFFBFD030) = 0 1210497.1804 ioctl(3, ((''m''<<8)|7), 0xFFBFD8EC) = 0 Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/dtrace-discuss/attachments/20080521/b716187e/attachment.html>
Brian Utterback
2008-May-22 13:44 UTC
[dtrace-discuss] Slow pkginstalls due to long door_calls to nscd
Well, the simple way to find out is to stop it and look. Or even to not stop it. Since it is spending enough time in the door call for you to catch it with a truss, instead of running truss, try running pstack on the same processes. I bet it will be immediately obvious what is being looked up by the stack. Justin Lloyd wrote:> Hi all, > > I am installing a zone onto two different V445s running S10U4 and the > zones are taking hours to install (about 1000 packages), that is, the > problem is identical on both systems. A bit of trussing and dtracing has > shown that the pkginstalls being run by the zoneadm install are making > door_call calls to nscd that are taking very long, so far observed to be > 5 to 40 seconds, but always in times divisible by 5 seconds (5, 10, 15, > etc.), leading to the thought that it is one or more name service > timeouts. Eventually, once the door_calls finish, the actual package > install is very quick. > > There may be a simple misconfiguration on both systems, but I can?t > figure out what that might be. Additionally, I don?t know why the > pkginstalls are making door_calls to begin with, and I?m not sure how to > figure that out. So that?s one my main question: > > *** How can I find out exactly what the pkginstalls are asking nscd > to resolve? *** > > Hopefully that will lead me (and the Sun engineer with whom I have a > case open) to an understanding of what is leading to the presumed timeouts. > > FWIW, here?s a partial truss with timestamps of an existing pkginstall > process. Once it hit the ioctl(3,?) it flew through the package > installation. Note the times around the door_calls: 13, 10 and 15 > seconds (the first isn?t a multiple of 5 since I started the truss > during the call). > > Base time stamp: ***** 1210181159*.9172 [ Wed May 7 17:25:59 GMT 2008 ] > > door_call(4, 0xFFBFD030) (sleeping...) > > *****1210472*.1302 door_call(4, 0xFFBFD030) = 0 > > 1210472.1306 open("/dev/udp6", O_RDONLY) = 5 > > 1210472.1310 ioctl(5, SIOCGLIFNUM, 0xFFBFD2DC) = 0 > > 1210472.1312 close(5) = 0 > > 1210472.1312 getuid() = 0 [0] > > 1210472.1313 getuid() = 0 [0] > > *****1210472*.1314 door_info(4, 0xFFBFCF88) = 0 > > door_call(4, 0xFFBFD030) (sleeping...) > > *****1210482*.1503 door_call(4, 0xFFBFD030) = 0 > > 1210482.1505 getuid() = 0 [0] > > 1210482.1505 getuid() = 0 [0] > > *****1210482*.1506 door_info(4, 0xFFBFCF88) = 0 > > door_call(4, 0xFFBFD030) (sleeping...) > > *****1210497*.1802 door_call(4, 0xFFBFD030) = 0 > > 1210497.1804 ioctl(3, ((''m''<<8)|7), 0xFFBFD8EC) = 0 > > Thanks, > Justin > > > ------------------------------------------------------------------------ > > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
Justin Lloyd
2008-May-22 16:18 UTC
[dtrace-discuss] Slow pkginstalls due to long door_calls to nscd
This is strange: bash-3.00# pstop `pgrep pkginstall` bash-3.00# pstack `pgrep pkginstall` 24319: pkginstall -S -C -N pkgadd -R /zones/daf98/root -a /etc/lu/zones_pkgad feec6104 door (4, ffbfcef0, 6400, fe9e0000, 0, 3) bash-3.00# Based on the number of args, this appears to be a call to doorfs() with a subcode of 3 (DOOR_CALL), leading to a tail call of door_call(4, ffbfcef0). However, it occurred to me to try mdb (with which I''m admittedly not an expert): bash-3.00# mdb -p `pgrep pkginstall` Loading modules: [ ld.so.1 libc.so.1 libnvpair.so.1 libavl.so.1 libuutil.so.1 libsysevent.so.1 libproc.so.1 ]> ::stacklibc.so.1`_door_call+0xc(feef1668, ffbfd064, ffbfd060, ffbfd05c, ffbfcf74, fe9e0000) libc.so.1`_nsc_trydoorcall_ext+0x1b8(ffbfd064, ffbfd060, ffbfd05c, 0, 0, deadbeed) libc.so.1`_nsc_search+0xc0(ff2a6848, ff223c14, 4, ffbfd138, fe9e0000, 7b1c8) libc.so.1`nss_search+0x34(ff2a6848, ff223c14, 4, ffbfd138, 1, 0) libnsl.so.1`_switch_getipnodebyname_r+0x60(ffbfd350, 78388, 7839c, 2120, 2, 5) libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c(6, ffbfd270, ffbfd268, 0, ffbfd204, ffbfd274) libnsl.so.1`getipnodebyname+0xf4(ffbfd350, 2, 5, ffbfd2ec, 1a, 2120) is_local_host+0x10(ffbfd350, 7014c, 5, 0, 63766673, 64617067) 0x293e4(763e8, 73, ffbfd354, 12, 1, 9) 0x2a0c4(ffbfe02c, ffbfe02c, 4f400, 4f400, 0, 77570) get_mntinfo+0x80(1, 0, 4f400, 0, ffbfe02c, 729e0) main+0x97c(19, ffbff76c, 73c00, ffbffa3f, 0, 0) _start+0x108(0, 0, 0, 0, 0, 0)> _door_call::dislibc.so.1`_door_call: mov 3, %o5 libc.so.1`_door_call+4: mov 0xc9, %g1 libc.so.1`_door_call+8: ta 8 libc.so.1`_door_call+0xc: blu -0xa2ca8 <libc.so.1`_cerror> libc.so.1`_door_call+0x10: nop libc.so.1`_door_call+0x14: retl libc.so.1`_door_call+0x18: nop>The call in pkginstall to is_local_host() seems to be what is sparking the lookup. The beginning of my /etc/hosts has the following ::1 localhost 127.0.0.1 localhost loghost Is something wrong with that? Justin -----Original Message----- From: Brian Utterback [mailto:brian.utterback at sun.com] Sent: Thursday, May 22, 2008 7:45 AM To: Justin Lloyd Cc: dtrace-discuss at opensolaris.org Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls to nscd Well, the simple way to find out is to stop it and look. Or even to not stop it. Since it is spending enough time in the door call for you to catch it with a truss, instead of running truss, try running pstack on the same processes. I bet it will be immediately obvious what is being looked up by the stack. Justin Lloyd wrote:> Hi all, > > I am installing a zone onto two different V445s running S10U4 and the > zones are taking hours to install (about 1000 packages), that is, the > problem is identical on both systems. A bit of trussing and dtracinghas> shown that the pkginstalls being run by the zoneadm install are making> door_call calls to nscd that are taking very long, so far observed tobe> 5 to 40 seconds, but always in times divisible by 5 seconds (5, 10,15,> etc.), leading to the thought that it is one or more name service > timeouts. Eventually, once the door_calls finish, the actual package > install is very quick. > > There may be a simple misconfiguration on both systems, but I can''t > figure out what that might be. Additionally, I don''t know why the > pkginstalls are making door_calls to begin with, and I''m not sure howto> figure that out. So that''s one my main question: > > *** How can I find out exactly what the pkginstalls are asking nscd> to resolve? *** > > Hopefully that will lead me (and the Sun engineer with whom I have a > case open) to an understanding of what is leading to the presumedtimeouts.> > FWIW, here''s a partial truss with timestamps of an existing pkginstall> process. Once it hit the ioctl(3,...) it flew through the package > installation. Note the times around the door_calls: 13, 10 and 15 > seconds (the first isn''t a multiple of 5 since I started the truss > during the call). > > Base time stamp: ***** 1210181159*.9172 [ Wed May 7 17:25:59 GMT2008 ]> > door_call(4, 0xFFBFD030) (sleeping...) > > *****1210472*.1302 door_call(4, 0xFFBFD030)= 0> > 1210472.1306 open("/dev/udp6", O_RDONLY) = 5 > > 1210472.1310 ioctl(5, SIOCGLIFNUM, 0xFFBFD2DC) = 0 > > 1210472.1312 close(5) = 0 > > 1210472.1312 getuid() = 0[0]> > 1210472.1313 getuid() = 0[0]> > *****1210472*.1314 door_info(4, 0xFFBFCF88)= 0> > door_call(4, 0xFFBFD030) (sleeping...) > > *****1210482*.1503 door_call(4, 0xFFBFD030)= 0> > 1210482.1505 getuid() = 0[0]> > 1210482.1505 getuid() = 0[0]> > *****1210482*.1506 door_info(4, 0xFFBFCF88)= 0> > door_call(4, 0xFFBFD030) (sleeping...) > > *****1210497*.1802 door_call(4, 0xFFBFD030)= 0> > 1210497.1804 ioctl(3, ((''m''<<8)|7), 0xFFBFD8EC) = 0 > > Thanks, > Justin > > >------------------------------------------------------------------------> > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
Brian Utterback
2008-May-22 16:29 UTC
[dtrace-discuss] Slow pkginstalls due to long door_calls to nscd
Okay, what does the host and ipnodes lines of your nsswitch.conf file look like? You can also try the same pstack trick on the nscd process to see where it is spending its time. Justin Lloyd wrote:> This is strange: > > bash-3.00# pstop `pgrep pkginstall` > bash-3.00# pstack `pgrep pkginstall` > 24319: pkginstall -S -C -N pkgadd -R /zones/daf98/root -a > /etc/lu/zones_pkgad > feec6104 door (4, ffbfcef0, 6400, fe9e0000, 0, 3) > bash-3.00# > > Based on the number of args, this appears to be a call to doorfs() with > a subcode of 3 (DOOR_CALL), leading to a tail call of door_call(4, > ffbfcef0). However, it occurred to me to try mdb (with which I''m > admittedly not an expert): > > bash-3.00# mdb -p `pgrep pkginstall` > Loading modules: [ ld.so.1 libc.so.1 libnvpair.so.1 libavl.so.1 > libuutil.so.1 libsysevent.so.1 libproc.so.1 ] >> ::stack > libc.so.1`_door_call+0xc(feef1668, ffbfd064, ffbfd060, ffbfd05c, > ffbfcf74, fe9e0000) > libc.so.1`_nsc_trydoorcall_ext+0x1b8(ffbfd064, ffbfd060, ffbfd05c, 0, 0, > deadbeed) > libc.so.1`_nsc_search+0xc0(ff2a6848, ff223c14, 4, ffbfd138, fe9e0000, > 7b1c8) > libc.so.1`nss_search+0x34(ff2a6848, ff223c14, 4, ffbfd138, 1, 0) > libnsl.so.1`_switch_getipnodebyname_r+0x60(ffbfd350, 78388, 7839c, 2120, > 2, 5) > libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c(6, ffbfd270, ffbfd268, > 0, ffbfd204, ffbfd274) > libnsl.so.1`getipnodebyname+0xf4(ffbfd350, 2, 5, ffbfd2ec, 1a, 2120) > is_local_host+0x10(ffbfd350, 7014c, 5, 0, 63766673, 64617067) > 0x293e4(763e8, 73, ffbfd354, 12, 1, 9) > 0x2a0c4(ffbfe02c, ffbfe02c, 4f400, 4f400, 0, 77570) > get_mntinfo+0x80(1, 0, 4f400, 0, ffbfe02c, 729e0) > main+0x97c(19, ffbff76c, 73c00, ffbffa3f, 0, 0) > _start+0x108(0, 0, 0, 0, 0, 0) >> _door_call::dis > libc.so.1`_door_call: mov 3, %o5 > libc.so.1`_door_call+4: mov 0xc9, %g1 > libc.so.1`_door_call+8: ta 8 > libc.so.1`_door_call+0xc: blu -0xa2ca8 > <libc.so.1`_cerror> > libc.so.1`_door_call+0x10: nop > libc.so.1`_door_call+0x14: retl > libc.so.1`_door_call+0x18: nop > > The call in pkginstall to is_local_host() seems to be what is sparking > the lookup. The beginning of my /etc/hosts has the following > > ::1 localhost > 127.0.0.1 localhost loghost > > Is something wrong with that? > > Justin > > > -----Original Message----- > From: Brian Utterback [mailto:brian.utterback at sun.com] > Sent: Thursday, May 22, 2008 7:45 AM > To: Justin Lloyd > Cc: dtrace-discuss at opensolaris.org > Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls to > nscd > > Well, the simple way to find out is to stop it and look. Or even to > not stop it. > > Since it is spending enough time in the door call for you to catch it > with a truss, instead of running truss, try running pstack on the same > processes. I bet it will be immediately obvious what is being looked > up by the stack. > > Justin Lloyd wrote: >> Hi all, >> >> I am installing a zone onto two different V445s running S10U4 and the >> zones are taking hours to install (about 1000 packages), that is, the >> problem is identical on both systems. A bit of trussing and dtracing > has >> shown that the pkginstalls being run by the zoneadm install are making > >> door_call calls to nscd that are taking very long, so far observed to > be >> 5 to 40 seconds, but always in times divisible by 5 seconds (5, 10, > 15, >> etc.), leading to the thought that it is one or more name service >> timeouts. Eventually, once the door_calls finish, the actual package >> install is very quick. >> >> There may be a simple misconfiguration on both systems, but I can''t >> figure out what that might be. Additionally, I don''t know why the >> pkginstalls are making door_calls to begin with, and I''m not sure how > to >> figure that out. So that''s one my main question: >> >> *** How can I find out exactly what the pkginstalls are asking nscd > >> to resolve? *** >> >> Hopefully that will lead me (and the Sun engineer with whom I have a >> case open) to an understanding of what is leading to the presumed > timeouts. >> FWIW, here''s a partial truss with timestamps of an existing pkginstall > >> process. Once it hit the ioctl(3,...) it flew through the package >> installation. Note the times around the door_calls: 13, 10 and 15 >> seconds (the first isn''t a multiple of 5 since I started the truss >> during the call). >> >> Base time stamp: ***** 1210181159*.9172 [ Wed May 7 17:25:59 GMT > 2008 ] >> door_call(4, 0xFFBFD030) (sleeping...) >> >> *****1210472*.1302 door_call(4, 0xFFBFD030) > = 0 >> 1210472.1306 open("/dev/udp6", O_RDONLY) = 5 >> >> 1210472.1310 ioctl(5, SIOCGLIFNUM, 0xFFBFD2DC) = 0 >> >> 1210472.1312 close(5) = 0 >> >> 1210472.1312 getuid() = 0 > [0] >> 1210472.1313 getuid() = 0 > [0] >> *****1210472*.1314 door_info(4, 0xFFBFCF88) > = 0 >> door_call(4, 0xFFBFD030) (sleeping...) >> >> *****1210482*.1503 door_call(4, 0xFFBFD030) > = 0 >> 1210482.1505 getuid() = 0 > [0] >> 1210482.1505 getuid() = 0 > [0] >> *****1210482*.1506 door_info(4, 0xFFBFCF88) > = 0 >> door_call(4, 0xFFBFD030) (sleeping...) >> >> *****1210497*.1802 door_call(4, 0xFFBFD030) > = 0 >> 1210497.1804 ioctl(3, ((''m''<<8)|7), 0xFFBFD8EC) = 0 >> >> Thanks, >> Justin >> >> >> > ------------------------------------------------------------------------ >> _______________________________________________ >> dtrace-discuss mailing list >> dtrace-discuss at opensolaris.org >-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
Justin Lloyd
2008-May-22 17:13 UTC
[dtrace-discuss] Slow pkginstalls due to long door_calls to nscd
Both lines are "files dns". I''ll look into the nscd process, but first I''m doing some dtrace analysis of the long ustacks to see which ones are taking a long time, just to make sure the mdb stack I provided wasn''t an outlier. Justin -----Original Message----- From: Brian Utterback [mailto:brian.utterback at sun.com] Sent: Thursday, May 22, 2008 10:29 AM To: Justin Lloyd Cc: dtrace-discuss at opensolaris.org Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls to nscd Okay, what does the host and ipnodes lines of your nsswitch.conf file look like? You can also try the same pstack trick on the nscd process to see where it is spending its time. Justin Lloyd wrote:> This is strange: > > bash-3.00# pstop `pgrep pkginstall` > bash-3.00# pstack `pgrep pkginstall` > 24319: pkginstall -S -C -N pkgadd -R /zones/daf98/root -a > /etc/lu/zones_pkgad > feec6104 door (4, ffbfcef0, 6400, fe9e0000, 0, 3) > bash-3.00# > > Based on the number of args, this appears to be a call to doorfs()with> a subcode of 3 (DOOR_CALL), leading to a tail call of door_call(4, > ffbfcef0). However, it occurred to me to try mdb (with which I''m > admittedly not an expert): > > bash-3.00# mdb -p `pgrep pkginstall` > Loading modules: [ ld.so.1 libc.so.1 libnvpair.so.1 libavl.so.1 > libuutil.so.1 libsysevent.so.1 libproc.so.1 ] >> ::stack > libc.so.1`_door_call+0xc(feef1668, ffbfd064, ffbfd060, ffbfd05c, > ffbfcf74, fe9e0000) > libc.so.1`_nsc_trydoorcall_ext+0x1b8(ffbfd064, ffbfd060, ffbfd05c, 0,0,> deadbeed) > libc.so.1`_nsc_search+0xc0(ff2a6848, ff223c14, 4, ffbfd138, fe9e0000, > 7b1c8) > libc.so.1`nss_search+0x34(ff2a6848, ff223c14, 4, ffbfd138, 1, 0) > libnsl.so.1`_switch_getipnodebyname_r+0x60(ffbfd350, 78388, 7839c,2120,> 2, 5) > libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c(6, ffbfd270,ffbfd268,> 0, ffbfd204, ffbfd274) > libnsl.so.1`getipnodebyname+0xf4(ffbfd350, 2, 5, ffbfd2ec, 1a, 2120) > is_local_host+0x10(ffbfd350, 7014c, 5, 0, 63766673, 64617067) > 0x293e4(763e8, 73, ffbfd354, 12, 1, 9) > 0x2a0c4(ffbfe02c, ffbfe02c, 4f400, 4f400, 0, 77570) > get_mntinfo+0x80(1, 0, 4f400, 0, ffbfe02c, 729e0) > main+0x97c(19, ffbff76c, 73c00, ffbffa3f, 0, 0) > _start+0x108(0, 0, 0, 0, 0, 0) >> _door_call::dis > libc.so.1`_door_call: mov 3, %o5 > libc.so.1`_door_call+4: mov 0xc9, %g1 > libc.so.1`_door_call+8: ta 8 > libc.so.1`_door_call+0xc: blu -0xa2ca8 > <libc.so.1`_cerror> > libc.so.1`_door_call+0x10: nop > libc.so.1`_door_call+0x14: retl > libc.so.1`_door_call+0x18: nop > > The call in pkginstall to is_local_host() seems to be what is sparking > the lookup. The beginning of my /etc/hosts has the following > > ::1 localhost > 127.0.0.1 localhost loghost > > Is something wrong with that? > > Justin > > > -----Original Message----- > From: Brian Utterback [mailto:brian.utterback at sun.com] > Sent: Thursday, May 22, 2008 7:45 AM > To: Justin Lloyd > Cc: dtrace-discuss at opensolaris.org > Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_callsto> nscd > > Well, the simple way to find out is to stop it and look. Or even to > not stop it. > > Since it is spending enough time in the door call for you to catch it > with a truss, instead of running truss, try running pstack on the same> processes. I bet it will be immediately obvious what is being looked> up by the stack. > > Justin Lloyd wrote: >> Hi all, >> >> I am installing a zone onto two different V445s running S10U4 and the>> zones are taking hours to install (about 1000 packages), that is, the>> problem is identical on both systems. A bit of trussing and dtracing > has >> shown that the pkginstalls being run by the zoneadm install aremaking> >> door_call calls to nscd that are taking very long, so far observed to > be >> 5 to 40 seconds, but always in times divisible by 5 seconds (5, 10, > 15, >> etc.), leading to the thought that it is one or more name service >> timeouts. Eventually, once the door_calls finish, the actual package >> install is very quick. >> >> There may be a simple misconfiguration on both systems, but I can''t >> figure out what that might be. Additionally, I don''t know why the >> pkginstalls are making door_calls to begin with, and I''m not sure how > to >> figure that out. So that''s one my main question: >> >> *** How can I find out exactly what the pkginstalls are askingnscd> >> to resolve? *** >> >> Hopefully that will lead me (and the Sun engineer with whom I have a >> case open) to an understanding of what is leading to the presumed > timeouts. >> FWIW, here''s a partial truss with timestamps of an existingpkginstall> >> process. Once it hit the ioctl(3,...) it flew through the package >> installation. Note the times around the door_calls: 13, 10 and 15 >> seconds (the first isn''t a multiple of 5 since I started the truss >> during the call). >> >> Base time stamp: ***** 1210181159*.9172 [ Wed May 7 17:25:59 GMT > 2008 ] >> door_call(4, 0xFFBFD030) (sleeping...) >> >> *****1210472*.1302 door_call(4, 0xFFBFD030) > = 0 >> 1210472.1306 open("/dev/udp6", O_RDONLY) = 5 >> >> 1210472.1310 ioctl(5, SIOCGLIFNUM, 0xFFBFD2DC) = 0 >> >> 1210472.1312 close(5) = 0 >> >> 1210472.1312 getuid() = 0 > [0] >> 1210472.1313 getuid() = 0 > [0] >> *****1210472*.1314 door_info(4, 0xFFBFCF88) > = 0 >> door_call(4, 0xFFBFD030) (sleeping...) >> >> *****1210482*.1503 door_call(4, 0xFFBFD030) > = 0 >> 1210482.1505 getuid() = 0 > [0] >> 1210482.1505 getuid() = 0 > [0] >> *****1210482*.1506 door_info(4, 0xFFBFCF88) > = 0 >> door_call(4, 0xFFBFD030) (sleeping...) >> >> *****1210497*.1802 door_call(4, 0xFFBFD030) > = 0 >> 1210497.1804 ioctl(3, ((''m''<<8)|7), 0xFFBFD8EC) = 0 >> >> Thanks, >> Justin >> >> >> >------------------------------------------------------------------------>> _______________________________________________ >> dtrace-discuss mailing list >> dtrace-discuss at opensolaris.org >-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
Raja Gopal Andra
2008-May-22 17:29 UTC
[dtrace-discuss] Slow pkginstalls due to long door_calls to nscd
Are there too many fds opened by nscd? A simple pfiles on nscd will reveal this info. I am currently working on fixing a CR (6685614) where there is a fd leak in nscd when files is used as backend and this hurts nscd performance. Thank you, Raja Justin Lloyd wrote:> Both lines are "files dns". > > I''ll look into the nscd process, but first I''m doing some dtrace > analysis of the long ustacks to see which ones are taking a long time, > just to make sure the mdb stack I provided wasn''t an outlier. > > Justin > > -----Original Message----- > From: Brian Utterback [mailto:brian.utterback at sun.com] > Sent: Thursday, May 22, 2008 10:29 AM > To: Justin Lloyd > Cc: dtrace-discuss at opensolaris.org > Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls to > nscd > > Okay, what does the host and ipnodes lines of your nsswitch.conf file > look like? You can also try the same pstack trick on the nscd process > to see where it is spending its time. > > Justin Lloyd wrote: >> This is strange: >> >> bash-3.00# pstop `pgrep pkginstall` >> bash-3.00# pstack `pgrep pkginstall` >> 24319: pkginstall -S -C -N pkgadd -R /zones/daf98/root -a >> /etc/lu/zones_pkgad >> feec6104 door (4, ffbfcef0, 6400, fe9e0000, 0, 3) >> bash-3.00# >> >> Based on the number of args, this appears to be a call to doorfs() > with >> a subcode of 3 (DOOR_CALL), leading to a tail call of door_call(4, >> ffbfcef0). However, it occurred to me to try mdb (with which I''m >> admittedly not an expert): >> >> bash-3.00# mdb -p `pgrep pkginstall` >> Loading modules: [ ld.so.1 libc.so.1 libnvpair.so.1 libavl.so.1 >> libuutil.so.1 libsysevent.so.1 libproc.so.1 ] >>> ::stack >> libc.so.1`_door_call+0xc(feef1668, ffbfd064, ffbfd060, ffbfd05c, >> ffbfcf74, fe9e0000) >> libc.so.1`_nsc_trydoorcall_ext+0x1b8(ffbfd064, ffbfd060, ffbfd05c, 0, > 0, >> deadbeed) >> libc.so.1`_nsc_search+0xc0(ff2a6848, ff223c14, 4, ffbfd138, fe9e0000, >> 7b1c8) >> libc.so.1`nss_search+0x34(ff2a6848, ff223c14, 4, ffbfd138, 1, 0) >> libnsl.so.1`_switch_getipnodebyname_r+0x60(ffbfd350, 78388, 7839c, > 2120, >> 2, 5) >> libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c(6, ffbfd270, > ffbfd268, >> 0, ffbfd204, ffbfd274) >> libnsl.so.1`getipnodebyname+0xf4(ffbfd350, 2, 5, ffbfd2ec, 1a, 2120) >> is_local_host+0x10(ffbfd350, 7014c, 5, 0, 63766673, 64617067) >> 0x293e4(763e8, 73, ffbfd354, 12, 1, 9) >> 0x2a0c4(ffbfe02c, ffbfe02c, 4f400, 4f400, 0, 77570) >> get_mntinfo+0x80(1, 0, 4f400, 0, ffbfe02c, 729e0) >> main+0x97c(19, ffbff76c, 73c00, ffbffa3f, 0, 0) >> _start+0x108(0, 0, 0, 0, 0, 0) >>> _door_call::dis >> libc.so.1`_door_call: mov 3, %o5 >> libc.so.1`_door_call+4: mov 0xc9, %g1 >> libc.so.1`_door_call+8: ta 8 >> libc.so.1`_door_call+0xc: blu -0xa2ca8 >> <libc.so.1`_cerror> >> libc.so.1`_door_call+0x10: nop >> libc.so.1`_door_call+0x14: retl >> libc.so.1`_door_call+0x18: nop >> >> The call in pkginstall to is_local_host() seems to be what is sparking >> the lookup. The beginning of my /etc/hosts has the following >> >> ::1 localhost >> 127.0.0.1 localhost loghost >> >> Is something wrong with that? >> >> Justin >> >> >> -----Original Message----- >> From: Brian Utterback [mailto:brian.utterback at sun.com] >> Sent: Thursday, May 22, 2008 7:45 AM >> To: Justin Lloyd >> Cc: dtrace-discuss at opensolaris.org >> Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls > to >> nscd >> >> Well, the simple way to find out is to stop it and look. Or even to >> not stop it. >> >> Since it is spending enough time in the door call for you to catch it >> with a truss, instead of running truss, try running pstack on the same > >> processes. I bet it will be immediately obvious what is being looked > >> up by the stack. >> >> Justin Lloyd wrote: >>> Hi all, >>> >>> I am installing a zone onto two different V445s running S10U4 and the > >>> zones are taking hours to install (about 1000 packages), that is, the > >>> problem is identical on both systems. A bit of trussing and dtracing >> has >>> shown that the pkginstalls being run by the zoneadm install are > making >>> door_call calls to nscd that are taking very long, so far observed to >> be >>> 5 to 40 seconds, but always in times divisible by 5 seconds (5, 10, >> 15, >>> etc.), leading to the thought that it is one or more name service >>> timeouts. Eventually, once the door_calls finish, the actual package >>> install is very quick. >>> >>> There may be a simple misconfiguration on both systems, but I can''t >>> figure out what that might be. Additionally, I don''t know why the >>> pkginstalls are making door_calls to begin with, and I''m not sure how >> to >>> figure that out. So that''s one my main question: >>> >>> *** How can I find out exactly what the pkginstalls are asking > nscd >>> to resolve? *** >>> >>> Hopefully that will lead me (and the Sun engineer with whom I have a >>> case open) to an understanding of what is leading to the presumed >> timeouts. >>> FWIW, here''s a partial truss with timestamps of an existing > pkginstall >>> process. Once it hit the ioctl(3,...) it flew through the package >>> installation. Note the times around the door_calls: 13, 10 and 15 >>> seconds (the first isn''t a multiple of 5 since I started the truss >>> during the call). >>> >>> Base time stamp: ***** 1210181159*.9172 [ Wed May 7 17:25:59 GMT >> 2008 ] >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210472*.1302 door_call(4, 0xFFBFD030) >> = 0 >>> 1210472.1306 open("/dev/udp6", O_RDONLY) = 5 >>> >>> 1210472.1310 ioctl(5, SIOCGLIFNUM, 0xFFBFD2DC) = 0 >>> >>> 1210472.1312 close(5) = 0 >>> >>> 1210472.1312 getuid() = 0 >> [0] >>> 1210472.1313 getuid() = 0 >> [0] >>> *****1210472*.1314 door_info(4, 0xFFBFCF88) >> = 0 >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210482*.1503 door_call(4, 0xFFBFD030) >> = 0 >>> 1210482.1505 getuid() = 0 >> [0] >>> 1210482.1505 getuid() = 0 >> [0] >>> *****1210482*.1506 door_info(4, 0xFFBFCF88) >> = 0 >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210497*.1802 door_call(4, 0xFFBFD030) >> = 0 >>> 1210497.1804 ioctl(3, ((''m''<<8)|7), 0xFFBFD8EC) = 0 >>> >>> Thanks, >>> Justin >>> >>> >>> > ------------------------------------------------------------------------ >>> _______________________________________________ >>> dtrace-discuss mailing list >>> dtrace-discuss at opensolaris.org >-- Raja Gopal Andra Sun Microsystems Phone: +1 650 786 4273 Solaris RPE 17 Network Circle, Fax : +1 650 786 7994 Approachability Group Menlo Park, CA 94025 mailto: andra at sun.com -- The secret to success is to work less as individuals and more as a team.
Justin Lloyd
2008-May-22 17:35 UTC
[dtrace-discuss] Slow pkginstalls due to long door_calls to nscd
No. Here''s the pfiles output on the global zone''s nscd process. Interesting that it has a door open to itself. bash-3.00# pfiles 10815 10815: /usr/sbin/nscd Current rlimit: 256 file descriptors 0: S_IFCHR mode:0666 dev:311,0 ino:6815752 uid:0 gid:3 rdev:13,2 O_RDWR /devices/pseudo/mm at 0:null 1: S_IFCHR mode:0666 dev:311,0 ino:6815752 uid:0 gid:3 rdev:13,2 O_RDWR /devices/pseudo/mm at 0:null 2: S_IFCHR mode:0666 dev:311,0 ino:6815752 uid:0 gid:3 rdev:13,2 O_RDWR /devices/pseudo/mm at 0:null 3: S_IFDOOR mode:0777 dev:318,0 ino:0 uid:0 gid:0 size:0 O_RDWR FD_CLOEXEC door to nscd[10815] 4: S_IFSOCK mode:0666 dev:317,0 ino:64008 uid:0 gid:0 size:0 O_RDWR SOCK_RAW SO_SNDBUF(8192),SO_RCVBUF(8192) sockname: AF_INET 10.11.30.105 port: 53 peername: AF_ROUTE 5: S_IFSOCK mode:0666 dev:317,0 ino:40766 uid:0 gid:0 size:0 O_RDWR SOCK_DGRAM SO_DGRAM_ERRIND,SO_SNDBUF(57344),SO_RCVBUF(57344),IP_NEXTHOP(0.0.224.0) sockname: AF_INET 10.1.100.3 port: 35052 peername: AF_INET 10.11.30.105 port: 53 bash-3.00# -----Original Message----- From: Raja Gopal Andra [mailto:rajagopal.andra at sun.com] Sent: Thursday, May 22, 2008 11:30 AM To: Justin Lloyd Cc: Brian Utterback; dtrace-discuss at opensolaris.org Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls to nscd Are there too many fds opened by nscd? A simple pfiles on nscd will reveal this info. I am currently working on fixing a CR (6685614) where there is a fd leak in nscd when files is used as backend and this hurts nscd performance. Thank you, Raja Justin Lloyd wrote:> Both lines are "files dns". > > I''ll look into the nscd process, but first I''m doing some dtrace > analysis of the long ustacks to see which ones are taking a long time, > just to make sure the mdb stack I provided wasn''t an outlier. > > Justin > > -----Original Message----- > From: Brian Utterback [mailto:brian.utterback at sun.com] > Sent: Thursday, May 22, 2008 10:29 AM > To: Justin Lloyd > Cc: dtrace-discuss at opensolaris.org > Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_callsto> nscd > > Okay, what does the host and ipnodes lines of your nsswitch.conf file > look like? You can also try the same pstack trick on the nscd process > to see where it is spending its time. > > Justin Lloyd wrote: >> This is strange: >> >> bash-3.00# pstop `pgrep pkginstall` >> bash-3.00# pstack `pgrep pkginstall` >> 24319: pkginstall -S -C -N pkgadd -R /zones/daf98/root -a >> /etc/lu/zones_pkgad >> feec6104 door (4, ffbfcef0, 6400, fe9e0000, 0, 3) >> bash-3.00# >> >> Based on the number of args, this appears to be a call to doorfs() > with >> a subcode of 3 (DOOR_CALL), leading to a tail call of door_call(4, >> ffbfcef0). However, it occurred to me to try mdb (with which I''m >> admittedly not an expert): >> >> bash-3.00# mdb -p `pgrep pkginstall` >> Loading modules: [ ld.so.1 libc.so.1 libnvpair.so.1 libavl.so.1 >> libuutil.so.1 libsysevent.so.1 libproc.so.1 ] >>> ::stack >> libc.so.1`_door_call+0xc(feef1668, ffbfd064, ffbfd060, ffbfd05c, >> ffbfcf74, fe9e0000) >> libc.so.1`_nsc_trydoorcall_ext+0x1b8(ffbfd064, ffbfd060, ffbfd05c, 0, > 0, >> deadbeed) >> libc.so.1`_nsc_search+0xc0(ff2a6848, ff223c14, 4, ffbfd138, fe9e0000, >> 7b1c8) >> libc.so.1`nss_search+0x34(ff2a6848, ff223c14, 4, ffbfd138, 1, 0) >> libnsl.so.1`_switch_getipnodebyname_r+0x60(ffbfd350, 78388, 7839c, > 2120, >> 2, 5) >> libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c(6, ffbfd270, > ffbfd268, >> 0, ffbfd204, ffbfd274) >> libnsl.so.1`getipnodebyname+0xf4(ffbfd350, 2, 5, ffbfd2ec, 1a, 2120) >> is_local_host+0x10(ffbfd350, 7014c, 5, 0, 63766673, 64617067) >> 0x293e4(763e8, 73, ffbfd354, 12, 1, 9) >> 0x2a0c4(ffbfe02c, ffbfe02c, 4f400, 4f400, 0, 77570) >> get_mntinfo+0x80(1, 0, 4f400, 0, ffbfe02c, 729e0) >> main+0x97c(19, ffbff76c, 73c00, ffbffa3f, 0, 0) >> _start+0x108(0, 0, 0, 0, 0, 0) >>> _door_call::dis >> libc.so.1`_door_call: mov 3, %o5 >> libc.so.1`_door_call+4: mov 0xc9, %g1 >> libc.so.1`_door_call+8: ta 8 >> libc.so.1`_door_call+0xc: blu -0xa2ca8 >> <libc.so.1`_cerror> >> libc.so.1`_door_call+0x10: nop >> libc.so.1`_door_call+0x14: retl >> libc.so.1`_door_call+0x18: nop >> >> The call in pkginstall to is_local_host() seems to be what issparking>> the lookup. The beginning of my /etc/hosts has the following >> >> ::1 localhost >> 127.0.0.1 localhost loghost >> >> Is something wrong with that? >> >> Justin >> >> >> -----Original Message----- >> From: Brian Utterback [mailto:brian.utterback at sun.com] >> Sent: Thursday, May 22, 2008 7:45 AM >> To: Justin Lloyd >> Cc: dtrace-discuss at opensolaris.org >> Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls > to >> nscd >> >> Well, the simple way to find out is to stop it and look. Or even to >> not stop it. >> >> Since it is spending enough time in the door call for you to catch it>> with a truss, instead of running truss, try running pstack on thesame> >> processes. I bet it will be immediately obvious what is beinglooked> >> up by the stack. >> >> Justin Lloyd wrote: >>> Hi all, >>> >>> I am installing a zone onto two different V445s running S10U4 andthe> >>> zones are taking hours to install (about 1000 packages), that is,the> >>> problem is identical on both systems. A bit of trussing and dtracing >> has >>> shown that the pkginstalls being run by the zoneadm install are > making >>> door_call calls to nscd that are taking very long, so far observedto>> be >>> 5 to 40 seconds, but always in times divisible by 5 seconds (5, 10, >> 15, >>> etc.), leading to the thought that it is one or more name service >>> timeouts. Eventually, once the door_calls finish, the actual package>>> install is very quick. >>> >>> There may be a simple misconfiguration on both systems, but I can''t >>> figure out what that might be. Additionally, I don''t know why the >>> pkginstalls are making door_calls to begin with, and I''m not surehow>> to >>> figure that out. So that''s one my main question: >>> >>> *** How can I find out exactly what the pkginstalls are asking > nscd >>> to resolve? *** >>> >>> Hopefully that will lead me (and the Sun engineer with whom I have a>>> case open) to an understanding of what is leading to the presumed >> timeouts. >>> FWIW, here''s a partial truss with timestamps of an existing > pkginstall >>> process. Once it hit the ioctl(3,...) it flew through the package >>> installation. Note the times around the door_calls: 13, 10 and 15 >>> seconds (the first isn''t a multiple of 5 since I started the truss >>> during the call). >>> >>> Base time stamp: ***** 1210181159*.9172 [ Wed May 7 17:25:59 GMT >> 2008 ] >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210472*.1302 door_call(4, 0xFFBFD030) >> = 0 >>> 1210472.1306 open("/dev/udp6", O_RDONLY) = 5 >>> >>> 1210472.1310 ioctl(5, SIOCGLIFNUM, 0xFFBFD2DC) = 0 >>> >>> 1210472.1312 close(5) = 0 >>> >>> 1210472.1312 getuid() = 0 >> [0] >>> 1210472.1313 getuid() = 0 >> [0] >>> *****1210472*.1314 door_info(4, 0xFFBFCF88) >> = 0 >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210482*.1503 door_call(4, 0xFFBFD030) >> = 0 >>> 1210482.1505 getuid() = 0 >> [0] >>> 1210482.1505 getuid() = 0 >> [0] >>> *****1210482*.1506 door_info(4, 0xFFBFCF88) >> = 0 >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210497*.1802 door_call(4, 0xFFBFD030) >> = 0 >>> 1210497.1804 ioctl(3, ((''m''<<8)|7), 0xFFBFD8EC) = 0 >>> >>> Thanks, >>> Justin >>> >>> >>> >------------------------------------------------------------------------>>> _______________________________________________ >>> dtrace-discuss mailing list >>> dtrace-discuss at opensolaris.org >-- Raja Gopal Andra Sun Microsystems Phone: +1 650 786 4273 Solaris RPE 17 Network Circle, Fax : +1 650 786 7994 Approachability Group Menlo Park, CA 94025 mailto: andra at sun.com -- The secret to success is to work less as individuals and more as a team.
Justin Lloyd
2008-May-22 17:36 UTC
[dtrace-discuss] Slow pkginstalls due to long door_calls to nscd
Here''s some output from a little dtrace script that prints the stacks of door_call() calls that take longer than 4 seconds: % sudo ./nscd_door.d pid 17917 call to doorfs(4) took 15 seconds pid 18029 call to doorfs(4) took 5 seconds pid 18029 call to doorfs(4) took 15 seconds pid 18029 call to doorfs(4) took 10 seconds pid 18029 call to doorfs(4) took 15 seconds pid 18146 call to doorfs(4) took 5 seconds pid 18146 call to doorfs(4) took 15 seconds ^C 0xfeec6104 0xfee50070 0xfee5ef18 0xfee5d9fc 0xff227da8 0xff226a90 0xff2241e8 0x338cc 0x293e4 0x2a0c4 0x2a4b4 0x20f08 0x184a4 1 0xfeec6104 0xfee50070 0xfee5ef18 0xfee5d9fc 0xff227da8 0xff226a90 0xff2241e8 0x33894 0x293e4 0x2a0c4 0x2a4b4 0x20f08 0x184a4 1 0xfeec6104 0xfee50070 0xfee5ef18 0xfee5d9fc 0xff227da8 0xff226a90 0xff2241e8 0x338cc 0x293e4 0x2a0c4 0x2a4b4 0x20f08 0x184a4 1 0xfeec6104 0xfee50070 0xfee5ef18 0xfee5d9fc 0xff227e78 0xff226ae8 0xff224118 0x33894 0x293e4 0x2a0c4 0x2a4b4 0x20f08 0x184a4 1 0xfeec6104 0xfee50070 0xfee5ef18 0xfee5d9fc 0xff227e78 0xff226ae8 0xff224118 0x338cc 0x293e4 0x2a0c4 0x2a4b4 0x20f08 0x184a4 1 libc.so.1`_door_call+0x8 libc.so.1`_nsc_trydoorcall_ext+0x1b8 libc.so.1`_nsc_search+0xc0 libc.so.1`nss_search+0x34 libnsl.so.1`_switch_gethostbyname_r+0x58 libnsl.so.1`_get_hostserv_inetnetdir_byname+0x844 libnsl.so.1`getipnodebyname+0x1c4 pkginstall`is_local_host+0x10 pkginstall`0x293e4 pkginstall`0x2a0c4 pkginstall`get_mntinfo+0x80 pkginstall`main+0x97c pkginstall`_start+0x108 1 libc.so.1`_door_call+0x8 libc.so.1`_nsc_trydoorcall_ext+0x1b8 libc.so.1`_nsc_search+0xc0 libc.so.1`nss_search+0x34 libnsl.so.1`_switch_getipnodebyname_r+0x60 libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c libnsl.so.1`getipnodebyname+0xf4 pkginstall`is_local_host+0x10 pkginstall`0x293e4 pkginstall`0x2a0c4 pkginstall`get_mntinfo+0x80 pkginstall`main+0x97c pkginstall`_start+0x108 1 [jlloyd at dapgold-b:~] -----Original Message----- From: dtrace-discuss-bounces at opensolaris.org [mailto:dtrace-discuss-bounces at opensolaris.org] On Behalf Of Justin Lloyd Sent: Thursday, May 22, 2008 11:14 AM To: Brian Utterback Cc: dtrace-discuss at opensolaris.org Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls to nscd Both lines are "files dns". I''ll look into the nscd process, but first I''m doing some dtrace analysis of the long ustacks to see which ones are taking a long time, just to make sure the mdb stack I provided wasn''t an outlier. Justin -----Original Message----- From: Brian Utterback [mailto:brian.utterback at sun.com] Sent: Thursday, May 22, 2008 10:29 AM To: Justin Lloyd Cc: dtrace-discuss at opensolaris.org Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls to nscd Okay, what does the host and ipnodes lines of your nsswitch.conf file look like? You can also try the same pstack trick on the nscd process to see where it is spending its time. Justin Lloyd wrote:> This is strange: > > bash-3.00# pstop `pgrep pkginstall` > bash-3.00# pstack `pgrep pkginstall` > 24319: pkginstall -S -C -N pkgadd -R /zones/daf98/root -a > /etc/lu/zones_pkgad > feec6104 door (4, ffbfcef0, 6400, fe9e0000, 0, 3) > bash-3.00# > > Based on the number of args, this appears to be a call to doorfs()with> a subcode of 3 (DOOR_CALL), leading to a tail call of door_call(4, > ffbfcef0). However, it occurred to me to try mdb (with which I''m > admittedly not an expert): > > bash-3.00# mdb -p `pgrep pkginstall` > Loading modules: [ ld.so.1 libc.so.1 libnvpair.so.1 libavl.so.1 > libuutil.so.1 libsysevent.so.1 libproc.so.1 ] >> ::stack > libc.so.1`_door_call+0xc(feef1668, ffbfd064, ffbfd060, ffbfd05c, > ffbfcf74, fe9e0000) > libc.so.1`_nsc_trydoorcall_ext+0x1b8(ffbfd064, ffbfd060, ffbfd05c, 0,0,> deadbeed) > libc.so.1`_nsc_search+0xc0(ff2a6848, ff223c14, 4, ffbfd138, fe9e0000, > 7b1c8) > libc.so.1`nss_search+0x34(ff2a6848, ff223c14, 4, ffbfd138, 1, 0) > libnsl.so.1`_switch_getipnodebyname_r+0x60(ffbfd350, 78388, 7839c,2120,> 2, 5) > libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c(6, ffbfd270,ffbfd268,> 0, ffbfd204, ffbfd274) > libnsl.so.1`getipnodebyname+0xf4(ffbfd350, 2, 5, ffbfd2ec, 1a, 2120) > is_local_host+0x10(ffbfd350, 7014c, 5, 0, 63766673, 64617067) > 0x293e4(763e8, 73, ffbfd354, 12, 1, 9) > 0x2a0c4(ffbfe02c, ffbfe02c, 4f400, 4f400, 0, 77570) > get_mntinfo+0x80(1, 0, 4f400, 0, ffbfe02c, 729e0) > main+0x97c(19, ffbff76c, 73c00, ffbffa3f, 0, 0) > _start+0x108(0, 0, 0, 0, 0, 0) >> _door_call::dis > libc.so.1`_door_call: mov 3, %o5 > libc.so.1`_door_call+4: mov 0xc9, %g1 > libc.so.1`_door_call+8: ta 8 > libc.so.1`_door_call+0xc: blu -0xa2ca8 > <libc.so.1`_cerror> > libc.so.1`_door_call+0x10: nop > libc.so.1`_door_call+0x14: retl > libc.so.1`_door_call+0x18: nop > > The call in pkginstall to is_local_host() seems to be what is sparking > the lookup. The beginning of my /etc/hosts has the following > > ::1 localhost > 127.0.0.1 localhost loghost > > Is something wrong with that? > > Justin > > > -----Original Message----- > From: Brian Utterback [mailto:brian.utterback at sun.com] > Sent: Thursday, May 22, 2008 7:45 AM > To: Justin Lloyd > Cc: dtrace-discuss at opensolaris.org > Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_callsto> nscd > > Well, the simple way to find out is to stop it and look. Or even to > not stop it. > > Since it is spending enough time in the door call for you to catch it > with a truss, instead of running truss, try running pstack on the same> processes. I bet it will be immediately obvious what is being looked> up by the stack. > > Justin Lloyd wrote: >> Hi all, >> >> I am installing a zone onto two different V445s running S10U4 and the>> zones are taking hours to install (about 1000 packages), that is, the>> problem is identical on both systems. A bit of trussing and dtracing > has >> shown that the pkginstalls being run by the zoneadm install aremaking> >> door_call calls to nscd that are taking very long, so far observed to > be >> 5 to 40 seconds, but always in times divisible by 5 seconds (5, 10, > 15, >> etc.), leading to the thought that it is one or more name service >> timeouts. Eventually, once the door_calls finish, the actual package >> install is very quick. >> >> There may be a simple misconfiguration on both systems, but I can''t >> figure out what that might be. Additionally, I don''t know why the >> pkginstalls are making door_calls to begin with, and I''m not sure how > to >> figure that out. So that''s one my main question: >> >> *** How can I find out exactly what the pkginstalls are askingnscd> >> to resolve? *** >> >> Hopefully that will lead me (and the Sun engineer with whom I have a >> case open) to an understanding of what is leading to the presumed > timeouts. >> FWIW, here''s a partial truss with timestamps of an existingpkginstall> >> process. Once it hit the ioctl(3,...) it flew through the package >> installation. Note the times around the door_calls: 13, 10 and 15 >> seconds (the first isn''t a multiple of 5 since I started the truss >> during the call). >> >> Base time stamp: ***** 1210181159*.9172 [ Wed May 7 17:25:59 GMT > 2008 ] >> door_call(4, 0xFFBFD030) (sleeping...) >> >> *****1210472*.1302 door_call(4, 0xFFBFD030) > = 0 >> 1210472.1306 open("/dev/udp6", O_RDONLY) = 5 >> >> 1210472.1310 ioctl(5, SIOCGLIFNUM, 0xFFBFD2DC) = 0 >> >> 1210472.1312 close(5) = 0 >> >> 1210472.1312 getuid() = 0 > [0] >> 1210472.1313 getuid() = 0 > [0] >> *****1210472*.1314 door_info(4, 0xFFBFCF88) > = 0 >> door_call(4, 0xFFBFD030) (sleeping...) >> >> *****1210482*.1503 door_call(4, 0xFFBFD030) > = 0 >> 1210482.1505 getuid() = 0 > [0] >> 1210482.1505 getuid() = 0 > [0] >> *****1210482*.1506 door_info(4, 0xFFBFCF88) > = 0 >> door_call(4, 0xFFBFD030) (sleeping...) >> >> *****1210497*.1802 door_call(4, 0xFFBFD030) > = 0 >> 1210497.1804 ioctl(3, ((''m''<<8)|7), 0xFFBFD8EC) = 0 >> >> Thanks, >> Justin >> >> >> >------------------------------------------------------------------------>> _______________________________________________ >> dtrace-discuss mailing list >> dtrace-discuss at opensolaris.org >-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom _______________________________________________ dtrace-discuss mailing list dtrace-discuss at opensolaris.org
Brian Utterback
2008-May-22 19:24 UTC
[dtrace-discuss] Slow pkginstalls due to long door_calls to nscd
Looking at the code for get_mntinfo, it looks like it is parsing the entries in the vfstab/mnttab and doing a lookup on the host for each entry that has a ":" in it. I think it is rather interesting that the pfiles shows an error indication on port 53. I think you may have a problem with your DNS lookups. Justin Lloyd wrote:> Here''s some output from a little dtrace script that prints the stacks of > door_call() calls that take longer than 4 seconds: > > % sudo ./nscd_door.d > pid 17917 call to doorfs(4) took 15 seconds > pid 18029 call to doorfs(4) took 5 seconds > pid 18029 call to doorfs(4) took 15 seconds > pid 18029 call to doorfs(4) took 10 seconds > pid 18029 call to doorfs(4) took 15 seconds > pid 18146 call to doorfs(4) took 5 seconds > pid 18146 call to doorfs(4) took 15 seconds > ^C > > > 0xfeec6104 > 0xfee50070 > 0xfee5ef18 > 0xfee5d9fc > 0xff227da8 > 0xff226a90 > 0xff2241e8 > 0x338cc > 0x293e4 > 0x2a0c4 > 0x2a4b4 > 0x20f08 > 0x184a4 > 1 > > 0xfeec6104 > 0xfee50070 > 0xfee5ef18 > 0xfee5d9fc > 0xff227da8 > 0xff226a90 > 0xff2241e8 > 0x33894 > 0x293e4 > 0x2a0c4 > 0x2a4b4 > 0x20f08 > 0x184a4 > 1 > > 0xfeec6104 > 0xfee50070 > 0xfee5ef18 > 0xfee5d9fc > 0xff227da8 > 0xff226a90 > 0xff2241e8 > 0x338cc > 0x293e4 > 0x2a0c4 > 0x2a4b4 > 0x20f08 > 0x184a4 > 1 > > 0xfeec6104 > 0xfee50070 > 0xfee5ef18 > 0xfee5d9fc > 0xff227e78 > 0xff226ae8 > 0xff224118 > 0x33894 > 0x293e4 > 0x2a0c4 > 0x2a4b4 > 0x20f08 > 0x184a4 > 1 > > 0xfeec6104 > 0xfee50070 > 0xfee5ef18 > 0xfee5d9fc > 0xff227e78 > 0xff226ae8 > 0xff224118 > 0x338cc > 0x293e4 > 0x2a0c4 > 0x2a4b4 > 0x20f08 > 0x184a4 > 1 > > libc.so.1`_door_call+0x8 > libc.so.1`_nsc_trydoorcall_ext+0x1b8 > libc.so.1`_nsc_search+0xc0 > libc.so.1`nss_search+0x34 > libnsl.so.1`_switch_gethostbyname_r+0x58 > libnsl.so.1`_get_hostserv_inetnetdir_byname+0x844 > libnsl.so.1`getipnodebyname+0x1c4 > pkginstall`is_local_host+0x10 > pkginstall`0x293e4 > pkginstall`0x2a0c4 > pkginstall`get_mntinfo+0x80 > pkginstall`main+0x97c > pkginstall`_start+0x108 > 1 > > libc.so.1`_door_call+0x8 > libc.so.1`_nsc_trydoorcall_ext+0x1b8 > libc.so.1`_nsc_search+0xc0 > libc.so.1`nss_search+0x34 > libnsl.so.1`_switch_getipnodebyname_r+0x60 > libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c > libnsl.so.1`getipnodebyname+0xf4 > pkginstall`is_local_host+0x10 > pkginstall`0x293e4 > pkginstall`0x2a0c4 > pkginstall`get_mntinfo+0x80 > pkginstall`main+0x97c > pkginstall`_start+0x108 > 1 > [jlloyd at dapgold-b:~] > > -----Original Message----- > From: dtrace-discuss-bounces at opensolaris.org > [mailto:dtrace-discuss-bounces at opensolaris.org] On Behalf Of Justin > Lloyd > Sent: Thursday, May 22, 2008 11:14 AM > To: Brian Utterback > Cc: dtrace-discuss at opensolaris.org > Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls to > nscd > > Both lines are "files dns". > > I''ll look into the nscd process, but first I''m doing some dtrace > analysis of the long ustacks to see which ones are taking a long time, > just to make sure the mdb stack I provided wasn''t an outlier. > > Justin > > -----Original Message----- > From: Brian Utterback [mailto:brian.utterback at sun.com] > Sent: Thursday, May 22, 2008 10:29 AM > To: Justin Lloyd > Cc: dtrace-discuss at opensolaris.org > Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls to > nscd > > Okay, what does the host and ipnodes lines of your nsswitch.conf file > look like? You can also try the same pstack trick on the nscd process > to see where it is spending its time. > > Justin Lloyd wrote: >> This is strange: >> >> bash-3.00# pstop `pgrep pkginstall` >> bash-3.00# pstack `pgrep pkginstall` >> 24319: pkginstall -S -C -N pkgadd -R /zones/daf98/root -a >> /etc/lu/zones_pkgad >> feec6104 door (4, ffbfcef0, 6400, fe9e0000, 0, 3) >> bash-3.00# >> >> Based on the number of args, this appears to be a call to doorfs() > with >> a subcode of 3 (DOOR_CALL), leading to a tail call of door_call(4, >> ffbfcef0). However, it occurred to me to try mdb (with which I''m >> admittedly not an expert): >> >> bash-3.00# mdb -p `pgrep pkginstall` >> Loading modules: [ ld.so.1 libc.so.1 libnvpair.so.1 libavl.so.1 >> libuutil.so.1 libsysevent.so.1 libproc.so.1 ] >>> ::stack >> libc.so.1`_door_call+0xc(feef1668, ffbfd064, ffbfd060, ffbfd05c, >> ffbfcf74, fe9e0000) >> libc.so.1`_nsc_trydoorcall_ext+0x1b8(ffbfd064, ffbfd060, ffbfd05c, 0, > 0, >> deadbeed) >> libc.so.1`_nsc_search+0xc0(ff2a6848, ff223c14, 4, ffbfd138, fe9e0000, >> 7b1c8) >> libc.so.1`nss_search+0x34(ff2a6848, ff223c14, 4, ffbfd138, 1, 0) >> libnsl.so.1`_switch_getipnodebyname_r+0x60(ffbfd350, 78388, 7839c, > 2120, >> 2, 5) >> libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c(6, ffbfd270, > ffbfd268, >> 0, ffbfd204, ffbfd274) >> libnsl.so.1`getipnodebyname+0xf4(ffbfd350, 2, 5, ffbfd2ec, 1a, 2120) >> is_local_host+0x10(ffbfd350, 7014c, 5, 0, 63766673, 64617067) >> 0x293e4(763e8, 73, ffbfd354, 12, 1, 9) >> 0x2a0c4(ffbfe02c, ffbfe02c, 4f400, 4f400, 0, 77570) >> get_mntinfo+0x80(1, 0, 4f400, 0, ffbfe02c, 729e0) >> main+0x97c(19, ffbff76c, 73c00, ffbffa3f, 0, 0) >> _start+0x108(0, 0, 0, 0, 0, 0) >>> _door_call::dis >> libc.so.1`_door_call: mov 3, %o5 >> libc.so.1`_door_call+4: mov 0xc9, %g1 >> libc.so.1`_door_call+8: ta 8 >> libc.so.1`_door_call+0xc: blu -0xa2ca8 >> <libc.so.1`_cerror> >> libc.so.1`_door_call+0x10: nop >> libc.so.1`_door_call+0x14: retl >> libc.so.1`_door_call+0x18: nop >> >> The call in pkginstall to is_local_host() seems to be what is sparking >> the lookup. The beginning of my /etc/hosts has the following >> >> ::1 localhost >> 127.0.0.1 localhost loghost >> >> Is something wrong with that? >> >> Justin >> >> >> -----Original Message----- >> From: Brian Utterback [mailto:brian.utterback at sun.com] >> Sent: Thursday, May 22, 2008 7:45 AM >> To: Justin Lloyd >> Cc: dtrace-discuss at opensolaris.org >> Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls > to >> nscd >> >> Well, the simple way to find out is to stop it and look. Or even to >> not stop it. >> >> Since it is spending enough time in the door call for you to catch it >> with a truss, instead of running truss, try running pstack on the same > >> processes. I bet it will be immediately obvious what is being looked > >> up by the stack. >> >> Justin Lloyd wrote: >>> Hi all, >>> >>> I am installing a zone onto two different V445s running S10U4 and the > >>> zones are taking hours to install (about 1000 packages), that is, the > >>> problem is identical on both systems. A bit of trussing and dtracing >> has >>> shown that the pkginstalls being run by the zoneadm install are > making >>> door_call calls to nscd that are taking very long, so far observed to >> be >>> 5 to 40 seconds, but always in times divisible by 5 seconds (5, 10, >> 15, >>> etc.), leading to the thought that it is one or more name service >>> timeouts. Eventually, once the door_calls finish, the actual package >>> install is very quick. >>> >>> There may be a simple misconfiguration on both systems, but I can''t >>> figure out what that might be. Additionally, I don''t know why the >>> pkginstalls are making door_calls to begin with, and I''m not sure how >> to >>> figure that out. So that''s one my main question: >>> >>> *** How can I find out exactly what the pkginstalls are asking > nscd >>> to resolve? *** >>> >>> Hopefully that will lead me (and the Sun engineer with whom I have a >>> case open) to an understanding of what is leading to the presumed >> timeouts. >>> FWIW, here''s a partial truss with timestamps of an existing > pkginstall >>> process. Once it hit the ioctl(3,...) it flew through the package >>> installation. Note the times around the door_calls: 13, 10 and 15 >>> seconds (the first isn''t a multiple of 5 since I started the truss >>> during the call). >>> >>> Base time stamp: ***** 1210181159*.9172 [ Wed May 7 17:25:59 GMT >> 2008 ] >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210472*.1302 door_call(4, 0xFFBFD030) >> = 0 >>> 1210472.1306 open("/dev/udp6", O_RDONLY) = 5 >>> >>> 1210472.1310 ioctl(5, SIOCGLIFNUM, 0xFFBFD2DC) = 0 >>> >>> 1210472.1312 close(5) = 0 >>> >>> 1210472.1312 getuid() = 0 >> [0] >>> 1210472.1313 getuid() = 0 >> [0] >>> *****1210472*.1314 door_info(4, 0xFFBFCF88) >> = 0 >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210482*.1503 door_call(4, 0xFFBFD030) >> = 0 >>> 1210482.1505 getuid() = 0 >> [0] >>> 1210482.1505 getuid() = 0 >> [0] >>> *****1210482*.1506 door_info(4, 0xFFBFCF88) >> = 0 >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210497*.1802 door_call(4, 0xFFBFD030) >> = 0 >>> 1210497.1804 ioctl(3, ((''m''<<8)|7), 0xFFBFD8EC) = 0 >>> >>> Thanks, >>> Justin >>> >>> >>> > ------------------------------------------------------------------------ >>> _______________________________________________ >>> dtrace-discuss mailing list >>> dtrace-discuss at opensolaris.org >-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
Justin Lloyd
2008-May-22 21:30 UTC
[dtrace-discuss] Slow pkginstalls due to long door_calls to nscd
BAM! It looks like the second of our two nameserver entries in the global zone''s /etc/resolv.conf is not pingable from this host (probably a routing or firewall issue). Once I removed it, packages started flying. Looks like this issue is resolved, but I''ve certainly learned a lot from the exercise. Huge thanks to everyone for the valuable, informative responses! Justin -----Original Message----- From: Brian Utterback [mailto:brian.utterback at sun.com] Sent: Thursday, May 22, 2008 1:24 PM To: Justin Lloyd Cc: dtrace-discuss at opensolaris.org Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls to nscd Looking at the code for get_mntinfo, it looks like it is parsing the entries in the vfstab/mnttab and doing a lookup on the host for each entry that has a ":" in it. I think it is rather interesting that the pfiles shows an error indication on port 53. I think you may have a problem with your DNS lookups. Justin Lloyd wrote:> Here''s some output from a little dtrace script that prints the stacksof> door_call() calls that take longer than 4 seconds: > > % sudo ./nscd_door.d > pid 17917 call to doorfs(4) took 15 seconds > pid 18029 call to doorfs(4) took 5 seconds > pid 18029 call to doorfs(4) took 15 seconds > pid 18029 call to doorfs(4) took 10 seconds > pid 18029 call to doorfs(4) took 15 seconds > pid 18146 call to doorfs(4) took 5 seconds > pid 18146 call to doorfs(4) took 15 seconds > ^C > > > 0xfeec6104 > 0xfee50070 > 0xfee5ef18 > 0xfee5d9fc > 0xff227da8 > 0xff226a90 > 0xff2241e8 > 0x338cc > 0x293e4 > 0x2a0c4 > 0x2a4b4 > 0x20f08 > 0x184a4 > 1 > > 0xfeec6104 > 0xfee50070 > 0xfee5ef18 > 0xfee5d9fc > 0xff227da8 > 0xff226a90 > 0xff2241e8 > 0x33894 > 0x293e4 > 0x2a0c4 > 0x2a4b4 > 0x20f08 > 0x184a4 > 1 > > 0xfeec6104 > 0xfee50070 > 0xfee5ef18 > 0xfee5d9fc > 0xff227da8 > 0xff226a90 > 0xff2241e8 > 0x338cc > 0x293e4 > 0x2a0c4 > 0x2a4b4 > 0x20f08 > 0x184a4 > 1 > > 0xfeec6104 > 0xfee50070 > 0xfee5ef18 > 0xfee5d9fc > 0xff227e78 > 0xff226ae8 > 0xff224118 > 0x33894 > 0x293e4 > 0x2a0c4 > 0x2a4b4 > 0x20f08 > 0x184a4 > 1 > > 0xfeec6104 > 0xfee50070 > 0xfee5ef18 > 0xfee5d9fc > 0xff227e78 > 0xff226ae8 > 0xff224118 > 0x338cc > 0x293e4 > 0x2a0c4 > 0x2a4b4 > 0x20f08 > 0x184a4 > 1 > > libc.so.1`_door_call+0x8 > libc.so.1`_nsc_trydoorcall_ext+0x1b8 > libc.so.1`_nsc_search+0xc0 > libc.so.1`nss_search+0x34 > libnsl.so.1`_switch_gethostbyname_r+0x58 > libnsl.so.1`_get_hostserv_inetnetdir_byname+0x844 > libnsl.so.1`getipnodebyname+0x1c4 > pkginstall`is_local_host+0x10 > pkginstall`0x293e4 > pkginstall`0x2a0c4 > pkginstall`get_mntinfo+0x80 > pkginstall`main+0x97c > pkginstall`_start+0x108 > 1 > > libc.so.1`_door_call+0x8 > libc.so.1`_nsc_trydoorcall_ext+0x1b8 > libc.so.1`_nsc_search+0xc0 > libc.so.1`nss_search+0x34 > libnsl.so.1`_switch_getipnodebyname_r+0x60 > libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c > libnsl.so.1`getipnodebyname+0xf4 > pkginstall`is_local_host+0x10 > pkginstall`0x293e4 > pkginstall`0x2a0c4 > pkginstall`get_mntinfo+0x80 > pkginstall`main+0x97c > pkginstall`_start+0x108 > 1 > [jlloyd at dapgold-b:~] > > -----Original Message----- > From: dtrace-discuss-bounces at opensolaris.org > [mailto:dtrace-discuss-bounces at opensolaris.org] On Behalf Of Justin > Lloyd > Sent: Thursday, May 22, 2008 11:14 AM > To: Brian Utterback > Cc: dtrace-discuss at opensolaris.org > Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_callsto> nscd > > Both lines are "files dns". > > I''ll look into the nscd process, but first I''m doing some dtrace > analysis of the long ustacks to see which ones are taking a long time, > just to make sure the mdb stack I provided wasn''t an outlier. > > Justin > > -----Original Message----- > From: Brian Utterback [mailto:brian.utterback at sun.com] > Sent: Thursday, May 22, 2008 10:29 AM > To: Justin Lloyd > Cc: dtrace-discuss at opensolaris.org > Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_callsto> nscd > > Okay, what does the host and ipnodes lines of your nsswitch.conf file > look like? You can also try the same pstack trick on the nscd process > to see where it is spending its time. > > Justin Lloyd wrote: >> This is strange: >> >> bash-3.00# pstop `pgrep pkginstall` >> bash-3.00# pstack `pgrep pkginstall` >> 24319: pkginstall -S -C -N pkgadd -R /zones/daf98/root -a >> /etc/lu/zones_pkgad >> feec6104 door (4, ffbfcef0, 6400, fe9e0000, 0, 3) >> bash-3.00# >> >> Based on the number of args, this appears to be a call to doorfs() > with >> a subcode of 3 (DOOR_CALL), leading to a tail call of door_call(4, >> ffbfcef0). However, it occurred to me to try mdb (with which I''m >> admittedly not an expert): >> >> bash-3.00# mdb -p `pgrep pkginstall` >> Loading modules: [ ld.so.1 libc.so.1 libnvpair.so.1 libavl.so.1 >> libuutil.so.1 libsysevent.so.1 libproc.so.1 ] >>> ::stack >> libc.so.1`_door_call+0xc(feef1668, ffbfd064, ffbfd060, ffbfd05c, >> ffbfcf74, fe9e0000) >> libc.so.1`_nsc_trydoorcall_ext+0x1b8(ffbfd064, ffbfd060, ffbfd05c, 0, > 0, >> deadbeed) >> libc.so.1`_nsc_search+0xc0(ff2a6848, ff223c14, 4, ffbfd138, fe9e0000, >> 7b1c8) >> libc.so.1`nss_search+0x34(ff2a6848, ff223c14, 4, ffbfd138, 1, 0) >> libnsl.so.1`_switch_getipnodebyname_r+0x60(ffbfd350, 78388, 7839c, > 2120, >> 2, 5) >> libnsl.so.1`_get_hostserv_inetnetdir_byname+0x89c(6, ffbfd270, > ffbfd268, >> 0, ffbfd204, ffbfd274) >> libnsl.so.1`getipnodebyname+0xf4(ffbfd350, 2, 5, ffbfd2ec, 1a, 2120) >> is_local_host+0x10(ffbfd350, 7014c, 5, 0, 63766673, 64617067) >> 0x293e4(763e8, 73, ffbfd354, 12, 1, 9) >> 0x2a0c4(ffbfe02c, ffbfe02c, 4f400, 4f400, 0, 77570) >> get_mntinfo+0x80(1, 0, 4f400, 0, ffbfe02c, 729e0) >> main+0x97c(19, ffbff76c, 73c00, ffbffa3f, 0, 0) >> _start+0x108(0, 0, 0, 0, 0, 0) >>> _door_call::dis >> libc.so.1`_door_call: mov 3, %o5 >> libc.so.1`_door_call+4: mov 0xc9, %g1 >> libc.so.1`_door_call+8: ta 8 >> libc.so.1`_door_call+0xc: blu -0xa2ca8 >> <libc.so.1`_cerror> >> libc.so.1`_door_call+0x10: nop >> libc.so.1`_door_call+0x14: retl >> libc.so.1`_door_call+0x18: nop >> >> The call in pkginstall to is_local_host() seems to be what issparking>> the lookup. The beginning of my /etc/hosts has the following >> >> ::1 localhost >> 127.0.0.1 localhost loghost >> >> Is something wrong with that? >> >> Justin >> >> >> -----Original Message----- >> From: Brian Utterback [mailto:brian.utterback at sun.com] >> Sent: Thursday, May 22, 2008 7:45 AM >> To: Justin Lloyd >> Cc: dtrace-discuss at opensolaris.org >> Subject: Re: [dtrace-discuss] Slow pkginstalls due to long door_calls > to >> nscd >> >> Well, the simple way to find out is to stop it and look. Or even to >> not stop it. >> >> Since it is spending enough time in the door call for you to catch it>> with a truss, instead of running truss, try running pstack on thesame> >> processes. I bet it will be immediately obvious what is beinglooked> >> up by the stack. >> >> Justin Lloyd wrote: >>> Hi all, >>> >>> I am installing a zone onto two different V445s running S10U4 andthe> >>> zones are taking hours to install (about 1000 packages), that is,the> >>> problem is identical on both systems. A bit of trussing and dtracing >> has >>> shown that the pkginstalls being run by the zoneadm install are > making >>> door_call calls to nscd that are taking very long, so far observedto>> be >>> 5 to 40 seconds, but always in times divisible by 5 seconds (5, 10, >> 15, >>> etc.), leading to the thought that it is one or more name service >>> timeouts. Eventually, once the door_calls finish, the actual package>>> install is very quick. >>> >>> There may be a simple misconfiguration on both systems, but I can''t >>> figure out what that might be. Additionally, I don''t know why the >>> pkginstalls are making door_calls to begin with, and I''m not surehow>> to >>> figure that out. So that''s one my main question: >>> >>> *** How can I find out exactly what the pkginstalls are asking > nscd >>> to resolve? *** >>> >>> Hopefully that will lead me (and the Sun engineer with whom I have a>>> case open) to an understanding of what is leading to the presumed >> timeouts. >>> FWIW, here''s a partial truss with timestamps of an existing > pkginstall >>> process. Once it hit the ioctl(3,...) it flew through the package >>> installation. Note the times around the door_calls: 13, 10 and 15 >>> seconds (the first isn''t a multiple of 5 since I started the truss >>> during the call). >>> >>> Base time stamp: ***** 1210181159*.9172 [ Wed May 7 17:25:59 GMT >> 2008 ] >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210472*.1302 door_call(4, 0xFFBFD030) >> = 0 >>> 1210472.1306 open("/dev/udp6", O_RDONLY) = 5 >>> >>> 1210472.1310 ioctl(5, SIOCGLIFNUM, 0xFFBFD2DC) = 0 >>> >>> 1210472.1312 close(5) = 0 >>> >>> 1210472.1312 getuid() = 0 >> [0] >>> 1210472.1313 getuid() = 0 >> [0] >>> *****1210472*.1314 door_info(4, 0xFFBFCF88) >> = 0 >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210482*.1503 door_call(4, 0xFFBFD030) >> = 0 >>> 1210482.1505 getuid() = 0 >> [0] >>> 1210482.1505 getuid() = 0 >> [0] >>> *****1210482*.1506 door_info(4, 0xFFBFCF88) >> = 0 >>> door_call(4, 0xFFBFD030) (sleeping...) >>> >>> *****1210497*.1802 door_call(4, 0xFFBFD030) >> = 0 >>> 1210497.1804 ioctl(3, ((''m''<<8)|7), 0xFFBFD8EC) = 0 >>> >>> Thanks, >>> Justin >>> >>> >>> >------------------------------------------------------------------------>>> _______________________________________________ >>> dtrace-discuss mailing list >>> dtrace-discuss at opensolaris.org >-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom