Hello,

The "ps -ef" command in local zones randomly takes about 15 to 20 seconds, but works perfectly in the global zone. truss shows that the time is spent in the pollsys syscall (the process goes to sleep) after putmsg.

Most of the syscalls traced by dtrace are ioctl(), gtime(), read() and write(), issued by application processes that are all single-threaded. These syscalls do not seem to be related to the pollsys sleep, because the ps slowness happened at times when the ioctl() call count was over 50000 and also when it was over 5000. There is over 40 GB of freemem and the CPU is always 80% idle. Short-lived processes took about 0.7 to 1.8 seconds out of every 5 seconds.

Could those short-lived processes cause the slowness? Is there a way to find out what causes pollsys to go to sleep?

Thanks,

James Yang
Email : jianhua.yang at db.com

---
This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

On Wed, Nov 12, 2008 at 9:30 PM, Jianhua Yang <jianhua.yang at db.com> wrote:
> Hello,
>
> The "ps -ef" command in local zones randomly takes about 15 to 20 seconds,
> but works perfectly in the global zone. truss shows that the time is spent
> in the pollsys syscall (the process goes to sleep) after putmsg.

This is almost always slowness waiting for a name service (e.g. NIS or LDAP). My guess is that if you go into a directory with files owned by a lot of different people (try /tmp or /var/tmp) and do "ls -l", you will see similar slowness. Also, "ps -o uid -e" should be fast and "ps -o user -e" will be slow.

> Most of the syscalls traced by dtrace are ioctl(), gtime(), read() and
> write(), issued by application processes that are all single-threaded.
> These syscalls do not seem to be related to the pollsys sleep, because the
> ps slowness happened at times when the ioctl() call count was over 50000
> and also when it was over 5000. There is over 40 GB of freemem and the CPU
> is always 80% idle. Short-lived processes took about 0.7 to 1.8 seconds
> out of every 5 seconds. Could those short-lived processes cause the
> slowness?

Add ustack() to the data you are collecting. I bet that you will find that there are getpwuid() calls in the stack traces during the pauses.

--
Mike Gerdts
http://mgerdts.blogspot.com/
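
The UID-to-name theory above can also be checked without ps by timing the lookups directly. The following is only a minimal sketch and is not from this thread: it assumes Python is available in the zone, and the helper names time_uid_lookups and slow_lookups and the 1-second threshold are made up for illustration. It calls getpwuid() through Python's standard pwd module, so each lookup should go through the same name service path that "ps -o user -e" would use.

```python
import os
import pwd
import time


def time_uid_lookups(uids):
    """Time one getpwuid() call per UID.

    Returns a list of (uid, name_or_None, seconds) tuples.
    """
    results = []
    for uid in uids:
        start = time.perf_counter()
        try:
            name = pwd.getpwuid(uid).pw_name
        except KeyError:
            # No passwd entry for this UID; the failed lookup is still timed.
            name = None
        elapsed = time.perf_counter() - start
        results.append((uid, name, elapsed))
    return results


def slow_lookups(results, threshold=1.0):
    """Keep only the lookups that took at least `threshold` seconds."""
    return [r for r in results if r[2] >= threshold]


if __name__ == "__main__":
    # Hypothetical sample: root plus the current user. Replace with the
    # UIDs that actually own processes in the affected zone.
    sample = [0, os.getuid()]
    for uid, name, secs in time_uid_lookups(sample):
        print("uid=%d name=%s %.3f s" % (uid, name, secs))
```

If individual lookups take seconds rather than milliseconds, that would point at the name service (or a missing cache in front of it) rather than at ps itself.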

>The "ps -ef" command in local zones randomly takes about 15 to 20 seconds,
>but works perfectly in the global zone. truss shows that the time is spent
>in the pollsys syscall (the process goes to sleep) after putmsg.
>
>Could those short-lived processes cause the slowness?

Use dtrace and print ustack() when pollsys takes too long.

I suspect this has to do with the name service.

Casper

Hi Mike,

Thanks a lot!

Yes, ustack() shows getpwuid() when ps is slow. It turned out to be caused by nscd not running: it had been disabled because it was dumping core.

Thanks,

James Yang
Global Unix Support, IES, GTO
Deutsche Bank US
Phone: 201-593-1360
Email : jianhua.yang at db.com
Pager : 1-800-946-4646 PIN# 6105618
CR: NYC_UNIX_ES_US_UNIX_SUPPORT
http://dcsupport.ies.gto.intranet.db.com/

"Mike Gerdts" <mgerdts at gmail.com> wrote on 11/12/08 10:40 PM:
To: Jianhua Yang/db/dbcom at DBAmericas
cc: dtrace-discuss at opensolaris.org
Subject: Re: [dtrace-discuss] ps slowness in local zones

> This is almost always slowness waiting for a name service (e.g. NIS or
> LDAP).
> [...]
> Add ustack() to the data you are collecting. I bet that you will find that
> there are getpwuid() calls in the stack traces during the pauses.