Hi,

I really need help.

How can I identify why my processes are 100% in LCK and SLP?

   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  30   0  18   0 java/28
  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   2   0   1   0 java/94
  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   3   0   1   0 java/93
  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   1   0   0   0 java/100
  6769 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   1   0   1   0 java/18
  6769 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 java/49
  6769 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 java/47
  6769 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 java/48
  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 java/32
  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 java/31
  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 java/30
  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 java/29
  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 java/27
  6769 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 java/26
  6769 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 java/25
  6769 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 java/24
  6769 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 java/23

Regards,

------------------------------------------------------------------

Kleyson Rios.
Gerência de Suporte Técnico
Analista de Suporte / Líder de Equipe

Governo do Estado de Goiás
Agência de Administração
Diretoria de Informática
+55 62 3201-6582

"Do you know the difference between successful people and unsuccessful
people? Successful people are willing to do the things that unsuccessful
people are not."
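For reference, per-LWP microstate output like the table above comes from prstat's
microstate (-m) and per-LWP (-L) options; a minimal sketch, reusing the PID from
the post and a 5-second sampling interval:

    # prstat -mLp 6769 5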
Kleyson Rios wrote:
> Hi,
>
> I really need help.
>
> How can I identify why my processes are 100% in LCK and SLP?
>
>   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
>  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  30   0  18   0 java/28
> (...)
>  6769 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   0   0   0   0 java/23

from the prstat man-page:

     LCK

         The percentage of time the process has spent waiting for
         user locks.

     SLP

         The percentage of time the process has spent sleeping.

it looks like your processes have deadlocked themselves - not much one can
do without knowledge about what they're actually doing and access to the
source code. I'd talk to the people developing this app.

Michael
--
Michael Schuster            http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'
Michael Schuster wrote:
> Kleyson Rios wrote:
>> Hi,
>>
>> I really need help.
>>
>> How can I identify why my processes are 100% in LCK and SLP?
>>
>>   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
>>  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  30   0  18   0 java/28
>> (...)
>
> from the prstat man-page:
>
>      LCK
>
>          The percentage of time the process has spent waiting for
>          user locks.
>
>      SLP
>
>          The percentage of time the process has spent sleeping.
>
> it looks like your processes have deadlocked themselves - not much one can
> do without knowledge about what they're actually doing and access to the
> source code. I'd talk to the people developing this app.
>
> Michael

If it is a deadlock, and if it is a recent Java virtual machine, then
you should be able to connect to it with jconsole (from the same directory
where you get your java command), go to the "Threads" tab and use the
"Detect Deadlock" button to get a thread stack dump showing where you
have a cycle among your Java-level locks.

			... peter
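If a GUI is not convenient, the same deadlock check can be done from a
command-line thread dump. A sketch, assuming a JDK 5 or later java and reusing
the PID from the post; HotSpot reports any Java-level deadlock it finds at the
end of the dump:

    # jstack 6769 > /tmp/threads.txt    # or: kill -QUIT 6769 (dump goes to the JVM's stdout)
    # grep -i deadlock /tmp/threads.txt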
It could very well be that your process is waiting on condition variables and
this behavior is normal. Please see the attached script as a place to start.
A pstack of the process would also be informative.

rick

On Wed, Apr 16, 2008 at 12:35:53PM -0300, Kleyson Rios wrote:
> Hi,
>
> I really need help.
>
> How can I identify why my processes are 100% in LCK and SLP?
>
>   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
>  6769 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  30   0  18   0 java/28
> (...)
>
> Regards,
>
> Kleyson Rios.
--
Rickey C. Weisner
Software Development and Performance Specialist
Sun Microsystems, INC
cell phone: 615-308-1147
email: rick.weisner at sun.com
-------------- next part --------------
#!/usr/sbin/dtrace -s

sched:::off-cpu
/ arg1 != 0 && pid == $1 && curlwpsinfo->pr_state == SSLEEP/
{
	/*
	 * We're sleeping.  Track our sobj type.
	 */
	self->sobj = curlwpsinfo->pr_stype;
	self->bedtime = timestamp;
}

sched:::off-cpu
/ arg1 != 0 && pid == $1 && curlwpsinfo->pr_state == SRUN/
{
	self->bedtime = timestamp;
	/* @s[pid,ustack(5)]=count(); */
}

sched:::on-cpu
/self->bedtime && !self->sobj /
{
	@[execname,pid,"preempted"] = sum(timestamp - self->bedtime);
	/* @c[execname,pid,"preempted"]=count(); */
	self->bedtime = 0;
}

sched:::on-cpu
/self->sobj /
{
	@[execname,pid,self->sobj == SOBJ_MUTEX ? "kernel-level lock" :
	    self->sobj == SOBJ_RWLOCK ? "rwlock" :
	    self->sobj == SOBJ_CV ? "condition variable" :
	    self->sobj == SOBJ_SEMA ? "semaphore" :
	    self->sobj == SOBJ_USER ? "user-level lock" :
	    self->sobj == SOBJ_USER_PI ? "user-level prio-inheriting lock" :
	    self->sobj == SOBJ_SHUTTLE ? "shuttle" : "unknown"] =
	    sum(timestamp - self->bedtime);
	/*
	@c[execname,pid,self->sobj == SOBJ_MUTEX ? "kernel-level lock" :
	    self->sobj == SOBJ_RWLOCK ? "rwlock" :
	    self->sobj == SOBJ_CV ? "condition variable" :
	    self->sobj == SOBJ_SEMA ? "semaphore" :
	    self->sobj == SOBJ_USER ? "user-level lock" :
	    self->sobj == SOBJ_USER_PI ? "user-level prio-inheriting lock" :
	    self->sobj == SOBJ_SHUTTLE ? "shuttle" : "unknown"] = count();
	*/
	self->sobj = 0;
	self->bedtime = 0;
}

tick-10sec
{
	exit(0);
}
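A usage sketch for the script above, assuming it is saved as sobj.d (the file
name is illustrative); it takes the target PID as its first argument ($1) and
exits after 10 seconds, printing off-CPU time in nanoseconds per sleep reason:

    # chmod +x sobj.d
    # ./sobj.d 6769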
Peter B. Kessler's email at 4/16/08 2:01 PM, said:
> Michael Schuster wrote:
>> Kleyson Rios wrote:
>>> How can I identify why my processes are 100% in LCK and SLP?
>>> (...)
>>
>> it looks like your processes have deadlocked themselves - not much one can
>> do without knowledge about what they're actually doing and access to the
>> source code. I'd talk to the people developing this app.
>
> If it is a deadlock, and if it is a recent Java virtual machine, then
> you should be able to connect to it with jconsole (from the same directory
> where you get your java command), go to the "Threads" tab and use the
> "Detect Deadlock" button to get a thread stack dump showing where you
> have a cycle among your Java-level locks.
>
> ... peter

*Highly* unlikely it is a deadlock. Insanely, Solaris charges time in
condition wait ([pthread_]cond_wait or lwp_cond_wait) as lock time. This has
never made sense to me, and I have yet to wield a big enough stick to get it
fixed.

Take either a Java thread dump or a pstack(1) dump. If a Java dump, you will
be in Object.wait (or similar) or sleep(). If a pstack(1), you will be in
some form of cond_wait.

If it really is a deadlock (and I'll buy you a beer if it is because I am
sure it isn't), Peter's suggestion is the way to go.

As for sleep, the threads are tired. Long day of garbage collection I
suppose. Seriously though, they have no (or very little) work to do since
the top thread is 100% sleep but made 18 system calls (rounding error).

Thanks,

Jarod
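For example (PID and LWP reused from the prstat output above), a per-LWP stack
makes the cond_wait case easy to spot; a thread whose stack tops out in
cond_wait/lwp_park is parked on a condition variable, not spinning on a
contended lock:

    # pstack 6769/28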
Thanks, everybody.

I will try to do what you guys said. When I have new results I will post again.

Regards,

Kleyson Rios.

-----Original Message-----
From: Jarod Jenson [mailto:jarod at aeysis.com]
Sent: Wednesday, April 16, 2008 4:56 PM
To: Peter B. Kessler
Cc: Kleyson Rios; dtrace-discuss at opensolaris.org
Subject: Re: [dtrace-discuss] Process in LCK / SLP (Please)

(...)

*Highly* unlikely it is a deadlock. Insanely, Solaris charges time in
condition wait ([pthread_]cond_wait or lwp_cond_wait) as lock time.

(...)

Thanks,

Jarod
przemolicc at poczta.fm  2008-Apr-18 09:49 UTC  [dtrace-discuss] Process in LCK / SLP (Please)
On Wed, Apr 16, 2008 at 12:35:53PM -0300, Kleyson Rios wrote:
> Hi,
>
> I really need help.
>
> How can I identify why my processes are 100% in LCK and SLP?

Could you please use 'plockstat' and show us the output ?
E.g.: plockstat -e 30 -s 10 -A -p <pid of java proc>

Regards
przemol

--
http://przemol.blogspot.com/
Hi przemol,

Below is the output of plockstat for malloc and for libumem. Both show many
locks. Why didn't I get fewer locks after changing to libumem?

********** Plockstat using malloc (many many locks):

Mutex hold

Count     nsec Lock                          Caller
-------------------------------------------------------------------------------
 4036    14391 libc.so.1`libc_malloc_lock    libjava.so`JNU_GetStringPlatformChars+0x290
 2118     7385 libc.so.1`libc_malloc_lock    libjava.so`JNU_ReleaseStringPlatformChars+0x18
 3174     4700 libc.so.1`libc_malloc_lock    libjvm.so`__1cCosGmalloc6FI_pv_+0x29
  181    63407 libc.so.1`_uberdata+0x40      libjvm.so`__1cCosRpd_suspend_thread6FpnGThread_i_i_+0x46
 3170     3407 libc.so.1`libc_malloc_lock    libjvm.so`__1cCosEfree6Fpv_v_+0x18
  588    12443 libc.so.1`libc_malloc_lock    libnet.so`Java_java_net_SocketOutputStream_socketWrite0+0x7f
  172    37572 libc.so.1`_uberdata+0x40      libjvm.so`__1cCosQpd_resume_thread6FpnGThread__i_+0x1f
  176    26701 libc.so.1`libc_malloc_lock    libjvm.so`__1cCosGmalloc6FI_pv_+0x29
  596     7124 libc.so.1`libc_malloc_lock    libnet.so`Java_java_net_SocketOutputStream_socketWrite0+0x1e0
  450     7254 libc.so.1`libc_malloc_lock    0x858b2167
(...)

********** Plockstat using libumem (many many locks too):

Mutex hold

Count     nsec Lock                          Caller
-------------------------------------------------------------------------------
    4  1450455 libumem.so.1`umem_cache_lock  libumem.so.1`umem_cache_applyall+0x51
  100    46388 libc.so.1`_uberdata+0x40      libjvm.so`__1cCosRpd_suspend_thread6FpnGThread_i_i_+0x46
  100    23226 libc.so.1`_uberdata+0x40      libjvm.so`__1cCosQpd_resume_thread6FpnGThread__i_+0x1f
  486     4314 0x807b680                     libumem.so.1`umem_cache_alloc+0xc1
  488     4236 0x807b680                     libumem.so.1`umem_cache_free+0x194
  356     4859 0x807b300                     libumem.so.1`umem_cache_alloc+0xc1
  150    11499 0x8073030                     libumem.so.1`vmem_xfree+0xfe
  115    14473 0x8073030                     libumem.so.1`vmem_alloc+0x13c
  297     5374 0x807ad80                     libumem.so.1`umem_cache_alloc+0xc1
  362     4258 0x807b300                     libumem.so.1`umem_cache_free+0x194
  297     4304 0x807ad80                     libumem.so.1`umem_cache_free+0x194
   48    21805 0x807b6c0                     libumem.so.1`umem_cache_free+0x194
  150     6635 libumem.so.1`vmem0+0x30       libumem.so.1`vmem_alloc+0x126
(...)

Mutex hold
-------------------------------------------------------------------------------
Count     nsec Lock                          Caller
  543    13110 0x807b680                     libumem.so.1`umem_cache_alloc+0xc1

  nsec ---- Time Distribution --- count     Stack
  8192 |@@@@@@@@@@@@            |   289     libumem.so.1`umem_cache_alloc+0xc1
 16384 |@@@@@@@@@               |   224     libumem.so.1`umem_alloc+0x3f
 32768 |@                       |    27     libumem.so.1`malloc+0x23
 65536 |                        |     3     libjava.so`JNU_GetStringPlatformChars+0x290
                                            0x20ac
-------------------------------------------------------------------------------
Count     nsec Lock                          Caller
   78    89901 libc.so.1`_uberdata+0x40      libjvm.so`__1cCosRpd_suspend_thread6FpnGThread_i_i_+0x46

  nsec ---- Time Distribution --- count     Stack
 65536 |@@@@@@@@@@@@@@@@@       |    57     libc.so.1`fork_lock_exit+0x2f
131072 |@@@@@                   |    19     libc.so.1`_thrp_suspend+0x22c
262144 |                        |     1     libc.so.1`thr_suspend+0x1a
524288 |                        |     1     libjvm.so`__1cCosRpd_suspend_thread6FpnGThread_i_i_+0x46
                                            libjvm.so`__1cGThreadNdo_vm_suspend6M_i_+0x42
                                            libjvm.so`__1cGThreadKvm_suspend6M_i_+0x33
                                            libjvm.so`__1cUThreadSafepointStateXexamine_state_of_thread6Mi_v_+0x101
                                            libjvm.so`__1cUSafepointSynchronizeFbegin6F_v_+0x12b
                                            libjvm.so`__1cIVMThreadEloop6M_v_+0x1b4
(...)

Regards.

Kleyson Rios.
-----Original Message-----
From: dtrace-discuss-bounces at opensolaris.org
[mailto:dtrace-discuss-bounces at opensolaris.org] On behalf of przemolicc at poczta.fm
Sent: Friday, April 18, 2008 6:49 AM
To: dtrace-discuss at opensolaris.org
Subject: Re: [dtrace-discuss] Process in LCK / SLP (Please)

On Wed, Apr 16, 2008 at 12:35:53PM -0300, Kleyson Rios wrote:
> Hi,
>
> I really need help.
>
> How can I identify why my processes are 100% in LCK and SLP?

Could you please use 'plockstat' and show us the output ?
E.g.: plockstat -e 30 -s 10 -A -p <pid of java proc>

We had such a case in the past. I don't think it is the same reason, but it
might give you some ideas.
We had a lot of locks. By using plockstat it turned out that those locks
came from malloc. Using mtmalloc or libumem reduced the number of locks, but
it was still a huge number (and eating our CPU). By using dtrace it turned
out that those locks came from threads which were ... supposed to be killed.
Our programmers found a bug in the application (already unnecessary threads
were still alive, all of them going in an endless loop), changed the code
and ... all the locks went away.

As Jim said: start with watching who is using malloc (which thread(s)).

Regards
przemol

--
http://przemol.blogspot.com/
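A minimal sketch of "watching who is using malloc": aggregate malloc calls by
thread ID so the allocations can be tied back to specific LWPs (the PID
placeholder follows the convention used later in the thread):

    # dtrace -n 'pid$target::malloc:entry { @[tid] = count(); }' -p [PID_OF_JVM]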
On Apr 18, 2008, at 1:03 PM, Kleyson Rios wrote:
> Hi przemol,
>
> Below is the output of plockstat for malloc and for libumem. Both show many
> locks. Why didn't I get fewer locks after changing to libumem?

You're looking at Mutex hold statistics, which don't mean a lot
(unless contention is caused by long hold times)

The important thing for multi-threaded performance is *contention*.
(Spinning and blocking)  Those are the statistics you should be
looking at.

Both malloc and libumem use locks to protect their state; libumem
just uses many locks, in order to reduce contention.

Cheers,
- jonathan

> (...)
--------------------------------------------------------------------------
Jonathan Adams, Sun Microsystems, ZFS Team      http://blogs.sun.com/jwadams
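To look at contention rather than hold times directly, plockstat can be
restricted to contention events; a sketch, reusing the PID that appears later
in the thread:

    # plockstat -C -e 30 -p 25736     # spin/block (contention) events only
    # plockstat -A -e 30 -p 25736     # -A reports both hold and contention events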
przemolicc at poczta.fm  2008-Apr-21 08:37 UTC  [dtrace-discuss] RES: Process in LCK / SLP (Please)
On Fri, Apr 18, 2008 at 02:40:15PM -0400, Jonathan Adams wrote:
> On Apr 18, 2008, at 1:03 PM, Kleyson Rios wrote:
>> Hi przemol,
>>
>> Below is the output of plockstat for malloc and for libumem. Both show many
>> locks. Why didn't I get fewer locks after changing to libumem?
>
> You're looking at Mutex hold statistics, which don't mean a lot
> (unless contention is caused by long hold times)
>
> The important thing for multi-threaded performance is *contention*.
> (Spinning and blocking)  Those are the statistics you should be
> looking at.
>
> Both malloc and libumem use locks to protect their state; libumem
> just uses many locks, in order to reduce contention.

So it seems, Kleyson, that I was wrong and rickey c weisner's script is the
way to go.

Regards
przemol

--
http://przemol.blogspot.com/
Kleyson Rios  2008-Apr-22 17:36 UTC  [dtrace-discuss] RES: RES: Process in LCK / SLP (Please)
OK jonathan,

I understand.

So, looking at the right place now, I can see few locks and sometimes no
locks (just Mutex hold). But I still have many threads at 100% LCK.

If I don't have a lot of locks, where is my problem ?

Running rickey c weisner's script I get:

(...)
  25736
              libc.so.1`_so_send+0x15
              libjvm.so`__1cDhpiEsend6Fipcii_i_+0x67
              libjvm.so`JVM_Send+0x32
              libnet.so`Java_java_net_SocketOutputStream_socketWrite0+0x131
              0xc3c098d3
   10

  25736
              0xc3d2a33a
   14

  25736
              libc.so.1`_write+0x15
              libjvm.so`__1cDhpiFwrite6FipkvI_I_+0x5d
              libjvm.so`JVM_Write+0x30
              libjava.so`0xc8f7c04b
   16

  25736
              libc.so.1`stat64+0x15
   21

  25736
              libc.so.1`_write+0x15
              libjvm.so`__1cDhpiFwrite6FipkvI_I_+0x5d
              libjvm.so`JVM_Write+0x30
              libjava.so`0xc8f80ce9
   76

  java    25736  kernel-level lock      1
  java    25736  shuttle                6
  java    25736  preempted              7
  java    25736  user-level lock      511
  java    25736  condition variable   748

Regards,

------------------------------------------------------------------

Kleyson Rios.
Gerência de Suporte Técnico
Analista de Suporte / Líder de Equipe

-----Original Message-----
From: Jonathan Adams [mailto:jonathan.adams at sun.com]
Sent: Friday, April 18, 2008 3:40 PM
To: Kleyson Rios
Cc: dtrace-discuss at opensolaris.org
Subject: Re: [dtrace-discuss] RES: Process in LCK / SLP (Please)

(...)
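One way to see what those condition-variable sleepers are actually waiting in
is to aggregate off-CPU time by user stack. This is a generic sketch along the
same lines as the earlier script (the PID is reused from the output above),
not something posted in the original thread:

    # dtrace -n '
        sched:::off-cpu /pid == $target/ { self->ts = timestamp; }
        sched:::on-cpu /self->ts/ {
                @[ustack(10)] = sum(timestamp - self->ts);
                self->ts = 0;
        }
        tick-30sec { trunc(@, 20); exit(0); }' -p 25736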
You may want to cross-post to a Java alias, but I've been down this
road before.

Java will call into malloc() for buffers for network reads and writes
that are larger than 2k bytes (the 2k is from memory, and I think it
was a 1.5 JVM). A large number of malloc calls, and resulting contention
on locks in the library, are due to the application doing network writes
of larger than 2k.

Newer JVMs (1.6) may improve this, but I'm not sure. There's also
an alternative set of classes and methods, NIO, which also can
help (although I've heard tell that NIO brings other problems along
with it, but I can not speak from personal experience).

At this point, I think you need to consult with Java experts to determine
what options you have for buffer allocation for network IO from the
Java heap, versus the current behavior of the JVM dropping back
to malloc for allocating buffers for network IO.

The other option of course is determining if the code can be changed to
use buffers smaller than 2k.

Thanks,
/jim

Kleyson Rios wrote:
> OK jonathan,
>
> I understand.
>
> So, looking at the right place now, I can see few locks and sometimes no
> locks (just Mutex hold). But I still have many threads at 100% LCK.
>
> If I don't have a lot of locks, where is my problem ?
>
> Running rickey c weisner's script I get:
>
> (...)
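A quick way to test the 2k-buffer theory is to look at the size distribution
of the application's socket writes and of the JVM's malloc requests; a sketch,
assuming the Solaris 10 syscall provider probe names and reusing the PID from
the thread:

    # dtrace -n 'syscall::write:entry,syscall::send:entry /pid == $target/
        { @[probefunc] = quantize(arg2); }' -p 25736
    # dtrace -n 'pid$target::malloc:entry { @["malloc bytes"] = quantize(arg0); }' -p 25736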
Hi Jim,

But if there are problems in malloc for the network buffers, shouldn't I see
the locks when running plockstat?

Regards,

------------------------------------------------------------------

Kleyson Rios.

-----Original Message-----
From: James.Mauro at Sun.COM [mailto:James.Mauro at Sun.COM]
Sent: Wednesday, April 23, 2008 1:11 AM
To: Kleyson Rios
Cc: dtrace-discuss at opensolaris.org
Subject: Re: [dtrace-discuss] RES: RES: Process in LCK / SLP (Please)

(...)
D'oh! My bad. My brain was thinking malloc and lock contention, I saw the
network calls on the stack, connected the dots and started typing. I should
have looked at the top of the stack. Sorry.

Let's back up a minute. You have a Java application and the JVM has a large
number of threads in LCK time (user lock), right? plockstat showed a lot of
lock activity in libc malloc, at which point you tried mtmalloc and libumem,
neither of which made much of a difference, right?

We know empirically (and heard from Jarod) that LCK can be deceiving, because
threads that call cond_wait(3C) to wait on a condition variable will be
charged with LCK time, but it's really sleep time most of the time. So it is
sometimes an artifact of the application design that threads show up at 100%
LCK time - threads are waiting for a cond_signal() (or cond_broadcast()) to
be put to work, but there's no work to do.

Anyway...

The problem I thought we were chasing was lock activity in malloc.
If you have more recent data (plockstat, prstat) please pass it to me
(sorry I missed it), and I'll get back in sync on the current problem.

I would add that there's nothing here that I think belongs on dtrace_discuss.
I would start posting to the performance group and a Java performance alias.

Also, and I'm sorry if I missed this, how well (or poorly) is the Java
application performing, and what metric do we have to determine application
performance?

Ricky's script grabs a user stack when a thread goes off CPU to sleep, and
tallies what the threads are sleeping on. It's mostly condition variables,
which doesn't really help us much (everything sleeps on CVs...mostly).
There's a lot of user locks in there as well.

So...

The CVs look like threads blocking on writes, including writes to a network
socket. You need to figure out the write target, and go from there.
Get the PID of the target JVM -

dtrace -n 'syscall::write:entry / pid == $target / { @[fds[arg0].fi_pathname] = count(); }' -p [PID_OF_JVM]

The above will list the files being written to. The next step depends on what
we see.

The same goes for the user locks. If they are indeed malloc locks (which I
think they are), I suspect that will come back around to network IOs larger
than 2k. Try this:

dtrace -n 'pid$target:libc:malloc:entry { @j[jstack()] = count(); }' -p [PID_OF_JVM]

Thanks,
/jim

Kleyson Rios wrote:
> Hi Jim,
>
> But if there are problems in malloc for the network buffers, shouldn't I
> see the locks when running plockstat?
>
> (...)
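One note on the last one-liner: if libumem is LD_PRELOADed (as tried earlier
in the thread), malloc calls bind to libumem.so.1 rather than libc, so a probe
pinned to the libc module can come back empty. A module-agnostic sketch, with
a larger string space for the Java stack frames (PID placeholder as above):

    # dtrace -x jstackstrsize=1024 -n 'pid$target::malloc:entry { @j[jstack(20)] = count(); }' -p [PID_OF_JVM]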
OK. Let's go.

Right now I have about 100 simultaneous users. The box has 4 CPUs and is
running JBoss. My application is very slow. Looking at prstat I saw that I
have many threads, and most of the time only one or two threads are running
at the same time. Using the -Lm prstat options I saw that most of the time
the threads are in the LCK/SLP states.

Using plockstat I saw many locks for malloc, so I changed to libumem. I no
longer have contention locks (spin, block); I just have many Mutex hold
locks. But the problem still exists. I have 7 boxes running JBoss, and all
of them have the same problem.

Question: is there some difference in the output when running prstat for 1
second versus 5 seconds? Because running prstat for 1 second I get threads
at 100% LCK.

*******************************************************************
Prstat (now); I got a little better LCK when I changed to libumem:

[root at aga253distp156:/] # prstat -Lmp 25736
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 25736 root      57 0.0 0.0 0.0 0.0  43 0.0 0.1  36  83  76   0 java/2
 25736 root     0.6 0.5 0.0 0.0 0.0  16  78 0.0  32   0  1K   0 java/7211
 25736 root     0.4 0.6 0.0 0.0 0.0  29  70 0.0  19   5  1K   0 java/7237
 25736 root     0.4 0.6 0.0 0.0 0.0  27  61 0.2  25  15  1K   0 java/7364
 25736 root     0.4 0.5 0.0 0.0 0.0  20  80 0.0  19   8  1K   0 java/7218
 25736 root     0.3 0.4 0.0 0.0 0.0  29  70 0.0  18   5  1K   0 java/7204
 25736 root     0.3 0.4 0.0 0.0 0.0  16  73 0.0  23   6 864   0 java/7374
 25736 root     0.2 0.4 0.0 0.0 0.0  20  79 0.0   9  11  1K   0 java/7305
 25736 root     0.3 0.4 0.0 0.0 0.0  23  76 0.0   7   9 900   0 java/7475
 25736 root     0.4 0.3 0.0 0.0 0.0  11  88 0.0  15   1 553   0 java/7236
 25736 root     0.3 0.3 0.0 0.0 0.0 0.8  93 0.0  17   2 734   0 java/7370
 25736 root     0.1 0.3 0.0 0.0 0.0 8.7  91 0.0   7   2 773   0 java/7235
 25736 root     0.2 0.2 0.0 0.0 0.0 9.5  90 0.0   8   0 316   0 java/7219
 25736 root     0.2 0.1 0.0 0.0 0.0 9.8  84 0.0  17  11 306   0 java/7269
 25736 root     0.1 0.1 0.0 0.0 0.0 0.1 100 0.0   2   1 419   0 java/7471
 25736 root     0.2 0.0 0.0 0.0 0.0 8.0  92 0.0  14   3  68   0 java/7264
 25736 root     0.1 0.1 0.0 0.0 0.0  10  89 0.0   6   0 327   0 java/7470
 25736 root     0.1 0.1 0.0 0.0 0.0 0.0 100 0.0  10   0 302   0 java/7365
 25736 root     0.2 0.0 0.0 0.0 0.0 6.4  88 0.1  14   5  49   0 java/7317
 25736 root     0.1 0.1 0.0 0.0 0.0  93 7.0 0.1   4   4 209   0 java/7367
 25736 root     0.1 0.0 0.0 0.0 0.0 4.0  91 0.0  10   1  55   0 java/7229
 25736 root     0.1 0.0 0.0 0.0 0.0 3.9  96 0.0  11   1  57   0 java/7304
 25736 root     0.1 0.1 0.0 0.0 0.0 8.6  91 0.0   4   1 196   0 java/14
 25736 root     0.0 0.1 0.0 0.0 0.0 0.0 100 0.0   1   1 263   0 java/7297
 25736 root     0.1 0.1 0.0 0.0 0.0 0.0 100 0.1 154   0  86   0 java/10
 25736 root     0.1 0.0 0.0 0.0 0.0 0.6  99 0.0   9   0  55   0 java/7377
 25736 root     0.1 0.0 0.0 0.0 0.0 100 0.0 0.0  12   4  54   0 java/4
 25736 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0  16   0  40   0 java/3
 25736 root     0.0 0.0 0.0 0.0 0.0 3.6  96 0.0   2   0   4   0 java/7481
 25736 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   8   0  12   0 java/170
 25736 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0  12   0  12   0 java/231
 25736 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0  19   4  25   0 java/171
 25736 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   1   0   2   0 java/7221
 25736 root     0.0 0.0 0.0 0.0 0.0  95 0.0 0.1  20   0  20   0 java/167
 25736 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   1   0   2   0 java/7378
Total: 1 processes, 256 lwps, load averages: 0.74, 0.68, 0.62

******************************************************************
Plockstat (now):

[root at aga253distp213:/dtrace] # plockstat -A -e 30 -p 25736
0

Mutex hold

Count     nsec Lock                          Caller
-------------------------------------------------------------------------------
 3546     4765 0x8079580                     libumem.so.1`umem_cache_alloc+0xc1
 3509     3865 0x8079580                     libumem.so.1`umem_cache_free+0x194
  328    12542 0x8073030                     libumem.so.1`vmem_xfree+0xfe
    3  1340378 libumem.so.1`umem_cache_lock  libumem.so.1`umem_cache_applyall+0x51
  261    13414 0x8073030                     libumem.so.1`vmem_alloc+0x13c
  471     5696 0x807b680                     libumem.so.1`umem_cache_alloc+0xc1
  475     5441 0x807b680                     libumem.so.1`umem_cache_free+0x194
  123    20767 0x807b6c0                     libumem.so.1`umem_cache_free+0x194
   28    90103 0x807a1c0                     libumem.so.1`umem_cache_free+0x194
  329     6788 libumem.so.1`vmem0+0x30       libumem.so.1`vmem_alloc+0x126
  309     6900 0x807ad80                     libumem.so.1`umem_cache_alloc+0xc1
  329     6310 libumem.so.1`vmem0+0x4a4      libumem.so.1`vmem_xalloc+0x43b
  328     6248 libumem.so.1`vmem0+0x30       libumem.so.1`vmem_xfree+0x127
  355     5727 0x807b300                     libumem.so.1`umem_cache_alloc+0xc1
  216     9080 0x807b940                     libumem.so.1`umem_cache_alloc+0xc1
  355     5338 0x807b300                     libumem.so.1`umem_cache_free+0x194
  328     5393 libumem.so.1`vmem0+0x4a4      libumem.so.1`vmem_xfree+0xfe
  310     5308 0x807ad80                     libumem.so.1`umem_cache_free+0x194
  329     4921 0x8073030                     libumem.so.1`vmem_xalloc+0x43b
  154    10299 0x807b9c0                     libumem.so.1`umem_cache_alloc+0xc1
  216     6043 0x807b940                     libumem.so.1`umem_cache_free+0x194
  329     3960 0x8073030                     libumem.so.1`vmem_xalloc+0x315
  161     7968 0x807b980                     libumem.so.1`umem_cache_alloc+0xc1
  235     5438 0x807b500                     libumem.so.1`umem_cache_alloc+0xc1
  329     3784 libumem.so.1`vmem0+0x4a4      libumem.so.1`vmem_alloc+0x13c
  241     5029 0x807b500                     libumem.so.1`umem_cache_free+0x194
   85    13906 0x8082c80                     libumem.so.1`umem_cache_alloc+0xc1
  329     3577 libumem.so.1`vmem0+0x4a4      libumem.so.1`vmem_xalloc+0x315
   39    29926 0x8079980                     libumem.so.1`umem_cache_free+0x194
  130     8876 0x807ba00                     libumem.so.1`umem_cache_alloc+0xc1
   34    33388 0xc8e149c0                    libc.so.1`thr_suspend+0x1a
  155     7304 0x807ba40                     libumem.so.1`umem_cache_alloc+0xc1
   13    86856 0x807ac80                     libumem.so.1`umem_cache_alloc+0xc1
   94    11979 0x80828c0                     libumem.so.1`umem_cache_alloc+0xc1
  143     7742 0x807b5c0                     libumem.so.1`umem_cache_alloc+0xc1
  153     5759 0x807b5c0                     libumem.so.1`umem_cache_free+0x194
   98     8816 0x807b540                     libumem.so.1`umem_cache_alloc+0xc1
  115     7499 0x807b8c0                     libumem.so.1`umem_cache_alloc+0xc1
   26    32518 0xc8e12100                    libc.so.1`thr_suspend+0x1a
  158     5210 0x807b640                     libumem.so.1`umem_cache_free+0x194
   95     8504 0x80828c0                     libumem.so.1`umem_cache_free+0x194
  155     5199 0x807ba40                     libumem.so.1`umem_cache_free+0x194
    8    97711 0xc8e145c0                    libc.so.1`thr_suspend+0x1a
  130     5915 0x807ba00                     libumem.so.1`umem_cache_free+0x194
   41    18736 0x807b2c0                     libumem.so.1`umem_cache_alloc+0xc1
  101     7524 0x8082940                     libumem.so.1`umem_cache_alloc+0xc1
   33    22842 0x807d640                     libumem.so.1`umem_cache_free+0x194
   35    21064 0x807acc0                     libumem.so.1`umem_cache_alloc+0xc1
   16    45814 0xc8e12900                    libc.so.1`thr_suspend+0x1a
  129     5641 0x807b880                     libumem.so.1`umem_cache_free+0x194
    5   142887 0xc8e13100                    libc.so.1`thr_suspend+0x1a
   43    16584 0x807b240                     libumem.so.1`umem_cache_alloc+0xc1
   21    33838 0xc8e12b80                    libc.so.1`thr_suspend+0x1a
  101     7012 0x8082940                     libumem.so.1`umem_cache_free+0x194
    8    42677 0xc8e13f00                    libc.so.1`thr_suspend+0x1a
    7    48677 0xc8e12940                    libc.so.1`thr_suspend+0x1a
    9    37763 0xc8e13080                    libc.so.1`thr_suspend+0x1a
   30    11317 0x8079880                     libumem.so.1`umem_cache_alloc+0xc1
   61     5553 0x807fcc0                     libumem.so.1`umem_cache_alloc+0xc1
    7    38522 0xc8e14240                    libc.so.1`thr_suspend+0x1a
   47     5664 0x8079cc0
   26     9348 0x807a280                     libumem.so.1`umem_cache_alloc+0xc1
    6    39964 0xc8e14480                    libc.so.1`thr_suspend+0x1a
   26     9198 0x807bc40                     libumem.so.1`umem_cache_alloc+0xc1
    5    47600 0xc8e15440                    libc.so.1`thr_suspend+0x1a
   30     7853 0x807a240                     libumem.so.1`umem_cache_alloc+0xc1
   14    16771 0x80824b4                     libumem.so.1`umem_depot_alloc+0xa4
    3     3747 0x807f4b4                     libumem.so.1`umem_cache_update+0xe5
    3     3743 0x807d134                     libumem.so.1`umem_cache_update+0xe5
(...)

Mutex block

Count     nsec Lock                          Caller
-------------------------------------------------------------------------------
    1  1624523 0x8073030                     libumem.so.1`vmem_xfree+0x24
    1    23803 0x80825c0                     libumem.so.1`umem_cache_free+0x51
    1    18619 0x8073030                     libumem.so.1`vmem_xfree+0x24
    1    15304 0x807b5c0                     libumem.so.1`umem_cache_free+0x51

******************************************************************

# time dtrace -n 'syscall::write:entry / pid == $target / { @[fds[arg0].fi_pathname] = count(); }' -p 25736
dtrace: description 'syscall::write:entry ' matched 1 probe
^C

  /zonas/srvintra/root/var/tmp/imageio158908.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158909.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158910.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158911.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158912.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158913.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158914.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158915.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158916.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158917.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158918.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158919.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158920.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158921.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158922.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158923.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158924.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158925.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158926.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158927.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158928.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158929.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158930.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158931.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158932.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158933.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158934.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158935.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158936.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158937.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158938.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158939.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158940.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158941.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158942.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158943.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158944.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158945.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158946.tmp       16
  /zonas/srvintra/root/var/tmp/imageio158947.tmp       16
  /zonas/srvintra/root/opt/jboss-3.2.3_jetty-4.2.17/server/default/log/2008_04_23.controller.csv      247
  /zonas/srvintra/root/opt/jboss-3.2.3_jetty-4.2.17/server/default/deploy/detran.war/WEB-INF/control-detran.log      784
  /zonas/srvintra/root/opt/jboss-3.2.3_jetty-4.2.17/server/default/log/2008_04_23.request.log    89872

real    0m46.632s
user    0m0.535s
sys     0m0.412s

*************************************************************

# time dtrace -n 'pid$target:libc:malloc:entry { @j[jstack()] = count(); }' -p 5736
dtrace: description 'pid$target:libc:malloc:entry ' matched 1 probe
^C

real    0m52.971s
user    0m0.581s
sys     0m0.452s

Regards,

------------------------------------------------------------------

Kleyson Rios.
Gerência de Suporte Técnico
Analista de Suporte / Líder de Equipe

Governo do Estado de Goiás
Agência de Administração
Diretoria de Informática
+55 62 3201-6582

"Do you know the difference between successful people and unsuccessful
people? Successful people are willing to do the things that unsuccessful
people are not."
Pessoas bem sucedidas est?o dispostas a fazer as coisas que as pessoas mal sucedidas n?o est?o." -----Mensagem original----- De: James.Mauro at Sun.COM [mailto:James.Mauro at Sun.COM] Enviada em: quarta-feira, 23 de abril de 2008 13:08 Para: Kleyson Rios Cc: dtrace-discuss at opensolaris.org Assunto: Re: RES: Process in LCK / SLP (Please) D''oh! My bad. My brain was thinking malloc and lock contention, I saw the network calls on the stack, connected the dots and starting typing. I should have looked at the top of the stack. Sorry. Let''s back up a minute. You have a Java application and the JVM has a large number of threads in LCK time (user lock), right? plockstat showed a lot of lock activity in libc malloc, at which point you tried mtmalloc and libumem, neither of which made much of a difference, right? We know empirically (and heard from Jarod) that LCK can be deceiving, because threads that call cond_wait(3C) to wait on a condition variable will be charged with LCK time, but it''s really sleep time most of the time. So it is sometimes an artifact of the application design that threads show up at 100% LCK time - threads are waiting for a cond_signal() (or cond_broadcast()) to be put to work, but there''s no work to do. Anyway... The problem I thought we were chasing was lock activity in malloc. If you have more recent data (plockstat, prstat) please pass it to me (sorry I missed it), and I''ll get back in sync on the current problem. I would add that there''s nothing here that I think belongs on dtrace_discuss. I would start posting to the performance group and a Java performance alias. Also, and I''m sorry if I missed this, how well (or poorly) is the Java application performing, and what metric do we have to determine application performance. Ricky''s script grabs a user stack when a thread goes off CPU to sleep, and tally''s what the threads are sleeping on. It''s mostly condition variables, which doesn''t really help us much (everything sleeps on CVs...mostly). There''s a lof of user locks in there as well. So... The CVs look like threads blocking on writes, including writes to a network socket. You need to figure out the write target, and go from there. Get the PID of the target JVM - dtrace -n ''syscall::write:entry / pid == $target / @[fds[arg0].fi_pathname] = count(); }'' -p [PID_OF_JVM] The above will list the files being written to. The next step depends on what we see. The same goes for the user locks. If they are indeed malloc locks (which I think they are), I suspect that will come back around to network IOs larger than 2k. Try this: dtrace -n ''pid$target:libc:malloc:entry { @j[jstack()] = count(); }'' -p [PID_OF_JVM] Thanks, /jim Kleyson Rios wrote:> Hi Jim, > > But, if there are problems in malloc for buffers for network, I should see > the locks when running plockstat, don''t see ? > > > Regards, > > ------------------------------------------------------------------ > > Kleyson Rios. > > -----Mensagem original----- > De: James.Mauro at Sun.COM [mailto:James.Mauro at Sun.COM] > Enviada em: quarta-feira, 23 de abril de 2008 01:11 > Para: Kleyson Rios > Cc: dtrace-discuss at opensolaris.org > Assunto: Re: [dtrace-discuss] RES: RES: Process in LCK / SLP (Please) > > You may want to cross-post to a Java alias, but I''ve been down this > road before. > > Java will call into malloc() for buffers for network reads and writes > that are larger than 2k bytes (the 2k is from memory, and I think it > was a 1.5 JVM). 
A large number of malloc calls, > and resulting contention on locks in the library, are due to theapplication> doing network writes of larger than 2k. > > Newer JVMs (1.6) may improve this, but I''m not sure. There''s also > an alternative set of classes and methods, NIO, which also can > help (although I''ve heard tell that NIO brings other problems along > with it, but I can not speak from personal experience). > > At this point, I think you need to consult with Java experts to determine > what options you have for buffer allocation for network IO from the > Java heap, versus the current behavior of the JVM dropping back > to malloc for allocating buffers for network IO. > > The other option of course is determining if the code can be changed to > use buffers smaller than 2k. > > Thanks, > /jim > > > > > Kleyson Rios wrote: > >> OK jonathan, >> >> I understand. >> >> So, looking on right place now, i can see few locks and sometimes nolocks>> (just Mutex Hold). But I still have many threads in 100% LCK. >> >> If I don''t have a lot of locks, where is my problem ? >> >> Running rickey c weisner''s script a get: >> >> (...) >> 25736 >> libc.so.1`_so_send+0x15 >> libjvm.so`__1cDhpiEsend6Fipcii_i_+0x67 >> libjvm.so`JVM_Send+0x32 >> >> > libnet.so`Java_java_net_SocketOutputStream_socketWrite0+0x131 > >> 0xc3c098d3 >> 10 >> 25736 >> 0xc3d2a33a >> 14 >> 25736 >> libc.so.1`_write+0x15 >> libjvm.so`__1cDhpiFwrite6FipkvI_I_+0x5d >> libjvm.so`JVM_Write+0x30 >> libjava.so`0xc8f7c04b >> 16 >> 25736 >> libc.so.1`stat64+0x15 >> 21 >> 25736 >> libc.so.1`_write+0x15 >> libjvm.so`__1cDhpiFwrite6FipkvI_I_+0x5d >> libjvm.so`JVM_Write+0x30 >> libjava.so`0xc8f80ce9 >> 76 >> java 25736 kernel-level lock 1 >> java 25736 shuttle 6 >> java 25736 preempted 7 >> java 25736 user-level lock 511 >> java 25736 condition variable 748 >> >> >> Atenciosamente, >> >> ------------------------------------------------------------------ >> >> Kleyson Rios. >> Ger?ncia de Suporte T?cnico >> Analista de Suporte / L?der de Equipe >> >> >> -----Mensagem original----- >> De: Jonathan Adams [mailto:jonathan.adams at sun.com] >> Enviada em: sexta-feira, 18 de abril de 2008 15:40 >> Para: Kleyson Rios >> Cc: dtrace-discuss at opensolaris.org >> Assunto: Re: [dtrace-discuss] RES: Process in LCK / SLP (Please) >> >> >> On Apr 18, 2008, at 1:03 PM, Kleyson Rios wrote: >> >> >>> Hi przemol, >>> >>> Bellow output of plockstat for malloc and libumem. Both many locks. >>> Why changing to libumem I didn''t get less locks ? >>> >>> >>> >> You''re looking at Mutex hold statistics, which don''t mean a lot >> (unless contention is caused by long hold times) >> >> The important thing for multi-threaded performance is *contention*. >> (Spinning and blocking) Those are the statistics you should be >> looking at. >> >> Both malloc and libumem use locks to protect their state; libumem >> just uses many locks, in order to reduce contention. >> >> Cheers, >> - jonathan >> >> >> >> >> _______________________________________________ >> dtrace-discuss mailing list >> dtrace-discuss at opensolaris.org >> >> > > >
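A quick way to check the "writes larger than 2k" theory from the quoted
exchange above is to look at the size distribution of the JVM's writes,
broken down by file system type. A rough, untested one-liner sketch
([PID_OF_JVM] is a placeholder, as elsewhere in this thread):

dtrace -n 'syscall::write:entry /pid == $target/ { @[fds[arg0].fi_fs] = quantize(arg2); }' -p [PID_OF_JVM]

Socket writes show up under sockfs; a cluster of sockfs writes above 2048
bytes would support the theory. Note that sends issued through send(2) go
through a different system call, so a syscall::send:entry clause may need
to be added as well (exact probe names can vary by Solaris release).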
> OK. Let's go.
> Now I have about 100 simultaneous users. That box has 4 CPUs and it's
> running JBoss. My application is very slow. Looking at prstat I saw that
> I have many threads, and most of the time only one or two threads are
> running at the same time. Using prstat -Lm I saw that most of the time
> the threads are in the LCK/SLP state.

Yep. Performance is slow because this threaded application isn't really
getting much in the way of concurrency. In fact, there are 256 threads in
this app, with only 1 getting work done, so my first question is why 256
threads? Did you try tweaking this down to, say, 8 threads? Or 4 threads?

> Using plockstat I saw many locks for malloc, so I changed to libumem. I
> no longer have contended locks (spin, block), just many mutex hold events.

The hold times are not atrociously bad, but some are fairly long. More to
the point, this indicates all the LCK time is in malloc-related locks. So
you may have improved the situation here with the move to libumem, but it
doesn't necessarily mean the problem has been solved. The data tells us
that some Java threads are spending 20% or more of their time waiting on
locks, and those locks all seem to be in malloc. So we need to figure out
why the JVM is calling malloc. That's the second dtrace line I sent, and
I'm not sure why it came back empty. I would run it again. We need to know
where the malloc calls are coming from.

> But the problem still exists. I have 7 boxes running JBoss, and all of
> them have the same problem.

That's not surprising. It looks like an application issue to me.

> Question: is there any difference between running prstat with a 1-second
> interval and a 5-second interval? Because with a 1-second interval I get
> threads at 100% LCK.

Only the difference one would expect by averaging over a longer period.
The actual microstates are event-based, and prstat computes the percentage
of time spent in each microstate based on the interval, so changing the
interval from 1 second to 5 seconds means a longer window from which to
compute averages. I would not expect things to change appreciably in terms
of what's reported (and, in my experience, it generally does not).

Your prstat data shows significantly more sleep time than lock time,
although between the two you account for 98% of time for almost all the
threads. So....

1. The locks. You need to determine where the mallocs are coming from.
Only an application change can improve this.

2. Sleep. What are the threads sleeping on? Looks like writes to files and
network sockets account for a fair bit (based on earlier stack traces).
You could also grab some random pstack(1) samples (pstack PID/LWPID) for
long-sleeping threads. If you want to reduce sleep time on writes, you
need to make writes faster, or change the code to spin off a "writer"
thread to do the writes asynchronously to the other work the application
does (assuming the application design and specs permit such a thing) -
otherwise, you need to optimize writes.

Looks like the file writes are all to temp files in /var/tmp. Check your
iostat data and make sure you're getting good write performance. Are the
files opened O_SYNC/O_DSYNC? That will really slow writes down, and should
only be done if it's an absolute requirement for the application (run
pfiles on the JVM to determine if the SYNC flags are set on those files;
if they are, ask the programmers if sync writes are necessary).

Again, this isn't a dtrace issue, it's a Java application performance
issue.
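That said, if it helps to put numbers on the write sleeps before involving
the developers, a rough sketch along these lines (untested, written
against the syscall provider; save as write-lat.d and run with
"dtrace -s write-lat.d -p [PID_OF_JVM]") times each write() by destination
file:

#!/usr/sbin/dtrace -s

syscall::write:entry
/pid == $target/
{
        /* remember when this thread entered write() and what it is writing to */
        self->ts = timestamp;
        self->path = fds[arg0].fi_pathname;
}

syscall::write:return
/self->ts/
{
        /* nanosecond latency distribution per file */
        @lat[self->path] = quantize(timestamp - self->ts);
        self->ts = 0;
        self->path = 0;
}

For the O_SYNC/O_DSYNC question, something like
"pfiles [PID_OF_JVM] | grep SYNC" should be enough, since pfiles prints
the open flags for each file descriptor.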
You''ll probably get more help by posting to a performance group (perf-discuss at opensolaris.org) and a Java interest group (sorry, I''m not a Java expert). You should also consider trying a newer JVM - the 1.6 JVM includes the hotspot dtrace provider, so we can learn more about what the code is doing with dtrace. Alternatively, download and link in agent libs for 1.4/1.5 Java to get some dtrace probes: https://solaris10-dtrace-vm-agents.dev.java.net/ And again, I don''t know why 256 threads are being started. I would seriously consider taking the thread count way down for some testing. Too many threads puts more pressure on already contended resources, and the threads are not getting any work done anyway... HTH, /jim> ******************************************************************* > Prstat (now), I got a little better LCK when I changed for libumem: > > [root at aga253distp156:/] # prstat -Lmp 25736 > PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG > PROCESS/LWPID > 25736 root 57 0.0 0.0 0.0 0.0 43 0.0 0.1 36 83 76 0 java/2 > 25736 root 0.6 0.5 0.0 0.0 0.0 16 78 0.0 32 0 1K 0 java/7211 > 25736 root 0.4 0.6 0.0 0.0 0.0 29 70 0.0 19 5 1K 0 java/7237 > 25736 root 0.4 0.6 0.0 0.0 0.0 27 61 0.2 25 15 1K 0 java/7364 > 25736 root 0.4 0.5 0.0 0.0 0.0 20 80 0.0 19 8 1K 0 java/7218 > 25736 root 0.3 0.4 0.0 0.0 0.0 29 70 0.0 18 5 1K 0 java/7204 > 25736 root 0.3 0.4 0.0 0.0 0.0 16 73 0.0 23 6 864 0 java/7374 > 25736 root 0.2 0.4 0.0 0.0 0.0 20 79 0.0 9 11 1K 0 java/7305 > 25736 root 0.3 0.4 0.0 0.0 0.0 23 76 0.0 7 9 900 0 java/7475 > 25736 root 0.4 0.3 0.0 0.0 0.0 11 88 0.0 15 1 553 0 java/7236 > 25736 root 0.3 0.3 0.0 0.0 0.0 0.8 93 0.0 17 2 734 0 java/7370 > 25736 root 0.1 0.3 0.0 0.0 0.0 8.7 91 0.0 7 2 773 0 java/7235 > 25736 root 0.2 0.2 0.0 0.0 0.0 9.5 90 0.0 8 0 316 0 java/7219 > 25736 root 0.2 0.1 0.0 0.0 0.0 9.8 84 0.0 17 11 306 0 java/7269 > 25736 root 0.1 0.1 0.0 0.0 0.0 0.1 100 0.0 2 1 419 0 java/7471 > 25736 root 0.2 0.0 0.0 0.0 0.0 8.0 92 0.0 14 3 68 0 java/7264 > 25736 root 0.1 0.1 0.0 0.0 0.0 10 89 0.0 6 0 327 0 java/7470 > 25736 root 0.1 0.1 0.0 0.0 0.0 0.0 100 0.0 10 0 302 0 java/7365 > 25736 root 0.2 0.0 0.0 0.0 0.0 6.4 88 0.1 14 5 49 0 java/7317 > 25736 root 0.1 0.1 0.0 0.0 0.0 93 7.0 0.1 4 4 209 0 java/7367 > 25736 root 0.1 0.0 0.0 0.0 0.0 4.0 91 0.0 10 1 55 0 java/7229 > 25736 root 0.1 0.0 0.0 0.0 0.0 3.9 96 0.0 11 1 57 0 java/7304 > 25736 root 0.1 0.1 0.0 0.0 0.0 8.6 91 0.0 4 1 196 0 java/14 > 25736 root 0.0 0.1 0.0 0.0 0.0 0.0 100 0.0 1 1 263 0 java/7297 > 25736 root 0.1 0.1 0.0 0.0 0.0 0.0 100 0.1 154 0 86 0 java/10 > 25736 root 0.1 0.0 0.0 0.0 0.0 0.6 99 0.0 9 0 55 0 java/7377 > 25736 root 0.1 0.0 0.0 0.0 0.0 100 0.0 0.0 12 4 54 0 java/4 > 25736 root 0.0 0.0 0.0 0.0 0.0 100 0.0 0.0 16 0 40 0 java/3 > 25736 root 0.0 0.0 0.0 0.0 0.0 3.6 96 0.0 2 0 4 0 java/7481 > 25736 root 0.0 0.0 0.0 0.0 0.0 100 0.0 0.0 8 0 12 0 java/170 > 25736 root 0.0 0.0 0.0 0.0 0.0 100 0.0 0.0 12 0 12 0 java/231 > 25736 root 0.0 0.0 0.0 0.0 0.0 100 0.0 0.0 19 4 25 0 java/171 > 25736 root 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 1 0 2 0 java/7221 > 25736 root 0.0 0.0 0.0 0.0 0.0 95 0.0 0.1 20 0 20 0 java/167 > 25736 root 0.0 0.0 0.0 0.0 0.0 0.0 100 0.0 1 0 2 0 java/7378 > Total: 1 processes, 256 lwps, load averages: 0.74, 0.68, 0.62 > > ****************************************************************** > > > Plockstat (now): > > [root at aga253distp213:/dtrace] # plockstat -A -e 30 -p 25736 > 0 > Mutex hold > > Count nsec Lock Caller > 
---------------------------------------------------------------------------- > --- > 3546 4765 0x8079580 libumem.so.1`umem_cache_alloc+0xc1 > 3509 3865 0x8079580 libumem.so.1`umem_cache_free+0x194 > 328 12542 0x8073030 libumem.so.1`vmem_xfree+0xfe > 3 1340378 libumem.so.1`umem_cache_lock > libumem.so.1`umem_cache_applyall+0x51 > 261 13414 0x8073030 libumem.so.1`vmem_alloc+0x13c > 471 5696 0x807b680 > libumem.so.1`umem_cache_alloc+0xc1 > 475 5441 0x807b680 > libumem.so.1`umem_cache_free+0x194 > 123 20767 0x807b6c0 > libumem.so.1`umem_cache_free+0x194 > 28 90103 0x807a1c0 > libumem.so.1`umem_cache_free+0x194 > 329 6788 libumem.so.1`vmem0+0x30 libumem.so.1`vmem_alloc+0x126 > 309 6900 0x807ad80 > libumem.so.1`umem_cache_alloc+0xc1 > 329 6310 libumem.so.1`vmem0+0x4a4 libumem.so.1`vmem_xalloc+0x43b > 328 6248 libumem.so.1`vmem0+0x30 libumem.so.1`vmem_xfree+0x127 > 355 5727 0x807b300 > libumem.so.1`umem_cache_alloc+0xc1 > 216 9080 0x807b940 > libumem.so.1`umem_cache_alloc+0xc1 > 355 5338 0x807b300 > libumem.so.1`umem_cache_free+0x194 > 328 5393 libumem.so.1`vmem0+0x4a4 libumem.so.1`vmem_xfree+0xfe > 310 5308 0x807ad80 > libumem.so.1`umem_cache_free+0x194 > 329 4921 0x8073030 libumem.so.1`vmem_xalloc+0x43b > 154 10299 0x807b9c0 > libumem.so.1`umem_cache_alloc+0xc1 > 216 6043 0x807b940 > libumem.so.1`umem_cache_free+0x194 > 329 3960 0x8073030 libumem.so.1`vmem_xalloc+0x315 > 161 7968 0x807b980 > libumem.so.1`umem_cache_alloc+0xc1 > 235 5438 0x807b500 > libumem.so.1`umem_cache_alloc+0xc1 > 329 3784 libumem.so.1`vmem0+0x4a4 libumem.so.1`vmem_alloc+0x13c > 241 5029 0x807b500 > libumem.so.1`umem_cache_free+0x194 > 85 13906 0x8082c80 > libumem.so.1`umem_cache_alloc+0xc1 > 329 3577 libumem.so.1`vmem0+0x4a4 libumem.so.1`vmem_xalloc+0x315 > 39 29926 0x8079980 > libumem.so.1`umem_cache_free+0x194 > 130 8876 0x807ba00 > libumem.so.1`umem_cache_alloc+0xc1 > 34 33388 0xc8e149c0 libc.so.1`thr_suspend+0x1a > 155 7304 0x807ba40 > libumem.so.1`umem_cache_alloc+0xc1 > 13 86856 0x807ac80 > libumem.so.1`umem_cache_alloc+0xc1 > 94 11979 0x80828c0 > libumem.so.1`umem_cache_alloc+0xc1 > 143 7742 0x807b5c0 > libumem.so.1`umem_cache_alloc+0xc1 > 153 5759 0x807b5c0 > libumem.so.1`umem_cache_free+0x194 > 98 8816 0x807b540 > libumem.so.1`umem_cache_alloc+0xc1 > 115 7499 0x807b8c0 > libumem.so.1`umem_cache_alloc+0xc1 > 26 32518 0xc8e12100 libc.so.1`thr_suspend+0x1a > 158 5210 0x807b640 > libumem.so.1`umem_cache_free+0x194 > 95 8504 0x80828c0 > libumem.so.1`umem_cache_free+0x194 > 155 5199 0x807ba40 > libumem.so.1`umem_cache_free+0x194 > 8 97711 0xc8e145c0 libc.so.1`thr_suspend+0x1a > 130 5915 0x807ba00 > libumem.so.1`umem_cache_free+0x194 > 41 18736 0x807b2c0 > libumem.so.1`umem_cache_alloc+0xc1 > 101 7524 0x8082940 > libumem.so.1`umem_cache_alloc+0xc1 > 33 22842 0x807d640 > libumem.so.1`umem_cache_free+0x194 > 35 21064 0x807acc0 > libumem.so.1`umem_cache_alloc+0xc1 > 16 45814 0xc8e12900 libc.so.1`thr_suspend+0x1a > 129 5641 0x807b880 > libumem.so.1`umem_cache_free+0x194 > 5 142887 0xc8e13100 libc.so.1`thr_suspend+0x1a > 43 16584 0x807b240 > libumem.so.1`umem_cache_alloc+0xc1 > 21 33838 0xc8e12b80 libc.so.1`thr_suspend+0x1a > 101 7012 0x8082940 > libumem.so.1`umem_cache_free+0x194 > 8 42677 0xc8e13f00 libc.so.1`thr_suspend+0x1a > 7 48677 0xc8e12940 libc.so.1`thr_suspend+0x1a > 9 37763 0xc8e13080 libc.so.1`thr_suspend+0x1a > 30 11317 0x8079880 > libumem.so.1`umem_cache_alloc+0xc1 > 61 5553 0x807fcc0 > libumem.so.1`umem_cache_alloc+0xc1 > 7 38522 0xc8e14240 libc.so.1`thr_suspend+0x1a > 47 5664 0x8079cc0 > 26 9348 0x807a280 > 
libumem.so.1`umem_cache_alloc+0xc1 > 6 39964 0xc8e14480 libc.so.1`thr_suspend+0x1a > 26 9198 0x807bc40 > libumem.so.1`umem_cache_alloc+0xc1 > 5 47600 0xc8e15440 libc.so.1`thr_suspend+0x1a > 30 7853 0x807a240 > libumem.so.1`umem_cache_alloc+0xc1 > 14 16771 0x80824b4 > libumem.so.1`umem_depot_alloc+0xa4 > 3 3747 0x807f4b4 > libumem.so.1`umem_cache_update+0xe5 > 3 3743 0x807d134 > libumem.so.1`umem_cache_update+0xe5 > (...) > > Mutex block > > Count nsec Lock Caller > ---------------------------------------------------------------------------- > --- > 1 1624523 0x8073030 libumem.so.1`vmem_xfree+0x24 > 1 23803 0x80825c0 > libumem.so.1`umem_cache_free+0x51 > 1 18619 0x8073030 libumem.so.1`vmem_xfree+0x24 > 1 15304 0x807b5c0 > libumem.so.1`umem_cache_free+0x51 > > > ****************************************************************** > > > # time dtrace -n ''syscall::write:entry / pid == $target / { > @[fds[arg0].fi_pathname] = count(); }'' -p 25736 > > dtrace: description ''syscall::write:entry '' matched 1 probe > ^C > > /zonas/srvintra/root/var/tmp/imageio158908.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158909.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158910.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158911.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158912.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158913.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158914.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158915.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158916.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158917.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158918.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158919.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158920.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158921.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158922.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158923.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158924.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158925.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158926.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158927.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158928.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158929.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158930.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158931.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158932.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158933.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158934.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158935.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158936.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158937.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158938.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158939.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158940.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158941.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158942.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158943.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158944.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158945.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158946.tmp 16 > /zonas/srvintra/root/var/tmp/imageio158947.tmp 16 > > /zonas/srvintra/root/opt/jboss-3.2.3_jetty-4.2.17/server/default/log/2008_04 > _23.controller.csv 247 > > /zonas/srvintra/root/opt/jboss-3.2.3_jetty-4.2.17/server/default/deploy/detr > an.war/WEB-INF/control-detran.log 784 > > /zonas/srvintra/root/opt/jboss-3.2.3_jetty-4.2.17/server/default/log/2008_04 > _23.request.log 89872 > > real 0m46.632s > user 0m0.535s > sys 0m0.412s > > 
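(A possible way to see which Java code paths issue these writes, grouped
by target directory - an untested sketch; jstack() only resolves Java
frame names on a JVM with the DTrace ustack helper, such as Java 6 or the
agent libs mentioned above, but even raw libjvm/libjava frames can help
narrow it down:

dtrace -x jstackstrsize=1024 -n 'syscall::write:entry /pid == $target/ { @[fds[arg0].fi_dirname, jstack(20)] = count(); }' -p [PID_OF_JVM]

)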
> *************************************************************
>
> # time dtrace -n ''pid$target:libc:malloc:entry { @j[jstack()] = count(); }''
> -p 5736
>
> dtrace: description ''pid$target:libc:malloc:entry '' matched 1 probe
> ^C
>
>
> real 0m52.971s
> user 0m0.581s
> sys 0m0.452s
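One note on the malloc trace quoted just above, which came back empty:
it was run with -p 5736, while the prstat and plockstat data in this
thread are for PID 25736, so it may simply have attached to the wrong
process and is worth re-running against the JVM's PID. Also, if the JVM is
now started with libumem preloaded, its malloc calls resolve into
libumem.so.1 rather than libc, so a libc-only probe can legitimately match
nothing. A broader, untested sketch that matches malloc in whatever
library provides it, and records allocation sizes as well as stacks:

dtrace -x jstackstrsize=1024 -n '
pid$target::malloc:entry
{
        /* blank module field: match malloc wherever it is defined (libc, libumem, ...) */
        @stacks[jstack(20)] = count();
        /* arg0 is the requested allocation size at entry */
        @sizes["requested bytes"] = quantize(arg0);
}' -p [PID_OF_JVM]

The size distribution also makes it easy to check the "larger than 2k"
buffer theory from earlier in the thread directly.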