Mika Borner
2008-May-01 08:55 UTC
[dtrace-discuss] Ruby: Lots of setcontext/nanosleep syscalls
Hi We are hosting a ruby on rails application for a customer. We first run the application on a T2000 and single thread performance sucked. Because of tight schedules we moved the app to an X2100, and performance seems ok so far. Anyway, I''m trying to find out where the bottleneck is, and run a dtrace to count the system calls for the mongrel process. It seems there are a lot of setcontext/nanosleep syscalls when issuing an http request. Can anybody explain me shortly, what these system calls do? I tried to google them up, but couldn''t find an explanation that I''mable to understand :-) dtrace -n ''syscall:::entry/execname == "mongrel_rails"/ { @[probefunc] = count() }'' dtrace: description ''syscall:::entry'' matched 233 probes ^C fsat 1 getpid 1 accept 2 access 2 getdents64 2 getpeername 2 ioctl 2 rename 2 unlink 3 connect 4 doorfs 4 fxstat 4 mmap 4 munmap 4 sigaction 4 so_socket 4 getuid 6 open 6 open64 7 read 10 times 16 write 18 fstat64 19 lstat64 19 stat64 20 close 22 llseek 25 fcntl 33 recv 108 send 108 gtime 300 lwp_sigmask 537 pollsys 640 nanosleep 8489 setcontext 58846 -- This message posted from opensolaris.org
Jason A. Hoffman
2008-May-01 09:08 UTC
[dtrace-discuss] Ruby: Lots of setcontext/nanosleep syscalls
Hi Mike, That''s what it does when it stops serving requests. And yes ruby on a T2000 is a bad idea, ruby needs faster chips. Also there''s DTrace-enabled Ruby at https://dev.joyent.com/projects/ruby-dtrace/wiki/Ruby+DTrace if you''d like to go deeper. Regards, Jason On May 1, 2008, at 5:55 AM, Mika Borner wrote:> Hi > > We are hosting a ruby on rails application for a customer. We first > run the application on a T2000 and single thread performance sucked. > > Because of tight schedules we moved the app to an X2100, and > performance seems ok so far. > > Anyway, I''m trying to find out where the bottleneck is, and run a > dtrace to count the system calls for the mongrel process. > > It seems there are a lot of setcontext/nanosleep syscalls when > issuing an http request. Can anybody explain me shortly, what these > system calls do? I tried to google them up, but couldn''t find an > explanation that I''mable to understand :-) > > dtrace -n ''syscall:::entry/execname == "mongrel_rails"/ > { @[probefunc] = count() }'' > dtrace: description ''syscall:::entry'' matched 233 probes > ^C > > fsat 1 > getpid 1 > accept 2 > access 2 > getdents64 2 > getpeername 2 > ioctl 2 > rename 2 > unlink 3 > connect 4 > doorfs 4 > fxstat 4 > mmap 4 > munmap 4 > sigaction 4 > so_socket 4 > getuid 6 > open 6 > open64 7 > read 10 > times 16 > write 18 > fstat64 19 > lstat64 19 > stat64 20 > close 22 > llseek 25 > fcntl 33 > recv 108 > send 108 > gtime 300 > lwp_sigmask 537 > pollsys 640 > nanosleep 8489 > setcontext 58846 > > > -- > This message posted from opensolaris.org > _______________________________________________ > dtrace-discuss mailing list > dtrace-discuss at opensolaris.org
Mike Shapiro
2008-May-01 17:00 UTC
[dtrace-discuss] Ruby: Lots of setcontext/nanosleep syscalls
On Thu, May 01, 2008 at 01:55:40AM -0700, Mika Borner wrote:> Hi > > We are hosting a ruby on rails application for a customer. We first run the application on a T2000 and single thread performance sucked. > > Because of tight schedules we moved the app to an X2100, and performance seems ok so far. > > Anyway, I''m trying to find out where the bottleneck is, and run a dtrace to count the system calls for the mongrel process. > > It seems there are a lot of setcontext/nanosleep syscalls when issuing an http request. Can anybody explain me shortly, what these system calls do? I tried to google them up, but couldn''t find an explanation that I''mable to understand :-) > > dtrace -n ''syscall:::entry/execname == "mongrel_rails"/ { @[probefunc] = count() }'' > dtrace: description ''syscall:::entry'' matched 233 probes > ^C >Change your program to something like: syscall::setcontext:entry /execname == "mongrel_rails"/ { @[ustack()] = count(); } and let''s see what is causing that. The likely causes are either that it''s raining signals, or mongrel is co-routine crazy. Either way setcontext() is quite an expensive thing to do and this will likely lead to something useful to optimize. -Mike -- Mike Shapiro, Sun Microsystems Fishworks. blogs.sun.com/mws/
Mika Borner
2008-May-01 19:34 UTC
[dtrace-discuss] Ruby: Lots of setcontext/nanosleep syscalls
> Change your program to something like: > > syscall::setcontext:entry > /execname == "mongrel_rails"/ > { > @[ustack()] = count(); > }I did, an the topmost entries are below. I''m not a programmer, but I guess it calls a lot of subroutines. Can the setcontext syscall be avoided somehow, by optimization of program code? Thanks anyway Mika libc.so.1`__getcontext+0xb libruby.so.1`rb_yield+0x16 libruby.so.1`rb_ary_each+0x35 libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`vafuncall+0xc2 libruby.so.1`rb_funcall+0x21 libruby.so.1`rb_each+0x25 libruby.so.1`rb_iterate+0x296 libruby.so.1`enum_each_with_index+0x3b libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1a00 libruby.so.1`rb_eval+0x2f28 libruby.so.1`rb_yield_0+0x4e6 libruby.so.1`rb_yield+0x16 libruby.so.1`rb_ary_each+0x35 libruby.so.1`call_cfunc+0x2e6 438 libc.so.1`__getcontext+0xb libruby.so.1`rb_yield_values+0x5d libruby.so.1`each_with_index_i+0x29 libruby.so.1`rb_yield_0+0x545 libruby.so.1`rb_yield+0x16 libruby.so.1`rb_ary_each+0x35 libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`vafuncall+0xc2 libruby.so.1`rb_funcall+0x21 libruby.so.1`rb_each+0x25 libruby.so.1`rb_iterate+0x296 libruby.so.1`enum_each_with_index+0x3b libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1a00 libruby.so.1`rb_eval+0x2f28 libruby.so.1`rb_yield_0+0x4e6 452 libc.so.1`__getcontext+0xb libruby.so.1`rb_yield+0x16 libruby.so.1`rb_ary_each_index+0x2e libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1a00 libruby.so.1`rb_eval+0x2f28 libruby.so.1`rb_yield_0+0x4e6 libruby.so.1`rb_yield+0x16 libruby.so.1`rb_ary_each+0x35 libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1a00 libruby.so.1`rb_eval+0x2f28 libruby.so.1`rb_eval+0x2a74 libruby.so.1`rb_call0+0x7c6 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1888 452 libc.so.1`__getcontext+0xb libruby.so.1`rb_yield+0x16 libruby.so.1`rb_ary_each+0x35 libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1a00 libruby.so.1`rb_eval+0x2f28 libruby.so.1`rb_call0+0x7c6 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1888 libruby.so.1`rb_eval+0xf10 libruby.so.1`rb_call0+0x7c6 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1888 libruby.so.1`rb_call0+0x7c6 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1a00 libruby.so.1`rb_call0+0x7c6 libruby.so.1`rb_call+0x196 480 libc.so.1`__getcontext+0xb libruby.so.1`rb_yield_values+0x5d libruby.so.1`inject_i+0x30 libruby.so.1`rb_yield_0+0x545 libruby.so.1`rb_yield+0x16 libruby.so.1`each_i+0x2d libruby.so.1`hash_foreach_iter+0x34 libruby.so.1`st_foreach+0x64 libruby.so.1`hash_foreach_call+0x28 libruby.so.1`rb_ensure+0xb7 libruby.so.1`rb_hash_foreach+0x47 libruby.so.1`rb_hash_each+0x26 libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`vafuncall+0xc2 libruby.so.1`rb_funcall+0x21 libruby.so.1`rb_each+0x25 libruby.so.1`rb_iterate+0x296 libruby.so.1`enum_inject+0x56 774 libc.so.1`__getcontext+0xb libruby.so.1`rb_yield_values+0x5d libruby.so.1`each_with_index_i+0x29 libruby.so.1`rb_yield_0+0x545 libruby.so.1`rb_yield+0x16 libruby.so.1`rb_ary_each+0x35 libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`vafuncall+0xc2 libruby.so.1`rb_funcall+0x21 libruby.so.1`rb_each+0x25 libruby.so.1`rb_iterate+0x296 libruby.so.1`enum_each_with_index+0x3b libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1a00 libruby.so.1`rb_eval+0x2f28 libruby.so.1`rb_yield_0+0x4e6 876 libc.so.1`__getcontext+0xb libruby.so.1`rb_yield+0x16 libruby.so.1`rb_ary_each_index+0x2e libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1a00 libruby.so.1`rb_eval+0x2f28 libruby.so.1`rb_yield_0+0x4e6 libruby.so.1`rb_yield+0x16 libruby.so.1`rb_ary_each+0x35 libruby.so.1`call_cfunc+0x2e6 libruby.so.1`rb_call0+0x3d7 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1a00 libruby.so.1`rb_eval+0x2f28 libruby.so.1`rb_eval+0x2a74 libruby.so.1`rb_call0+0x7c6 libruby.so.1`rb_call+0x196 libruby.so.1`rb_eval+0x1888 876 -- This message posted from opensolaris.org
Mike Shapiro
2008-May-01 20:26 UTC
[dtrace-discuss] Ruby: Lots of setcontext/nanosleep syscalls
On Thu, May 01, 2008 at 12:34:11PM -0700, Mika Borner wrote:> > Change your program to something like: > > > > syscall::setcontext:entry > > /execname == "mongrel_rails"/ > > { > > @[ustack()] = count(); > > } > > I did, an the topmost entries are below. I''m not a programmer, but I guess it calls a lot of subroutines. > > Can the setcontext syscall be avoided somehow, by optimization of program code? > > Thanks anyway > Mika > > libc.so.1`__getcontext+0xb > libruby.so.1`rb_yield+0x16And the winner is ... co-routines. I''m glancing at the code for rb_yield() and basically Ruby has co-routines and I see some rather complex #defines around whether this should use setjmp/longjmp or setcontext()/getcontext() depending on OS and compiler. Clearly from your DTrace experiment, your Solaris version is ending up with the context-based implementation, and that is going to be massively more expensive (and likely unnecessary). This is something I will point out to the Solaris ruby people and not sure if Jason or others already have. Basically setjmp/longjmp are just inline assembly, whereas the context stuff is a system call that restores not only your stack pointer but also a complete register set, signal masks, floating-point state, and about a hundred other things. Much of that is entirely unnecessary for a co-routine implementation because you''re not changing signal masks and so forth. Whether you need the rest depends on how the surrounding code is written. If nothing else, one immediate optimization I see is that Ruby''s context code doesn''t touch uc_flags -- it could at minimum clear out the bits for things it doesn''t need to change. This will save some work in the kernel, but not the copyin() and the system calls. We''ll need to get in touch with the Ruby VM team and figure out whether indeed the context code is necessary on Solaris etc. But you have found something very interesting. -Mike -- Mike Shapiro, Sun Microsystems Fishworks. blogs.sun.com/mws/
Mika Borner
2008-May-02 06:19 UTC
[dtrace-discuss] Ruby: Lots of setcontext/nanosleep syscalls
>And the winner is ... co-routines. > >I''m glancing at the code for rb_yield() and basically Ruby has co-routines >and I see some rather complex #defines around whether this should use >setjmp/longjmp or setcontext()/getcontext() depending on OS and compiler. > >Clearly from your DTrace experiment, your Solaris version is ending up >with the context-based implementation, and that is going to be massively >more expensive (and likely unnecessary). This is something I will point >out to the Solaris ruby people and not sure if Jason or others already have. > >Basically setjmp/longjmp are just inline assembly, whereas the context >stuff is a system call that restores not only your stack pointer but also >a complete register set, signal masks, floating-point state, and about >a hundred other things. Much of that is entirely unnecessary for a co-routine >implementation because you''re not changing signal masks and so forth. >Whether you need the rest depends on how the surrounding code is written. > >If nothing else, one immediate optimization I see is that Ruby''s context >code doesn''t touch uc_flags -- it could at minimum clear out the bits >for things it doesn''t need to change. This will save some work in >the kernel, but not the copyin() and the system calls. > >We''ll need to get in touch with the Ruby VM team and figure out whether >indeed the context code is necessary on Solaris etc. But you have found >something very interesting. > >-Mike > >-- >Mike Shapiro, Sun Microsystems Fishworks. blogs.sun.com/mws/That''s great news! If it helps, I''m running this on Solaris 10 05/08 with current Blastwave Ruby (1.8.6). BTW: Anybody seen a recent binary of Joyents Ruby CSW package? All I can find is 1.8.5. -- This message posted from opensolaris.org