I have a couple of zones. On one zone, I run lighttpd. On another, I have a customer using Apache. This is on the 2009.06 release.

Every once in a while, maybe every other day, the server becomes unstable: HTTP requests no longer get a response. Sometimes it works again after a while without a problem, but sometimes a restart of the server is necessary. I am using Crossbow for these zones, with virtual links.

I was running dtrace on all open* calls and I saw something like this:

  2  62318  open:entry postgres pg_stat_tmp/pgstat.tmp
  2  62318  open:entry postgres pg_stat_tmp/pgstat.tmp
  2  62720  open64:entry httpd /dev/urandom
  2  62318  open:entry httpd /etc/crypto/pkcs11.conf
dtrace: error on enabled probe ID 2 (ID 62318: syscall::open:entry): invalid address (0xfd4abc63) in action #2 at DIF offset 28
  2  62318  open:entry httpd /usr/lib/security/pkcs11_softtoken.so
  2  62318  open:entry httpd /usr/lib/libsoftcrypto.so.1
  2  62318  open:entry httpd /usr/lib/libsoftcrypto/libsoftcrypto_hwcap1.so.1

Could it be that there is something wrong with the /dev/urandom device in OpenSolaris? It is very peculiar and I can't figure it out. I find it strange that the same type of behavior happens in two different zones running two different applications.

Do you guys have any pointers? Next time this happens, what can I do to capture more data? The customer wants to move away from OpenSolaris and go back to Solaris because of this.

Thanks!

--
This message posted from opensolaris.org
Here is what the syscall:::entry probe shows. Is it stuck doing something? How can I dig deeper? All I get is this when doing "GET /index.htm HTTP/1.1" through telnet on port 80:

...
  3  83394  ___errno:entry
  3  80993  apr_atomic_dec32:entry
  3  83062  atomic_dec_32_nv:entry
  3  83394  ___errno:entry
  3  83394  ___errno:entry
  3  80853  apr_pollset_poll:entry
  3  83124  __div64:entry
  3  83121  UDiv:entry
  3  83125  __rem64:entry
  3  83120  UDivRem:entry
  3  80992  apr_atomic_inc32:entry
  3  83055  atomic_inc_32_nv:entry
  3  83401  port_getn:entry
  3  85371  _portfs:entry
  3  83394  ___errno:entry
  3  80993  apr_atomic_dec32:entry
  3  83062  atomic_dec_32_nv:entry
  3  83394  ___errno:entry
  3  83394  ___errno:entry
  3  80853  apr_pollset_poll:entry
  3  83124  __div64:entry
  3  83121  UDiv:entry
  3  83125  __rem64:entry
  3  83120  UDivRem:entry
  3  80992  apr_atomic_inc32:entry
  3  83055  atomic_inc_32_nv:entry
  3  83401  port_getn:entry
  3  85371  _portfs:entry
  1  83394  ___errno:entry
  1  80993  apr_atomic_dec32:entry
  1  83062  atomic_dec_32_nv:entry
  1  83394  ___errno:entry
  1  83394  ___errno:entry
  1  80853  apr_pollset_poll:entry
  1  83124  __div64:entry
  1  83121  UDiv:entry
  1  83125  __rem64:entry
  1  83120  UDivRem:entry
  1  80992  apr_atomic_inc32:entry
  1  83055  atomic_inc_32_nv:entry
  1  83401  port_getn:entry
  1  85371  _portfs:entry
  1  83394  ___errno:entry
  1  80993  apr_atomic_dec32:entry
  1  83062  atomic_dec_32_nv:entry
  1  83394  ___errno:entry
  1  83394  ___errno:entry
  1  80853  apr_pollset_poll:entry
  1  83124  __div64:entry
  1  83121  UDiv:entry
  1  83125  __rem64:entry
  1  83120  UDivRem:entry
  1  80992  apr_atomic_inc32:entry
  1  83055  atomic_inc_32_nv:entry
  1  83401  port_getn:entry
  1  85371  _portfs:entry
  0  83394  ___errno:entry
  0  80993  apr_atomic_dec32:entry
  0  83062  atomic_dec_32_nv:entry
  0  83394  ___errno:entry
  0  83394  ___errno:entry
  0  80853  apr_pollset_poll:entry
  0  83124  __div64:entry
  0  83121  UDiv:entry
  0  83125  __rem64:entry
  0  83120  UDivRem:entry
  0  80992  apr_atomic_inc32:entry
  0  83055  atomic_inc_32_nv:entry
  0  83401  port_getn:entry
  0  85371  _portfs:entry
  0  83394  ___errno:entry
  0  80993  apr_atomic_dec32:entry
  0  83062  atomic_dec_32_nv:entry
  0  83394  ___errno:entry
  0  83394  ___errno:entry
  0  80853  apr_pollset_poll:entry
  0  83124  __div64:entry
  0  83121  UDiv:entry
  0  83125  __rem64:entry
  0  83120  UDivRem:entry
  0  80992  apr_atomic_inc32:entry
  0  83055  atomic_inc_32_nv:entry
  0  83401  port_getn:entry
  0  85371  _portfs:entry
  2  83394  ___errno:entry
  2  80993  apr_atomic_dec32:entry
  2  83062  atomic_dec_32_nv:entry
  2  83394  ___errno:entry
  2  83394  ___errno:entry
  2  80853  apr_pollset_poll:entry
  2  83124  __div64:entry
  2  83121  UDiv:entry
  2  83125  __rem64:entry
  2  83120  UDivRem:entry
  2  80992  apr_atomic_inc32:entry
  2  83055  atomic_inc_32_nv:entry
  2  83401  port_getn:entry
  2  85371  _portfs:entry
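One way to check whether a trace like the one above is just the same sequence of calls repeating is to tally the records and look for the shortest repeating cycle. The following is a minimal Python sketch, not a definitive tool. It assumes the default dtrace output columns (CPU, probe ID, FUNCTION:NAME) and uses an excerpt copied from the post as sample input; the names RECORD, tally and shortest_cycle are made up for this illustration.

```python
import re
from collections import Counter

# Assumption: each record looks like "CPU ID FUNCTION:NAME", as in the post.
RECORD = re.compile(r"(\d+)\s+(\d+)\s+(\S+):(entry|return)")

# One pass through the calls, copied from the CPU 3 records in the post.
ONE_PASS = (
    "3 83394 ___errno:entry 3 80993 apr_atomic_dec32:entry "
    "3 83062 atomic_dec_32_nv:entry 3 83394 ___errno:entry "
    "3 83394 ___errno:entry 3 80853 apr_pollset_poll:entry "
    "3 83124 __div64:entry 3 83121 UDiv:entry 3 83125 __rem64:entry "
    "3 83120 UDivRem:entry 3 80992 apr_atomic_inc32:entry "
    "3 83055 atomic_inc_32_nv:entry 3 83401 port_getn:entry "
    "3 85371 _portfs:entry "
)
SAMPLE = ONE_PASS * 2  # the post shows this pass twice on CPU 3


def tally(trace_text):
    """Count how often each function fires on each CPU."""
    counts = Counter()
    for cpu, _probe_id, func, _name in RECORD.findall(trace_text):
        counts[(int(cpu), func)] += 1
    return counts


def shortest_cycle(funcs):
    """Return the shortest prefix that, repeated, rebuilds the whole list.

    If nothing shorter repeats exactly, the list itself is returned.
    """
    n = len(funcs)
    for size in range(1, n + 1):
        if n % size == 0 and funcs[:size] * (n // size) == funcs:
            return funcs[:size]
    return funcs


if __name__ == "__main__":
    funcs = [match[2] for match in RECORD.findall(SAMPLE)]
    cycle = shortest_cycle(funcs)
    print(f"{len(funcs)} records, repeating cycle of {len(cycle)} calls")
    for (cpu, func), count in sorted(tally(SAMPLE).items()):
        print(cpu, func, count)
```

A short cycle built around apr_pollset_poll and port_getn only shows that the process keeps going around its poll loop. It does not say by itself whether the loop is waiting normally or spinning, so timestamps or return values would still be needed to tell the two apart.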
Correction: I believe I was using the pid provider, not the syscall provider.
> The syscall:::entry probe shows, is it stuck doing something?? How can
> I dig deeper? All I get is this when doing "GET /index.htm HTTP/1.1"
> through telnet port 80:

Maybe:

  6873771 APR uses Solaris Event Ports incorrectly

See https://issues.apache.org/bugzilla/show_bug.cgi?id=47645

--
meem