thr3ads.net - mongrel unicorn - Fix hang when running tests on OpenBSD by skipping two tests [Nov 2011]

Jeremy Evans

2011-Nov-14 19:33 UTC

Fix hang when running tests on OpenBSD by skipping two tests

This skips two tests on OpenBSD that cause hangs when running the test suite.

This is obviously not a permanent fix, but I'm not sure why the tests
are hanging, and hanging during a test is bad.  I suppose you could
also use a timeout, so the test fails instead of hangs.  I'll be happy
to test other patches to either the test suite or the library code
(assuming the hang is fixable on OpenBSD).
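
The timeout idea mentioned above could be sketched roughly as follows, using Ruby's standard `timeout` library. The helper name `with_deadline` and the `DeadlineExceeded` class are hypothetical and are not part of the unicorn test suite; this is only an illustration of turning a hang into a failure.

```ruby
require "timeout"

# Hypothetical error class, so a timed-out test is reported as a failure.
class DeadlineExceeded < StandardError; end

# Hypothetical helper: run the block, but fail instead of hanging
# if it takes longer than `seconds`.
def with_deadline(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  raise DeadlineExceeded, "block did not finish within #{seconds}s"
end
```

A test body wrapped as `with_deadline(60) { ... }` would then raise rather than block forever, although the underlying cause of the hang would still need investigating.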

After this patch, the test suite passes fine on OpenBSD.

Jeremy

--- test/exec/test_exec.rb.orig Mon Nov 14 18:38:09 2011
+++ test/exec/test_exec.rb      Mon Nov 14 18:38:37 2011
@@ -968,7 +968,7 @@ EOF
       assert_nothing_raised { Process.kill(:QUIT, daemon_pid) }
       wait_for_death(daemon_pid)
     end
-  end
+  end unless RUBY_PLATFORM =~ /openbsd/i

   def test_default_listen_upgrade_holds_listener
     default_listen_lock do
@@ -998,7 +998,7 @@ EOF
       assert_nothing_raised { Process.kill(:QUIT, daemon_pid) }
       wait_for_death(daemon_pid)
     end
-  end
+  end unless RUBY_PLATFORM =~ /openbsd/i

   def default_listen_setup
     File.open("config.ru", "wb") { |fp| fp.syswrite(HI.gsub("HI", '#$$')) }

Eric Wong

2011-Nov-14 20:54 UTC

Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> This is obviously not a permanent fix, but I'm not sure why the tests
> are hanging, and hanging during a test is bad.  I suppose you could
> also use a timeout, so the test fails instead of hangs.  I'll be happy
> to test other patches to either the test suite or the library code
> (assuming the hang is fixable on OpenBSD).

I'd like to investigate why this fails.

I've never been entirely happy with test_exec, so this would be a good
reason to improve or start porting problematic tests over to the
shell-based system under t/.

> After this patch, the test suite passes fine on OpenBSD.

So everything under t/ (gmake test-integration) works, too?
I much prefer the shell-based test suite myself.
Jeremy Evans

2011-Nov-14 23:46 UTC

On Mon, Nov 14, 2011 at 9:54 PM, Eric Wong <normalperson at yhbt.net> wrote:
> Jeremy Evans <jeremyevans0 at gmail.com> wrote:
>> This is obviously not a permanent fix, but I'm not sure why the tests
>> are hanging, and hanging during a test is bad.  I suppose you could
>> also use a timeout, so the test fails instead of hangs.  I'll be happy
>> to test other patches to either the test suite or the library code
>> (assuming the hang is fixable on OpenBSD).
>
> I'd like to investigate why this fails.

Makes sense.  Is there something I can do to help debug?

> So everything under t/ (gmake test-integration) works, too?
> I much prefer the shell-based test suite myself.

I didn't even know about that test suite till now. :)

I started running those tests on OpenBSD.  With the way the test suite
is currently set up, it's a bit of a pain to debug, as it stops at the
first error.  So I have to:

1) run the entire test suite
2) wait for it to halt/fail
3) skip the test that halts/fails
4) go to step 1

With some small changes, you can make it through the entire test suite
even if there are errors.  It would greatly speed up the debugging
process, but it does require you read the logged output.  See the
patch below.  With it, I determined the following integration tests
hang/fail:

ruby 1.8.7 and ruby 1.9.3 integration test failures:

t0011-active-unix-socket.sh
# fails: not ok 11 - no errors
t0100-rack-input-tests.sh
# hangs: requires kill to unicorn and kill -9 to sh in the 2nd test

ruby 1.9.3 has a couple regular test failures (skipping the two hangs
with the earlier patch):

:   1) Error:
: test_parse_error(HttpParserTest):
: RuntimeError: can't set length of shared string
:     test/unit/test_http_parser.rb:350:in `headers'
:     test/unit/test_http_parser.rb:350:in `test_parse_error'

:   1) Failure:
: test_help(ExecTest) [test/exec/test_exec.rb:319]:
: <0> expected but was
: <158>.

Thanks,
Jeremy

$OpenBSD$
--- GNUmakefile.orig    Tue Nov 15 00:12:58 2011
+++ GNUmakefile Tue Nov 15 00:13:32 2011
@@ -124,14 +124,14 @@ run_test = $(quiet_pre) \
 %.n: export PATH := $(test_prefix)/bin:$(PATH)
 %.n: export RUBYLIB := $(test_prefix):$(test_prefix)/lib:$(MYLIBS)
 %.n: $(test_prefix)/.stamp
-       $(run_test)
+       -$(run_test)

 $(T): arg = $@
 $(T): t = $(subst .rb,$(log_suffix),$@)
 $(T): export PATH := $(test_prefix)/bin:$(PATH)
 $(T): export RUBYLIB := $(test_prefix):$(test_prefix)/lib:$(MYLIBS)
 $(T): $(test_prefix)/.stamp
-       $(run_test)
+       -$(run_test)

 install: $(bins) $(ext)/unicorn_http.c
        $(prep_setup_rb)
@@ -221,7 +221,7 @@ $(T_r).%.r: export RUBYLIB := $(test_prefix):$(test_pr
 $(T_r).%.r: export UNICORN_RAILS_TEST_VERSION = $(rv)
 $(T_r).%.r: export RAILS_GIT_REPO = $(CURDIR)/$(rails_git)
 $(T_r).%.r: $(test_prefix)/.stamp $(rails_git)/info/v2.2.3-stamp
-       $(run_test)
+       -$(run_test)

 ifneq ($(VERSION),)
 rfproject := mongrel

--- t/GNUmakefile.orig  Tue Nov 15 00:12:39 2011
+++ t/GNUmakefile       Tue Nov 15 00:12:50 2011
@@ -66,7 +66,7 @@ $(T): export RAKE := $(RAKE)
 $(T): export PATH := $(test_prefix)/bin:$(PATH)
 $(T): export RUBYLIB := $(test_prefix)/lib:$(MYLIBS)
 $(T): dep $(test_prefix)/.stamp trash/.gitignore
-       $(TRACER) $(SHELL) $(SH_TEST_OPTS) $@ $(TEST_OPTS)
+       -$(TRACER) $(SHELL) $(SH_TEST_OPTS) $@ $(TEST_OPTS)

 trash/.gitignore:
        mkdir -p $(@D)

Eric Wong

2011-Nov-15 03:17 UTC

Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> On Mon, Nov 14, 2011 at 9:54 PM, Eric Wong <normalperson at yhbt.net> wrote:
> > Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> >> This is obviously not a permanent fix, but I'm not sure why the tests
> >> are hanging, and hanging during a test is bad.  I suppose you could
> >> also use a timeout, so the test fails instead of hangs.  I'll be happy
> >> to test other patches to either the test suite or the library code
> >> (assuming the hang is fixable on OpenBSD).
> >
> > I'd like to investigate why this fails.
>
> Makes sense.  Is there something I can do to help debug?

I normally use "set -x" (using the "V=2" env for gmake should enable it)
or strace (or whatever the OpenBSD equivalent is).

Syscall tracers are *much* easier to follow under MRI 1.9.3 than 1.9.2
due to the timer-thread being non-polling.  I always use "-f" to strace
nowadays to follow process/thread creations.

> > So everything under t/ (gmake test-integration) works, too?
> > I much prefer the shell-based test suite myself.
>
> I didn't even know about that test suite till now. :)
>
> I started running those tests on OpenBSD.  With the way the test suite
> is currently set up, it's a bit of a pain to debug, as it stops at the
> first error.  So I have to:

I usually prefer to work on each problem, one-at-a-time.  However,
GNU make already has a handy -k/--keep-going flag to ignore failures.

I also use "set -e" in all my shell scripts to catch errors early on.

> ruby 1.9.3 has a couple regular test failures (skipping the two hangs
> with the earlier patch):
>
> :   1) Error:
> : test_parse_error(HttpParserTest):
> : RuntimeError: can't set length of shared string
> :     test/unit/test_http_parser.rb:350:in `headers'
> :     test/unit/test_http_parser.rb:350:in `test_parse_error'

That's odd, is this with the latest version? (4.1.1)
I thought I fixed all of those issues several months ago...

> :   1) Failure:
> : test_help(ExecTest) [test/exec/test_exec.rb:319]:
> : <0> expected but was
> : <158>.

Can you dump out "test_stderr.#$$.log" just before that assertion?

Thanks!

Jeremy Evans

2011-Nov-15 20:03 UTC

On Tue, Nov 15, 2011 at 4:17 AM, Eric Wong <normalperson at yhbt.net> wrote:
> I usually prefer to work on each problem, one-at-a-time.  However,
> GNU make already has a handy -k/--keep-going flag to ignore failures.

Thanks, I didn't know about that, and it is much easier than patching
make files.

I think I've fixed all the issues that caused test failures on
OpenBSD.  All changes are in the test code itself.  Hope this helps.
Sorry if gmail mangles these diffs.

Thanks,
Jeremy

expr on OpenBSD uses a basic regular expression (according to
re_format(7)), which doesn't support +, only *.

--- t/t0011-active-unix-socket.sh.orig  Tue Nov 15 20:28:37 2011
+++ t/t0011-active-unix-socket.sh       Tue Nov 15 20:28:54 2011
@@ -7,7 +7,7 @@ read_pid_unix () {
            socat - UNIX:$unix_socket | \
            tail -1)
        test -n "$x"
-       y="$(expr "$x" : '\([0-9]\+\)')"
+       y="$(expr "$x" : '\([0-9][0-9]*\)')"
        test x"$x" = x"$y"
        test -n "$y"
        echo "$y"

I assume you aren't purposely testing a large timeout here, so
hopefully this change is fine.  The original code caused an infinite
loop on OpenBSD, and would also take up all available space on the file
system if you let it run long enough, because it wrote to the log
inside the loop.  On 1.8.7:

E, [2011-11-15T18:55:34.616397 #11092] ERROR -- : master loop error: time + 2147483646.000000 out of Time range (RangeError)
E, [2011-11-15T18:55:34.616538 #11092] ERROR -- : /usr/obj/ports/unicorn-4.1.1/unicorn-4.1.1/t/../test/ruby-1.8.7/lib/unicorn/http_server.rb:264:in `+'
E, [2011-11-15T18:55:34.616611 #11092] ERROR -- : /usr/obj/ports/unicorn-4.1.1/unicorn-4.1.1/t/../test/ruby-1.8.7/lib/unicorn/http_server.rb:264:in `join'
E, [2011-11-15T18:55:34.616686 #11092] ERROR -- : /usr/obj/ports/unicorn-4.1.1/unicorn-4.1.1/t/../test/ruby-1.8.7/bin/unicorn:121

On 1.9.3:

E, [2011-11-15T19:00:20.464234 #13442] ERROR -- : listen loop error: Invalid argument (Errno::EINVAL)
E, [2011-11-15T19:00:20.464327 #13442] ERROR -- : /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/lib/unicorn/http_server.rb:620:in `select'
E, [2011-11-15T19:00:20.464399 #13442] ERROR -- : /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/lib/unicorn/http_server.rb:620:in `worker_loop'
E, [2011-11-15T19:00:20.464457 #13442] ERROR -- : /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/lib/unicorn/http_server.rb:485:in `spawn_missing_workers'
E, [2011-11-15T19:00:20.464514 #13442] ERROR -- : /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/lib/unicorn/http_server.rb:135:in `start'
E, [2011-11-15T19:00:20.464570 #13442] ERROR -- : /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/bin/unicorn:121
E, [2011-11-15T19:00:20.464626 #13442] ERROR -- : /usr/local/bin/unicorn:19:in `load'
E, [2011-11-15T19:00:20.464681 #13442] ERROR -- : /usr/local/bin/unicorn:19
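
One plausible reading of the 1.8.7 RangeError is plain integer overflow of the system time type. The sketch below works through the arithmetic under the assumption (not confirmed in this thread) that time_t is a signed 32-bit integer on the affected system; `TIME_T_MAX` and `exceeds_time_t?` are hypothetical names, and the log timestamp is treated as UTC for simplicity.

```ruby
# Assumption: time_t is a signed 32-bit integer on the affected system.
TIME_T_MAX = 2**31 - 1

# True when adding `offset` seconds to the epoch time `now`
# would no longer fit in a signed 32-bit time_t.
def exceeds_time_t?(now, offset)
  now + offset > TIME_T_MAX
end

# Approximate time of the log entry above, as seconds since the epoch.
now = Time.utc(2011, 11, 15, 18, 55, 34).to_i

puts exceeds_time_t?(now, 2_147_483_646) # offset reported in the log
puts exceeds_time_t?(now, 0x7fffff)      # a much smaller timeout value
```

Under that assumption the first sum does not fit, while a timeout of 0x7fffff seconds (roughly 97 days) leaves plenty of headroom.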

--- t/t0012-reload-empty-config.sh.orig Tue Nov 15 20:05:13 2011
+++ t/t0012-reload-empty-config.sh      Tue Nov 15 20:05:37 2011
@@ -9,7 +9,7 @@ t_begin "setup and start" && {
        cat >> $unicorn_config <<EOF
 logger Logger.new(STDOUT)
 preload_app true
-timeout 0x7fffffff
+timeout 0x7fffff
 worker_processes 2
 after_fork { |s,w| }
 \$dump_cfg = lambda { |fp,srv|

openssl sha1 on OpenBSD doesn't just spit out the hash:

$ openssl sha1 mocha.diff
SHA1(mocha.diff)= 4ea47d3cf9e4f1858a298a8a9f5a5671422971d5
$ sha1 -q mocha.diff
4ea47d3cf9e4f1858a298a8a9f5a5671422971d5

--- t/test-lib.sh.orig  Tue Nov 15 19:12:25 2011
+++ t/test-lib.sh       Tue Nov 15 19:38:05 2011
@@ -101,6 +101,7 @@ unicorn_wait_start () {

 rsha1 () {
        _cmd="$(which sha1sum 2>/dev/null || :)"
+       test -n "$_cmd" || _cmd="$(which sha1 2>/dev/null || :) -q"
        test -n "$_cmd" || _cmd="$(which openssl 2>/dev/null || :) sha1"
        test "$_cmd" != " sha1" || _cmd="$(which gsha1sum 2>/dev/null || :)"

You can listen on 0.0.0.0, but trying to connect to it doesn't work
well on OpenBSD.

--- test/test_helper.rb.orig    Tue Nov 15 20:43:39 2011
+++ test/test_helper.rb Tue Nov 15 20:46:17 2011
@@ -72,6 +72,7 @@ def hit(uris)
     res = nil

     if u.kind_of? String
+      u = 'http://127.0.0.1:8080/' if u == 'http://0.0.0.0:8080/'
       res = Net::HTTP.get(URI.parse(u))
     else
       url = URI.parse(u[0])
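
A slightly more general form of the rewrite in the patch above could parse the URL and swap only the host, so that any port or path is handled. This is a sketch, not the change that was applied; the helper name `connectable_url` is hypothetical.

```ruby
require "uri"

# Hypothetical helper: a client cannot portably connect to the wildcard
# address 0.0.0.0, so map it to the loopback address and leave every
# other URL untouched.
def connectable_url(url)
  uri = URI.parse(url)
  uri.host = "127.0.0.1" if uri.host == "0.0.0.0"
  uri.to_s
end
```

For example, `connectable_url("http://0.0.0.0:8080/")` gives a URL pointing at 127.0.0.1 on the same port.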

Jeremy Evans

2011-Nov-15 21:19 UTC

On Tue, Nov 15, 2011 at 9:03 PM, Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> openssl sha1 on OpenBSD doesn't just spit out the hash:
>
> $ openssl sha1 mocha.diff
> SHA1(mocha.diff)= 4ea47d3cf9e4f1858a298a8a9f5a5671422971d5
> $ sha1 -q mocha.diff
> 4ea47d3cf9e4f1858a298a8a9f5a5671422971d5

Here's a fix for t/test-lib.sh that handles unix socket paths when
you are running the regression tests from a directory with a long
name.  I didn't catch this earlier because the unix socket test only
fails on the ruby 1.9 port on my system, since the path name is a
little longer for the ruby19 flavor than the unflavored ruby 1.8 port.

You probably want to use more robust name mangling if you choose to
fix this; my fix is simple but has corner cases where it breaks that
could be problematic.
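
One possible shape for "more robust name mangling" is to replace an over-long path with a short name derived from a hash of the original, instead of truncating it. The sketch below is only an illustration and is not what unicorn adopted; `SUN_PATH_MAX`, `short_socket_path`, and the `sock-` prefix are hypothetical, and the 103-byte limit is taken from the patch in this message.

```ruby
require "digest/sha1"
require "tmpdir"

# Assumption: usable socket path length, matching the 103 used in the patch.
SUN_PATH_MAX = 103

# Hypothetical helper: return `path` unchanged when it fits; otherwise
# return a short path under `dir` whose name is derived from a hash of
# the original, so two long paths sharing a prefix do not collide.
def short_socket_path(path, dir = Dir.tmpdir, limit = SUN_PATH_MAX)
  return path if path.bytesize <= limit
  File.join(dir, "sock-#{Digest::SHA1.hexdigest(path)[0, 16]}")
end
```

Unlike plain truncation, two long paths that differ only near the end map to different socket names, at the cost of the socket no longer living next to the other test files.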

Jeremy

--- t/test-lib.sh.orig  Thu Jan  1 01:00:00 1970
+++ t/test-lib.sh       Tue Nov 15 22:02:59 2011
@@ -38,20 +38,24 @@ rtmpfiles () {
        for id in "$@"
        do
                name=$id
-               _tmp=$t_pfx.$id
-               eval "$id=$_tmp"

                case $name in
                *fifo)
+                       _tmp=$t_pfx.$id
+                       eval "$id=$_tmp"
                        rm -f $_tmp
                        mkfifo $_tmp
                        T_RM_LIST="$T_RM_LIST $_tmp"
                        ;;
                *socket)
+                       _tmp=$(echo "$t_pfx.$id" | $RUBY -e 'print $stdin.read(103)')
+                       eval "$id=$_tmp"
                        rm -f $_tmp
                        T_RM_LIST="$T_RM_LIST $_tmp"
                        ;;
                *)
+                       _tmp=$t_pfx.$id
+                       eval "$id=$_tmp"
                        > $_tmp
                        T_OK_RM_LIST="$T_OK_RM_LIST $_tmp"
                        ;;
@@ -101,6 +105,7 @@ unicorn_wait_start () {

 rsha1 () {
        _cmd="$(which sha1sum 2>/dev/null || :)"
+       test -n "$_cmd" || _cmd="$(which sha1 2>/dev/null || :) -q"
        test -n "$_cmd" || _cmd="$(which openssl 2>/dev/null || :) sha1"
        test "$_cmd" != " sha1" || _cmd="$(which gsha1sum 2>/dev/null || :)"

Eric Wong

2011-Nov-15 22:36 UTC

Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> I assume you aren't purposely testing a large timeout here, so
> hopefully this change is fine.  The original code caused an infinite
> loop on OpenBSD, and also taking up all available space on the file
> system if you let it run long enough because it wrote to the log
> inside the loop.  On 1.8.7:

I expected "timeout 0x7fffffff" to be fine (I sometimes use it :x)...

> E, [2011-11-15T18:55:34.616397 #11092] ERROR -- : master loop error:
> time + 2147483646.000000 out of Time range (RangeError)
> E, [2011-11-15T18:55:34.616538 #11092] ERROR -- :
> /usr/obj/ports/unicorn-4.1.1/unicorn-4.1.1/t/../test/ruby-1.8.7/lib/unicorn/http_server.rb:264:in

Odd, which patchlevel of 1.8.7 is this?  (more of a curiosity at this
point, see below)

> On 1.9.3:
>
> E, [2011-11-15T19:00:20.464234 #13442] ERROR -- : listen loop error:
> Invalid argument (Errno::EINVAL)
> E, [2011-11-15T19:00:20.464327 #13442] ERROR -- :
> /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/lib/unicorn/http_server.rb:620:in
> `select'

OK, I need to read POSIX manpages more often, not just the Linux
ones.

From the select() POSIX manpage:

	All implementations shall support a maximum timeout interval of
	at least 31 days.

I think I'll just force Unicorn::Configurator to limit that to 30 days.

But, the POSIX manpage also states:

	If the timeout argument specifies a timeout interval greater
	than the implementation-defined maximum value, the maximum value
	shall be used as the actual timeout value.

So isn't OpenBSD wrong in giving EINVAL here?  Unless somehow the large
time_t was interpreted as a negative value...

> openssl sha1 on OpenBSD doesn't just spit out the hash:
>
> $ openssl sha1 mocha.diff
> SHA1(mocha.diff)= 4ea47d3cf9e4f1858a298a8a9f5a5671422971d5

This happens on newer openssl under Debian, too.  Maybe just switch to
sha1sum.rb since I'm more comfortable with Ruby 1.9 now than I was in
2009.
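
For illustration, a SHA-1 helper written in Ruby avoids depending on how a particular command-line tool formats its output. The sketch below uses the standard `digest/sha1` library; it is not the sha1sum.rb script referred to above, and the name `sha1_hex` is hypothetical.

```ruby
require "digest/sha1"
require "stringio" # only needed for the usage line below

# Hypothetical helper: read an IO in chunks and return the bare
# hexadecimal SHA-1 digest, with no filename or "SHA1(...)=" prefix.
def sha1_hex(io)
  digest = Digest::SHA1.new
  while (chunk = io.read(16_384))
    digest.update(chunk)
  end
  digest.hexdigest
end

puts sha1_hex(StringIO.new("abc")) # prints a9993e364706816aba3e25717850c26c9cd0d89d
```

Because the output is just the hex string, the shell tests would not need to special-case sha1sum, sha1 -q, or openssl.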

Eric Wong

2011-Nov-16 01:18 UTC

Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> On Tue, Nov 15, 2011 at 4:17 AM, Eric Wong <normalperson at yhbt.net> wrote:
> > I usually prefer to work on each problem, one-at-a-time.  However,
> > GNU make already has a handy -k/--keep-going flag to ignore failures.
>
> Thanks, I didn't know about that, and it is much easier than patching
> make files.
>
> I think I've fixed all the issues that caused test failures on
> OpenBSD.  All changes are in the test code itself.  Hope this helps.

OK, I think I've pushed relevant fixes up to master of unicorn.git
(commit fbcf6aa641e5827da48a3b6776c9897de123b405)

  Eric Wong (3):
        configurator: limit timeout to 30 days
        tests: just use the sha1sum implemented in Ruby
        tests: try to set a shorter path for Unix domain sockets

  Jeremy Evans (2):
        t0011: fix test under OpenBSD
        test_helper: ensure test client connects to valid address

> Sorry if gmail mangles these diffs.

No worries, patch(1) is very lenient.

Just wondering, do most folks have/lack decent SMTP setups nowadays?
(especially on servers they don't usually work from)

When working without a VCS repo (live fixes on servers :x),
I'll sometimes just send a patch out like this:

	diff -u a b | mail -s diff-a-b a at example.com

This is a big reason I prefer no-subscription-required mailing lists.
If it's to a public mailing list, I'll always email myself first to
clean up the Received: trail and also to rewrite my email address
so it doesn't have @evil-corporation.com in it :)
