Jeremy Evans
2011-Nov-14 19:33 UTC
Fix hang when running tests on OpenBSD by skipping two tests
This skips two tests on OpenBSD which cause hangs when running the tests.

This is obviously not a permanent fix, but I'm not sure why the tests are hanging, and hanging during a test is bad. I suppose you could also use a timeout, so the test fails instead of hangs. I'll be happy to test other patches to either the test suite or the library code (assuming the hang is fixable on OpenBSD).

After this patch, the test suite passes fine on OpenBSD.

Jeremy

--- test/exec/test_exec.rb.orig Mon Nov 14 18:38:09 2011
+++ test/exec/test_exec.rb Mon Nov 14 18:38:37 2011
@@ -968,7 +968,7 @@ EOF
       assert_nothing_raised { Process.kill(:QUIT, daemon_pid) }
       wait_for_death(daemon_pid)
     end
-  end
+  end unless RUBY_PLATFORM =~ /openbsd/i
 
   def test_default_listen_upgrade_holds_listener
     default_listen_lock do
@@ -998,7 +998,7 @@ EOF
       assert_nothing_raised { Process.kill(:QUIT, daemon_pid) }
       wait_for_death(daemon_pid)
     end
-  end
+  end unless RUBY_PLATFORM =~ /openbsd/i
 
   def default_listen_setup
     File.open("config.ru", "wb") { |fp| fp.syswrite(HI.gsub("HI", '#$$')) }
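The timeout idea mentioned above could look roughly like the following. This is only a sketch: `with_deadline` is a hypothetical helper, not something in the unicorn test suite, and it assumes the hanging code can be interrupted by Ruby's standard `Timeout` module.

```ruby
require 'timeout'

# Hypothetical helper (not part of the unicorn test suite): run a block and
# raise a descriptive error if it does not finish within `seconds`.
def with_deadline(seconds, label)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  raise "#{label} did not finish within #{seconds}s"
end

# A block that finishes in time simply returns its value.
fast = with_deadline(1, "fast block") { :done }

# A block that would otherwise hang is turned into an ordinary error.
slow = begin
  with_deadline(0.1, "slow block") { sleep 5 }
rescue RuntimeError => e
  e.message
end
```

Wrapping the body of a test this way makes a hang show up as a failure with a message instead of stalling the whole run, though it does not explain why the hang happens.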
Eric Wong
2011-Nov-14 20:54 UTC
Fix hang when running tests on OpenBSD by skipping two tests
Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> This is obviously not a permanent fix, but I'm not sure why the tests
> are hanging, and hanging during a test is bad. I suppose you could
> also use a timeout, so the test fails instead of hangs. I'll be happy
> to test other patches to either the test suite or the library code
> (assuming the hang is fixable on OpenBSD).

I'd like to investigate why this fails. I've never been entirely happy with test_exec, so this would be a good reason to improve or start porting problematic tests over to the shell-based system under t/

> After this patch, the test suite passes fine on OpenBSD.

So everything under t/ (gmake test-integration) works, too? I much prefer the shell-based test suite myself.
Jeremy Evans
2011-Nov-14 23:46 UTC
Fix hang when running tests on OpenBSD by skipping two tests
On Mon, Nov 14, 2011 at 9:54 PM, Eric Wong <normalperson at yhbt.net> wrote:
> Jeremy Evans <jeremyevans0 at gmail.com> wrote:
>> This is obviously not a permanent fix, but I'm not sure why the tests
>> are hanging, and hanging during a test is bad. I suppose you could
>> also use a timeout, so the test fails instead of hangs. I'll be happy
>> to test other patches to either the test suite or the library code
>> (assuming the hang is fixable on OpenBSD).
>
> I'd like to investigate why this fails.

Makes sense. Is there something I can do to help debug?

> So everything under t/ (gmake test-integration) works, too?
> I much prefer shell-based the test suite myself

I didn't even know about that test suite till now. :)

I started running those tests on OpenBSD. With the way the test suite is currently set up, it's a bit of a pain to debug, as it stops at the first error. So I have to:

1) run the entire test suite
2) wait for it to halt/fail
3) skip the test that halts/fails
4) go to step 1

With some small changes, you can make it through the entire test suite even if there are errors. It would greatly speed up the debugging process, but it does require you to read the logged output. See the patch below. With it, I determined the following integration tests hang/fail:

ruby 1.8.7 and ruby 1.9.3 integration test failures:

t0011-active-unix-socket.sh # fails: not ok 11 - no errors
t0100-rack-input-tests.sh # hangs: requires kill to unicorn and kill -9 to sh in the 2nd test

ruby 1.9.3 has a couple regular test failures (skipping the two hangs with the earlier patch):

: 1) Error:
: test_parse_error(HttpParserTest):
: RuntimeError: can't set length of shared string
: test/unit/test_http_parser.rb:350:in `headers'
: test/unit/test_http_parser.rb:350:in `test_parse_error'

: 1) Failure:
: test_help(ExecTest) [test/exec/test_exec.rb:319]:
: <0> expected but was
: <158>.

Thanks,
Jeremy

$OpenBSD$
--- GNUmakefile.orig Tue Nov 15 00:12:58 2011
+++ GNUmakefile Tue Nov 15 00:13:32 2011
@@ -124,14 +124,14 @@ run_test = $(quiet_pre) \
 %.n: export PATH := $(test_prefix)/bin:$(PATH)
 %.n: export RUBYLIB := $(test_prefix):$(test_prefix)/lib:$(MYLIBS)
 %.n: $(test_prefix)/.stamp
-	$(run_test)
+	-$(run_test)
 
 $(T): arg = $@
 $(T): t = $(subst .rb,$(log_suffix),$@)
 $(T): export PATH := $(test_prefix)/bin:$(PATH)
 $(T): export RUBYLIB := $(test_prefix):$(test_prefix)/lib:$(MYLIBS)
 $(T): $(test_prefix)/.stamp
-	$(run_test)
+	-$(run_test)
 
 install: $(bins) $(ext)/unicorn_http.c
 	$(prep_setup_rb)
@@ -221,7 +221,7 @@ $(T_r).%.r: export RUBYLIB := $(test_prefix):$(test_pr
 $(T_r).%.r: export UNICORN_RAILS_TEST_VERSION = $(rv)
 $(T_r).%.r: export RAILS_GIT_REPO = $(CURDIR)/$(rails_git)
 $(T_r).%.r: $(test_prefix)/.stamp $(rails_git)/info/v2.2.3-stamp
-	$(run_test)
+	-$(run_test)
 
 ifneq ($(VERSION),)
 rfproject := mongrel
--- t/GNUmakefile.orig Tue Nov 15 00:12:39 2011
+++ t/GNUmakefile Tue Nov 15 00:12:50 2011
@@ -66,7 +66,7 @@ $(T): export RAKE := $(RAKE)
 $(T): export PATH := $(test_prefix)/bin:$(PATH)
 $(T): export RUBYLIB := $(test_prefix)/lib:$(MYLIBS)
 $(T): dep $(test_prefix)/.stamp trash/.gitignore
-	$(TRACER) $(SHELL) $(SH_TEST_OPTS) $@ $(TEST_OPTS)
+	-$(TRACER) $(SHELL) $(SH_TEST_OPTS) $@ $(TEST_OPTS)
 
 trash/.gitignore:
 	mkdir -p $(@D)
Eric Wong
2011-Nov-15 03:17 UTC
Fix hang when running tests on OpenBSD by skipping two tests
Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> On Mon, Nov 14, 2011 at 9:54 PM, Eric Wong <normalperson at yhbt.net> wrote:
> > Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> >> This is obviously not a permanent fix, but I'm not sure why the tests
> >> are hanging, and hanging during a test is bad. I suppose you could
> >> also use a timeout, so the test fails instead of hangs. I'll be happy
> >> to test other patches to either the test suite or the library code
> >> (assuming the hang is fixable on OpenBSD).
> >
> > I'd like to investigate why this fails.
>
> Makes sense. Is there something I can do to help debug?

I normally use "set -x" (using the "V=2" env for gmake should enable it) or strace (or whatever the OpenBSD equivalent is). Syscall tracers are *much* easier to follow under MRI 1.9.3 than 1.9.2 due to the timer-thread being non-polling. I always use "-f" to strace nowadays to follow process/thread creations.

> > So everything under t/ (gmake test-integration) works, too?
> > I much prefer shell-based the test suite myself
>
> I didn't even know about that test suite till now. :)
>
> I started running those tests on OpenBSD. With the way the test suite
> is currently setup, it's a bit of a pain to debug, as it stops at the
> first error. So I have to:

I usually prefer to work on each problem, one-at-a-time. However, GNU make already has a handy -k/--keep-going flag to ignore failures. I also use "set -e" in all my shell scripts to catch errors early on.

> ruby 1.9.3 has a couple regular test failures (skipping the two hangs
> with the earlier patch):
>
> : 1) Error:
> : test_parse_error(HttpParserTest):
> : RuntimeError: can't set length of shared string
> : test/unit/test_http_parser.rb:350:in `headers'
> : test/unit/test_http_parser.rb:350:in `test_parse_error'

That's odd, is this with the latest version? (4.1.1) I thought I fixed all of those issues several months ago...

> : 1) Failure:
> : test_help(ExecTest) [test/exec/test_exec.rb:319]:
> : <0> expected but was
> : <158>.

Can you dump out "test_stderr.#$$.log" just before that assertion? Thanks!
Jeremy Evans
2011-Nov-15 20:03 UTC
Fix hang when running tests on OpenBSD by skipping two tests
On Tue, Nov 15, 2011 at 4:17 AM, Eric Wong <normalperson at yhbt.net> wrote:
> I usually prefer to work on each problem, one-at-a-time. However,
> GNU make already has a handy -k/--keep-going flag to ignore failures.

Thanks, I didn't know about that, and it is much easier than patching make files.

I think I've fixed all the issues that caused test failures on OpenBSD. All changes are in the test code itself. Hope this helps. Sorry if gmail mangles these diffs.

Thanks,
Jeremy

expr on OpenBSD uses a basic regular expression (according to re_format(7)), which doesn't support +, only *.

--- t/t0011-active-unix-socket.sh.orig Tue Nov 15 20:28:37 2011
+++ t/t0011-active-unix-socket.sh Tue Nov 15 20:28:54 2011
@@ -7,7 +7,7 @@ read_pid_unix () {
 	    socat - UNIX:$unix_socket | \
 	    tail -1)
 	test -n "$x"
-	y="$(expr "$x" : '\([0-9]\+\)')"
+	y="$(expr "$x" : '\([0-9][0-9]*\)')"
 	test x"$x" = x"$y"
 	test -n "$y"
 	echo "$y"

I assume you aren't purposely testing a large timeout here, so hopefully this change is fine. The original code caused an infinite loop on OpenBSD, and also took up all available space on the file system if you let it run long enough, because it wrote to the log inside the loop.

On 1.8.7:

E, [2011-11-15T18:55:34.616397 #11092] ERROR -- : master loop error: time + 2147483646.000000 out of Time range (RangeError)
E, [2011-11-15T18:55:34.616538 #11092] ERROR -- : /usr/obj/ports/unicorn-4.1.1/unicorn-4.1.1/t/../test/ruby-1.8.7/lib/unicorn/http_server.rb:264:in `+'
E, [2011-11-15T18:55:34.616611 #11092] ERROR -- : /usr/obj/ports/unicorn-4.1.1/unicorn-4.1.1/t/../test/ruby-1.8.7/lib/unicorn/http_server.rb:264:in `join'
E, [2011-11-15T18:55:34.616686 #11092] ERROR -- : /usr/obj/ports/unicorn-4.1.1/unicorn-4.1.1/t/../test/ruby-1.8.7/bin/unicorn:121

On 1.9.3:

E, [2011-11-15T19:00:20.464234 #13442] ERROR -- : listen loop error: Invalid argument (Errno::EINVAL)
E, [2011-11-15T19:00:20.464327 #13442] ERROR -- : /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/lib/unicorn/http_server.rb:620:in `select'
E, [2011-11-15T19:00:20.464399 #13442] ERROR -- : /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/lib/unicorn/http_server.rb:620:in `worker_loop'
E, [2011-11-15T19:00:20.464457 #13442] ERROR -- : /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/lib/unicorn/http_server.rb:485:in `spawn_missing_workers'
E, [2011-11-15T19:00:20.464514 #13442] ERROR -- : /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/lib/unicorn/http_server.rb:135:in `start'
E, [2011-11-15T19:00:20.464570 #13442] ERROR -- : /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/bin/unicorn:121
E, [2011-11-15T19:00:20.464626 #13442] ERROR -- : /usr/local/bin/unicorn:19:in `load'
E, [2011-11-15T19:00:20.464681 #13442] ERROR -- : /usr/local/bin/unicorn:19

--- t/t0012-reload-empty-config.sh.orig Tue Nov 15 20:05:13 2011
+++ t/t0012-reload-empty-config.sh Tue Nov 15 20:05:37 2011
@@ -9,7 +9,7 @@ t_begin "setup and start" && {
 	cat >> $unicorn_config <<EOF
 logger Logger.new(STDOUT)
 preload_app true
-timeout 0x7fffffff
+timeout 0x7fffff
 worker_processes 2
 after_fork { |s,w| }
 \$dump_cfg = lambda { |fp,srv|

openssl sha1 on OpenBSD doesn't just spit out the hash:

$ openssl sha1 mocha.diff
SHA1(mocha.diff)= 4ea47d3cf9e4f1858a298a8a9f5a5671422971d5
$ sha1 -q mocha.diff
4ea47d3cf9e4f1858a298a8a9f5a5671422971d5

--- t/test-lib.sh.orig Tue Nov 15 19:12:25 2011
+++ t/test-lib.sh Tue Nov 15 19:38:05 2011
@@ -101,6 +101,7 @@ unicorn_wait_start () {
 rsha1 () {
 	_cmd="$(which sha1sum 2>/dev/null || :)"
+	test -n "$_cmd" || _cmd="$(which sha1 2>/dev/null || :) -q"
 	test -n "$_cmd" || _cmd="$(which openssl 2>/dev/null || :) sha1"
 	test "$_cmd" != " sha1" || _cmd="$(which gsha1sum 2>/dev/null || :)"

You can listen on 0.0.0.0, but trying to connect to it doesn't work well on OpenBSD.

--- test/test_helper.rb.orig Tue Nov 15 20:43:39 2011
+++ test/test_helper.rb Tue Nov 15 20:46:17 2011
@@ -72,6 +72,7 @@ def hit(uris)
     res = nil
     if u.kind_of? String
+      u = 'http://127.0.0.1:8080/' if u == 'http://0.0.0.0:8080/'
       res = Net::HTTP.get(URI.parse(u))
     else
       url = URI.parse(u[0])
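The test_helper change above hard-codes one URL. A slightly more general variant might rewrite only the host and leave the port and path alone. The helper name `connectable_uri` is made up for illustration and is not in test_helper.rb; this is a sketch of the idea, not the fix that was applied.

```ruby
require 'uri'

# Hypothetical helper: a client cannot reliably connect to the wildcard
# address 0.0.0.0 on every OS, so map it to loopback and leave the rest
# of the URI untouched.
def connectable_uri(str)
  uri = URI.parse(str)
  uri.host = '127.0.0.1' if uri.host == '0.0.0.0'
  uri
end

wildcard = connectable_uri('http://0.0.0.0:8080/').to_s
other    = connectable_uri('http://192.0.2.10:9000/x').to_s
```

Only the wildcard host is rewritten, so URLs that already name a concrete address pass through unchanged.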
Jeremy Evans
2011-Nov-15 21:19 UTC
Fix hang when running tests on OpenBSD by skipping two tests
On Tue, Nov 15, 2011 at 9:03 PM, Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> openssl sha1 on OpenBSD doesn't just spit out the hash:
>
> $ openssl sha1 mocha.diff
> SHA1(mocha.diff)= 4ea47d3cf9e4f1858a298a8a9f5a5671422971d5
> $ sha1 -q mocha.diff
> 4ea47d3cf9e4f1858a298a8a9f5a5671422971d5

Here's a fix for t/test-lib.sh that handles unix socket paths when you are running the regression tests in a directory with a long name. I didn't catch this earlier because the unix socket test only fails on the ruby 1.9 port on my system, since the path name is a little longer for the ruby19 flavor than the unflavored ruby 1.8 port.

You probably want to use more robust name mangling if you choose to fix this. My fix is simple, but has corner cases where it breaks that could be problematic.

Jeremy

--- t/test-lib.sh.orig Thu Jan 1 01:00:00 1970
+++ t/test-lib.sh Tue Nov 15 22:02:59 2011
@@ -38,20 +38,24 @@ rtmpfiles () {
 	for id in "$@"
 	do
 		name=$id
-		_tmp=$t_pfx.$id
-		eval "$id=$_tmp"
 
 		case $name in
 		*fifo)
+			_tmp=$t_pfx.$id
+			eval "$id=$_tmp"
 			rm -f $_tmp
 			mkfifo $_tmp
 			T_RM_LIST="$T_RM_LIST $_tmp"
 			;;
 		*socket)
+			_tmp=$(echo "$t_pfx.$id" | $RUBY -e 'print $stdin.read(103)')
+			eval "$id=$_tmp"
 			rm -f $_tmp
 			T_RM_LIST="$T_RM_LIST $_tmp"
 			;;
 		*)
+			_tmp=$t_pfx.$id
+			eval "$id=$_tmp"
 			> $_tmp
 			T_OK_RM_LIST="$T_OK_RM_LIST $_tmp"
 			;;
@@ -101,6 +105,7 @@ unicorn_wait_start () {
 rsha1 () {
 	_cmd="$(which sha1sum 2>/dev/null || :)"
+	test -n "$_cmd" || _cmd="$(which sha1 2>/dev/null || :) -q"
 	test -n "$_cmd" || _cmd="$(which openssl 2>/dev/null || :) sha1"
 	test "$_cmd" != " sha1" || _cmd="$(which gsha1sum 2>/dev/null || :)"
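One possible shape for the "more robust name mangling" mentioned above: instead of truncating (where two long paths sharing a 103-byte prefix would collide), derive a short name from a digest of the full path. This is only a sketch. `short_socket_path` and `MAX_SOCKET_PATH` are hypothetical names, the 103-byte figure is taken from the patch above rather than queried from the OS, and the real limit is platform-specific.

```ruby
require 'digest/sha1'

# Assumed limit, taken from the patch above; the real value depends on the OS.
MAX_SOCKET_PATH = 103

# Hypothetical helper: keep short paths as they are, and replace overly long
# ones with a short name in `dir` derived from a digest of the original path.
def short_socket_path(path, dir = '/tmp')
  return path if path.bytesize <= MAX_SOCKET_PATH
  File.join(dir, "sock-#{Digest::SHA1.hexdigest(path)[0, 16]}")
end

short  = short_socket_path('/tmp/test.socket')
long_a = short_socket_path('/' + 'a' * 200 + '/one.socket')
long_b = short_socket_path('/' + 'a' * 200 + '/two.socket')
```

Because the short name depends on the whole original path, two long paths that differ only at the end still map to different sockets, which plain truncation cannot guarantee.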
Eric Wong
2011-Nov-15 22:36 UTC
Fix hang when running tests on OpenBSD by skipping two tests
Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> I assume you aren't purposely testing a large timeout here, so
> hopefully this change is fine. The original code caused an infinite
> loop on OpenBSD, and also taking up all available space on the file
> system if you let it run long enough because it wrote to the log
> inside the loop. On 1.8.7:

I expected "timeout 0x7fffffff" to be fine (I sometimes use it :x)...

> E, [2011-11-15T18:55:34.616397 #11092] ERROR -- : master loop error:
> time + 2147483646.000000 out of Time range (RangeError)
> E, [2011-11-15T18:55:34.616538 #11092] ERROR -- :
> /usr/obj/ports/unicorn-4.1.1/unicorn-4.1.1/t/../test/ruby-1.8.7/lib/unicorn/http_server.rb:264:in

Odd, which patchlevel of 1.8.7 is this? (more of a curiosity at this point, see below)

> On 1.9.3:
>
> E, [2011-11-15T19:00:20.464234 #13442] ERROR -- : listen loop error:
> Invalid argument (Errno::EINVAL)
> E, [2011-11-15T19:00:20.464327 #13442] ERROR -- :
> /usr/local/lib/ruby/gems/1.8/gems/unicorn-4.1.1/lib/unicorn/http_server.rb:620:in
> `select'

OK, I need to read POSIX manpages more often, not just the Linux ones.

From the select() POSIX manpage:

    All implementations shall support a maximum timeout interval of at
    least 31 days.

I think I'll just force Unicorn::Configurator to limit that to 30 days. But, the POSIX manpage also states:

    If the timeout argument specifies a timeout interval greater than
    the implementation-defined maximum value, the maximum value shall
    be used as the actual timeout value.

So isn't OpenBSD wrong in giving EINVAL here? Unless somehow the large time_t was interpreted as a negative value...

> openssl sha1 on OpenBSD doesn't just spit out the hash:
>
> $ openssl sha1 mocha.diff
> SHA1(mocha.diff)= 4ea47d3cf9e4f1858a298a8a9f5a5671422971d5

This happens on newer openssl under Debian, too. Maybe just switch to sha1sum.rb since I'm more comfortable with Ruby 1.9 now than I was in 2009.
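Limiting the timeout to 30 days, as proposed above, could be as small as the following. This is a sketch only: `MAX_TIMEOUT` and `clamp_timeout` are made-up names and this is not the code in Unicorn::Configurator.

```ruby
# Hypothetical constant: 30 days in seconds, safely under the 31-day
# minimum that POSIX requires select() implementations to support.
MAX_TIMEOUT = 30 * 24 * 60 * 60

# Hypothetical helper: cap a configured timeout at MAX_TIMEOUT.
def clamp_timeout(seconds)
  [seconds, MAX_TIMEOUT].min
end

huge   = clamp_timeout(0x7fffffff)  # the value from t0012
normal = clamp_timeout(60)
```

Capping the value, rather than rejecting it, keeps existing configs with very large timeouts working while staying under the interval every implementation must accept.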
Eric Wong
2011-Nov-16 01:18 UTC
Fix hang when running tests on OpenBSD by skipping two tests
Jeremy Evans <jeremyevans0 at gmail.com> wrote:
> On Tue, Nov 15, 2011 at 4:17 AM, Eric Wong <normalperson at yhbt.net> wrote:
> > I usually prefer to work on each problem, one-at-a-time. However,
> > GNU make already has a handy -k/--keep-going flag to ignore failures.
>
> Thanks, I didn't know about that, and it is much easier than patching
> make files.
>
> I think I've fixed all the issues that caused test failures on
> OpenBSD. All changes are in the test code itself. Hope this helps.

OK, I think I've pushed relevant fixes up to master of unicorn.git (commit fbcf6aa641e5827da48a3b6776c9897de123b405)

Eric Wong (3):
      configurator: limit timeout to 30 days
      tests: just use the sha1sum implemented in Ruby
      tests: try to set a shorter path for Unix domain sockets

Jeremy Evans (2):
      t0011: fix test under OpenBSD
      test_helper: ensure test client connects to valid address

> Sorry if gmail mangles these diffs.

No worries, patch(1) is very lenient.

Just wondering, do most folks have/lack decent SMTP setups nowadays? (especially on servers they don't usually work from)

When working without a VCS repo (live fixes on servers :x), I'll sometimes just send a patch out like this:

  diff -u a b | mail -s diff-a-b a at example.com

This is a big reason I prefer no-subscription-required mailing lists.

If it's to a public mailing list, I'll always email myself first to clean up the Received: trail and also to rewrite my email address so it doesn't have @evil-corporation.com in it :)
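For readers wondering what a sha1sum "implemented in Ruby" might involve, here is a minimal sketch using the standard `digest/sha1` library. It is not the sha1sum.rb shipped with unicorn, and `sha1_hex` is a hypothetical name. The point is that the output is just the hex digest, independent of whichever sha1sum, sha1, or openssl binary is installed.

```ruby
require 'digest/sha1'
require 'stringio'

# Hypothetical helper: hash an IO in chunks so large inputs are not read
# into memory all at once, and return only the hex digest.
def sha1_hex(io)
  digest = Digest::SHA1.new
  while (chunk = io.read(16_384))
    digest.update(chunk)
  end
  digest.hexdigest
end

abc   = sha1_hex(StringIO.new('abc'))
empty = sha1_hex(StringIO.new(''))
```

The same helper works on `$stdin` or an open `File`, since it only relies on `read` returning nil at end of input.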