Hi,

(Unable to register on bug tracker, so writing here.)

While building the package for ALT Linux I found that when the tests are run on
i686 (32-bit x86 flavour) with 'make check' for the 1.4.21 release, they
fail with:

1 of 4 tests failed

this is

./apitest backend inmemory: 326 tests passed, 4 failed, 8 skipped.
./apitest backend glass: 423 tests passed, 4 failed, 3 skipped.
./apitest backend singlefile_glass: 261 tests passed, 4 failed, 1 skipped.
./apitest backend multi_glass: 351 tests passed, 2 failed, 4 skipped.
./apitest backend chert: 419 tests passed, 4 failed, 1 expected failures, 3 skipped.
./apitest backend multi_chert: 303 tests passed, 2 failed, 1 expected failures, 4 skipped.
./apitest total: 4036 tests passed, 20 failed, 9 expected failures, 47 skipped.
FAIL: apitest

Detailed failure list:

builder@i586:~/RPM/BUILD/xapian-core-1.4.21$ grep FAILED log -A3
Running test: checkstatsweight1... FAILED
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight2... FAILED
Query((a SYNONYM absolut))
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight3... FAILED
Query(WILDCARD SYNONYM a)
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight4... FAILED
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight1... FAILED
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight2... FAILED
Query((a SYNONYM absolut))
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight3... FAILED
Query(WILDCARD SYNONYM a)
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight4... FAILED
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight1... FAILED
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight2... FAILED
Query((a SYNONYM absolut))
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight3... FAILED
Query(WILDCARD SYNONYM a)
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight4... FAILED
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight1... FAILED
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight2... FAILED
Query((a SYNONYM absolut))
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight1... FAILED
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight2... FAILED
Query((a SYNONYM absolut))
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight3... FAILED
Query(WILDCARD SYNONYM a)
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight4... FAILED
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight1... FAILED
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333
--
Running test: checkstatsweight2... FAILED
Query((a SYNONYM absolut))
api_weight.cc:1013: ((get_average_length()) == (db.get_avlength()))
Expected 'get_average_length()' and 'db.get_avlength()' to be equal: were 30.8333 and 30.8333

(Empty lines removed.) On the other architectures we build for (aarch64,
arm7hf, ppc64le, x86_64) the tests succeed. Would appreciate help resolving this.

Thanks,

On Sun, Oct 23, 2022 at 03:11:07PM +0300, Vitaly Chikunov wrote:
> (Unable to register on bug tracker, so writing here.)
> While building package for ALT Linux I found that when tests are run on
> i686 (32-bit x86 flavour) with 'make check' for 1.4.21 release, they
> fail with:
> [...]
> (Empty lines removed.) On other architectures we build tests succeed (aarch64,
> arm7hf, ppc64le, x86_64). Would appreciate help resolving this.

To add, this is GCC 12.1.1.
As a debugging exercise I printed the two numbers and they looked exactly
the same, but subtracting one from the other produces a value > 0. AFAIK
'==' comparison of doubles is an unsafe operation in general (GCC has a
warning about it, -Wfloat-equal). As a dirty hack I explicitly cast these
doubles to (float) like this:

  TEST_EQUAL((float)get_average_length(), (float)db.get_avlength());

This of course is not a solution.

Thanks,

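A small C++ sketch of the observation above, for illustration only: two doubles that differ in the last bit print identically at the default stream precision of 6 significant digits (which is presumably why the failure message shows "30.8333 and 30.8333"), while printing with max_digits10 digits shows the difference. The helper names (format_default, format_full, nearly_equal) and the tolerance are hypothetical and are not part of Xapian or its test harness.

```cpp
#include <algorithm>
#include <cmath>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Format a double with the default stream precision (6 significant digits).
std::string format_default(double v) {
    std::ostringstream os;
    os << v;
    return os.str();
}

// Format a double with max_digits10 significant digits, which is enough
// for two different doubles to always produce different text.
std::string format_full(double v) {
    std::ostringstream os;
    os << std::setprecision(std::numeric_limits<double>::max_digits10) << v;
    return os.str();
}

// Hypothetical tolerance-based comparison: true if a and b agree to within
// rel_tol relative to the larger magnitude. The default tolerance is an
// arbitrary choice for this sketch.
bool nearly_equal(double a, double b, double rel_tol = 1e-12) {
    double scale = std::max(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= rel_tol * scale;
}
```

For example, with a = 185.0 / 6.0 and b = std::nextafter(a, 100.0), format_default gives the same text for both, format_full gives different text, a == b is false, and nearly_equal(a, b) is true. Whether a tolerance is appropriate for this particular test is a separate question.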
On Sun, Oct 23, 2022 at 03:11:07PM +0300, Vitaly Chikunov wrote:
> (Unable to register on bug tracker, so writing here.)
> While building package for ALT Linux I found that when tests are run on
> i686 (32-bit x86 flavour) with 'make check' for 1.4.21 release, they
> fail with:

I can only reproduce your problem if I configure with --disable-sse, which
is not a recommended configuration because on x86 without SSE, we get 387
FP's excess precision inflicted on us.

There are hacks in the code to work around the worst problems that causes
(such as segfaults due to undefined behaviour caused by not being able to
reliably even use an FP comparison in a sort comparison function) but more
subtle problems remain and we don't recommend using --disable-sse.
Fundamentally it's hard to order results correctly by an FP value when the
same equation evaluated with the same inputs in two places in the code is
able to give different answers depending on which intermediate results are
spilled to memory.

If you're using GCC, -ffloat-store mostly avoids the problem, but at a
significant performance cost because FP values get written to memory
everywhere rather than carried in registers when possible, and that cost is
paid by everyone using the package on i686 when most of them have a
perfectly capable FP unit.

GCC's -fexcess-precision=standard option would provide a solution with less
overhead, but sadly it's only been implemented for C.

I'm not sure what the ALT Linux baseline for i686 is, but if you really
need to build binary packages which will run on processors without SSE, I'd
strongly recommend the approach described in the last entry here:

https://trac.xapian.org/wiki/PackagingXapian

That way most x86 users get a build using SSE (i.e. one which works
properly), and the problems excess precision causes will only affect the
dwindling number of users who really have an x86 old enough not to have SSE.

> (Empty lines removed.) On other architectures we build tests succeed (aarch64,
> arm7hf, ppc64le, x86_64). Would appreciate help resolving this.

If you insist on using --disable-sse, the simplest solution is to not run
the testsuite. (The purpose of the testsuite is to find bugs; an effect of
--disable-sse is essentially to introduce bugs...)

Cheers,
    Olly
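The excess-precision effect described in the reply can be imitated on other platforms with a rough sketch like the one below. It is an illustration under stated assumptions, not Xapian's code: long double stands in for a value held in a 387 register (this assumes long double is wider than double, as it is with GCC on x86), the conversion to double stands in for a spill to memory, and the function names are hypothetical. The inputs used in the note after the block, 185 and 6, are illustrative values that happen to give 30.8333; they are not taken from the test database.

```cpp
#include <limits>

// Stand-in for a value kept in a 387 register at extended precision.
using Extended = long double;

// True when long double really carries more precision than double;
// without that, the sketch cannot show any difference.
constexpr bool extended_is_wider =
    std::numeric_limits<Extended>::digits > std::numeric_limits<double>::digits;

// "Kept in a register": the quotient stays at extended precision.
Extended average_in_register(double total_length, double doc_count) {
    return static_cast<Extended>(total_length) / static_cast<Extended>(doc_count);
}

// "Spilled to memory": the same quotient rounded to a 64-bit double.
double average_spilled(double total_length, double doc_count) {
    return static_cast<double>(average_in_register(total_length, doc_count));
}

// Compare the two results of the same equation with the same inputs.
bool same_result(double total_length, double doc_count) {
    return average_in_register(total_length, doc_count) ==
           static_cast<Extended>(average_spilled(total_length, doc_count));
}
```

Where extended_is_wider is true, same_result(185.0, 6.0) is false because the quotient is not exactly representable and is rounded differently at the two widths, while same_result(30.0, 6.0) is true because 5 is exact at any width.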