On Thu, Jun 09, 2022 at 03:20:02PM +0100, Daniel P. Berrang? wrote:> > + go test -count=1 -v > > === RUN Test010Load > > --- PASS: Test010Load (0.00s) > > === RUN TestAioBuffer > > --- PASS: TestAioBuffer (0.00s) > > === RUN TestAioBufferFree > > --- PASS: TestAioBufferFree (0.00s) > > === RUN TestAioBufferBytesAfterFree > > SIGABRT: abort > > PC=0x3fdf6f9bac m=0 sigcode=18446744073709551610 > > So suggesting TestAioBufferBytesAfterFree is as fault, but quite > odd as that test case is trivial and whle it allocates some > native memory it doesn't seem to write to it. Unless the problem > happened in an earlier test case and we have delayed detection ? > > I guess I'd try throwing darts at the wall by chopping out bits > of test code to see what makes it disappear. > > Perhaps also try swapping MakeAioBuffer with MakeAioBufferZero > in case pre-existing data into the C.malloc()d block is confusing > Go ?Interestingly if I remove libnbd_020_aio_buffer_test.go completely, and disable GODEBUG, then the tests pass. (Reproducer commands at end of email). So I guess at least one of the problems is confined to this test and/or functions it calls in the main library. Unfortunately this test is huge. At your suggestion, replacing every MakeAioBuffer with MakeAioBufferZero in that test, but it didn't help. Also tried replacing malloc -> calloc in the aio_buffer.go implementation which didn't help. I'll try some more random things ... Rich. $ emacs -nw run.in # comment out GODEBUG line $ emacs -nw golang/Makefile.am # remove libnbd_020_aio_buffer_test.go line $ mv golang/libnbd_020_aio_buffer_test.go golang/libnbd_020_aio_buffer_test.nogo $ make run $ make $ make -C golang check ... PASS: run-tests.sh ... -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW
On Thu, Jun 09, 2022 at 04:24:12PM +0100, Richard W.M. Jones wrote:> On Thu, Jun 09, 2022 at 03:20:02PM +0100, Daniel P. Berrang? wrote: > > > + go test -count=1 -v > > > === RUN Test010Load > > > --- PASS: Test010Load (0.00s) > > > === RUN TestAioBuffer > > > --- PASS: TestAioBuffer (0.00s) > > > === RUN TestAioBufferFree > > > --- PASS: TestAioBufferFree (0.00s) > > > === RUN TestAioBufferBytesAfterFree > > > SIGABRT: abort > > > PC=0x3fdf6f9bac m=0 sigcode=18446744073709551610 > > > > So suggesting TestAioBufferBytesAfterFree is as fault, but quite > > odd as that test case is trivial and whle it allocates some > > native memory it doesn't seem to write to it. Unless the problem > > happened in an earlier test case and we have delayed detection ? > > > > I guess I'd try throwing darts at the wall by chopping out bits > > of test code to see what makes it disappear. > > > > Perhaps also try swapping MakeAioBuffer with MakeAioBufferZero > > in case pre-existing data into the C.malloc()d block is confusing > > Go ? > > Interestingly if I remove libnbd_020_aio_buffer_test.go completely, > and disable GODEBUG, then the tests pass. (Reproducer commands at end > of email). So I guess at least one of the problems is confined to > this test and/or functions it calls in the main library. > Unfortunately this test is huge. > > At your suggestion, replacing every MakeAioBuffer with > MakeAioBufferZero in that test, but it didn't help. Also tried > replacing malloc -> calloc in the aio_buffer.go implementation which > didn't help. > > I'll try some more random things ...Adding a few printf's shows something interesting: === RUN TestAioBufferBytesAfterFree calling Free on 0x3fbc1882b0 calling C.GoBytes on 0x3fbc1882b0 SIGABRT: abort PC=0x3fe6aaebac m=0 sigcode=18446744073709551610 goroutine 21 [running]: gsignal :0 abort :0 runtime.throwException ../../../libgo/runtime/go-unwind.c:128 runtime.unwindStack ../../../libgo/go/runtime/panic.go:535 panic ../../../libgo/go/runtime/panic.go:750 runtime.panicmem ../../../libgo/go/runtime/panic.go:210 runtime.sigpanic ../../../libgo/go/runtime/signal_unix.go:634 _wordcopy_fwd_aligned :0 __GI_memmove :0 runtime.stringtoslicebyte ../../../libgo/go/runtime/string.go:155 __go_string_to_byte_array ../../../libgo/go/runtime/string.go:509 _cgo_23192bdcbd72_Cfunc_GoBytes ./cgo-c-prolog-gccgo:46 This is a simple use after free because the Free function in aio_buffer.go frees the array and then the Bytes function attempts to copy b.Size bytes from the NULL pointer. I didn't write this test so I'm not quite sure what it's trying to achieve. It seems to be deliberately trying to cause a panic, but causes a segfault instead? (And why only on RISC-V?) func TestAioBufferBytesAfterFree(t *testing.T) { buf := MakeAioBuffer(uint(32)) buf.Free() defer func() { if r := recover(); r == nil { t.Fatal("Did not recover from panic calling Bytes() after Free()") } }() buf.Bytes() } Since this only happens on RISC-V I guess it might be something to do with the golang implementation on this architecture being unable to turn segfaults into panics. Removing all three *AfterFree tests fixes the tests. It seems a bit of an odd function however. Wouldn't it be better to changes the Bytes function so that it tests if the pointer is NULL and panics? NB: this _does not_ address the other problem where GODEBUG=cgocheck=2 complains about "fatal error: Go pointer stored into non-Go memory". Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
On Thu, Jun 9, 2022 at 6:24 PM Richard W.M. Jones <rjones at redhat.com> wrote:> > On Thu, Jun 09, 2022 at 03:20:02PM +0100, Daniel P. Berrang? wrote: > > > + go test -count=1 -v > > > === RUN Test010Load > > > --- PASS: Test010Load (0.00s) > > > === RUN TestAioBuffer > > > --- PASS: TestAioBuffer (0.00s) > > > === RUN TestAioBufferFree > > > --- PASS: TestAioBufferFree (0.00s) > > > === RUN TestAioBufferBytesAfterFree > > > SIGABRT: abort > > > PC=0x3fdf6f9bac m=0 sigcode=18446744073709551610 > > > > So suggesting TestAioBufferBytesAfterFree is as fault, but quite > > odd as that test case is trivial and whle it allocates some > > native memory it doesn't seem to write to it. Unless the problem > > happened in an earlier test case and we have delayed detection ? > > > > I guess I'd try throwing darts at the wall by chopping out bits > > of test code to see what makes it disappear. > > > > Perhaps also try swapping MakeAioBuffer with MakeAioBufferZero > > in case pre-existing data into the C.malloc()d block is confusing > > Go ? > > Interestingly if I remove libnbd_020_aio_buffer_test.go completely, > and disable GODEBUG, then the tests pass. (Reproducer commands at end > of email). So I guess at least one of the problems is confined to > this test and/or functions it calls in the main library. > Unfortunately this test is huge. > > At your suggestion, replacing every MakeAioBuffer with > MakeAioBufferZero in that test, but it didn't help. Also tried > replacing malloc -> calloc in the aio_buffer.go implementation which > didn't help. > > I'll try some more random things ... > > Rich. > > > $ emacs -nw run.in # comment out GODEBUG line > $ emacs -nw golang/Makefile.am # remove libnbd_020_aio_buffer_test.go line > $ mv golang/libnbd_020_aio_buffer_test.go golang/libnbd_020_aio_buffer_test.nogo > $ make run > $ make > $ make -C golang check > ... > PASS: run-tests.shSo when skipping libnbd_020_aio_buffer_test.go we don't get the warning about Go pointer in C memory? If true, can you find the test triggering this issue? You can run only some tests using -run={regex}, for example: $ go test -v -run=TestAioBuffer.+AfterFree === RUN TestAioBufferBytesAfterFree --- PASS: TestAioBufferBytesAfterFree (0.00s) === RUN TestAioBufferSliceAfterFree --- PASS: TestAioBufferSliceAfterFree (0.00s) === RUN TestAioBufferGetAfterFree --- PASS: TestAioBufferGetAfterFree (0.00s) PASS