Hi R Developers,

Greg is helping me with debugging R on Solaris 10 x64. Please let us
know if you have any thoughts or tips that can help us debug this.

Thanks,

David

************
Using default transfer plist
in vector_io: permuting
About to write

 *** caught segfault ***
address e8554000, cause 'memory not mapped'

Traceback:
 1: .External("do_hdf5save", call, sys.frame(sys.parent()), fileout,
    ..., PACKAGE = "hdf5")
 2: hdf5save(hdf5_Fstat, "Fstat", "geneNames", "genotype")
aborting ...
************

We've tried many things to debug it:

* dbx Runtime Checking (RTC) is not detecting any (meaningful) memory
  access problems that I can see.

* The same on Solaris/SPARC.

* Neither does Valgrind on Linux.

* I've tried increasing the C stack size, assuming R could be running
  out of stack size. Didn't help.

Running R under dbx (without RTC) until the crash shows this:

...
About to write
t@1 (l@1) signal SEGV (no mapping at the fault address) in _memcpy at 0xfe90444b
0xfe90444b: _memcpy+0x006b:     movaps   0x00000000(%esi),%xmm0
Current function is H5D_select_mgath
  379           HDmemcpy(tgath_buf,buf+off[curr_seq],curr_len);
(dbx) where
current thread: t@1
  [1] _memcpy(0x0, 0xfdebc707, 0x9f5c4f0), at 0xfe90444b
=>[2] H5D_select_mgath(_buf = 0x9f79580, space = 0x8966770,
      iter = 0x8045980, nelmts = 3120U, dxpl_cache = 0xfe170078,
      _tgath_buf = 0x9f5c4f0), line 379 in "H5Dselect.c"
  [3] H5D_contig_write(io_info = 0x804620c, nelmts = 3120ULL,
      mem_type = 0x97b05c8, mem_space = 0x8966770, file_space = 0x8966770,
      tpath = 0x8ee7078, src_id = 201326906, dst_id = 201326904,
      buf = 0x9f79580), line 1418 in "H5Dio.c"
  [4] H5D_write(dataset = 0x8f169c0, mem_type_id = 201326906,
      mem_space = 0x8966770, file_space = 0x8966770, dxpl_id = 671088643,
      buf = 0x9f79580), line 952 in "H5Dio.c"
  [5] H5Dwrite(dset_id = 335544330, mem_type_id = 201326906,
      mem_space_id = 0, file_space_id = 0, plist_id = 671088643,
      buf = 0x9f79580), line 586 in "H5Dio.c"
  [6] vector_io(call = 0x97234ec, writeflag = 1, dataset = 335544330,
      space = 268435472, obj = 0x98386a0), line 535 in "hdf5.c"
  [7] hdf5_write_vector(call = 0x97234ec, id = 67108867,
      symname = 0x9cf35d0 "geneNames", val = 0x98386a0), line 693 in "hdf5.c"
  [8] hdf5_save_object(call = 0x97234ec, fid = 67108867,
      symname = 0x9cf35d0 "geneNames", val = 0x98386a0), line 957 in "hdf5.c"
  [9] do_hdf5save(args = 0x9723284), line 1104 in "hdf5.c"
  [10] do_External(call = 0x86d62bc, op = 0x8371cd8, args = 0x972340c,
      env = 0x9723594), line 832 in "dotcode.c"
  [11] Rf_eval(e = 0x86d62bc, rho = 0x9723594), line 445 in "eval.c"
  [12] Rf_evalList(el = 0x86d6230, rho = 0x9723594, op = 0x837226c),
      line 1463 in "eval.c"
  [13] Rf_eval(e = 0x86d6214, rho = 0x9723594), line 438 in "eval.c"
  [14] do_begin(call = 0x86d56bc, op = 0x836709c, args = 0x86d61dc,
      rho = 0x9723594), line 1107 in "eval.c"
  [15] Rf_eval(e = 0x86d56bc, rho = 0x9723594), line 431 in "eval.c"
  [16] Rf_applyClosure(call = 0x9723738, op = 0x83c0328,
      arglist = 0x97236e4, rho = 0x8379b1c, suppliedenv = 0x8379b38),
      line 614 in "eval.c"
  [17] Rf_eval(e = 0x9723738, rho = 0x8379b1c), line 455 in "eval.c"
  [18] Rf_ReplIteration(rho = 0x8379b1c, savestack = 0, browselevel = 0,
      state = 0x8047328), line 256 in "main.c"
  [19] R_ReplConsole(rho = 0x8379b1c, savestack = 0, browselevel = 0),
      line 305 in "main.c"
  [20] run_Rmainloop(), line 944 in "main.c"
  [21] Rf_mainloop(), line 951 in "main.c"
  [22] main(ac = 4, av = 0x80477ac), line 33 in "Rmain.c"
(dbx) p curr_len
curr_len = 24960U
(dbx) p curr_seq
curr_seq = 0
(dbx) p of
dbx: "of" is not defined in the scope
`libhdf5.so.0.0.0`H5Dselect.c`H5D_select_mgath:347`
dbx: see `help scope' for details
(dbx) p off
off = 0x8042960
(dbx) p tgath_buf
tgath_buf = 0x9f5c4f0
"\xd87\x83^H\xa8\xf3\x82^H0^X\x82^H^X\xd4\x81^H^P\x90\x81^H\xb8m\x80^H^H'\x80^H\x88^?^?^H\x908^?^H\xb0\xf7~^H\xd8\xad~^H\xf8\xb2~^H\xb8\x8e~^H\xe8]~^H\xe8\xcb\xed^HP\xe3}^Hh\xdd\xbb^H\x98\xc4}^H\xf0\xa0}^H\xa8r}^HH}\xc3^HpO|^HH^V|^H^X\xd8|^H\xc0\xb1|^H8=}^H\x90\xcd{^H^Pm{^H\xb8#{^Hx'{^H\x90\xf8x^HpKx^H^POx^H\xa8~w^H^H>w^H\xf0\xb2w^H\xc8^Ew^HX'x^H\xf8\xdbv^H"
(dbx) p buf
buf = 0x9f79580
"\xd87\x83^H\xa8\xf3\x82^H0^X\x82^H^X\xd4\x81^H^P\x90\x81^H\xb8m\x80^H^H'\x80^H\x88^?^?^H\x908^?^H\xb0\xf7~^H\xd8\xad~^H\xf8\xb2~^H\xb8\x8e~^H\xe8]~^H\xe8\xcb\xed^HP\xe3}^Hh\xdd\xbb^H\x98\xc4}^H\xf0\xa0}^H\xa8r}^HH}\xc3^HpO|^HH^V|^H^X\xd8|^H\xc0\xb1|^H8=}^H\x90\xcd{^H^Pm{^H\xb8#{^Hx'{^H\x90\xf8x^HpKx^H^POx^H\xa8~w^H^H>w^H\xf0\xb2w^H\xc8^Ew^HX'x^H\xf8\xdbv^H"
(dbx) p nseq
nseq = 1U
(dbx) p len
len = 0x804195c
(dbx) p len[0..2]
len[0..2]
[0] = 24960U
[1] = 140025512U
[2] = 140013048U
(dbx)

The code in question (H5Dselect.c, around line 379) is:

...
    /* Loop, while sequences left to process */
    for(curr_seq=0; curr_seq<nseq; curr_seq++) {
        /* Get the number of bytes in sequence */
        curr_len=len[curr_seq];

        HDmemcpy(tgath_buf,buf+off[curr_seq],curr_len);

        /* Advance offset in gather buffer */
        tgath_buf+=curr_len;
    } /* end for */
...

where

./src/hdf5-1.6.5/src/H5private.h:
#define HDmemcpy(X,Y,Z) memcpy((char*)(X),(const char*)(Y),Z)

Maybe the "curr_len = 24960U" value is too high. I have no way of
knowing what it should be in this case.

The crash could be caused by a compiler bug, although it's not very
likely. These crashes have occurred both with and without optimization,
with and without -g.
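One way to get a handle on whether "curr_len = 24960U" is plausible is to relate it to the element count in the same trace: nelmts is 3120 and nseq is 1, so the single memcpy covers 24960 / 3120 = 8 bytes per element. The sketch below only does that arithmetic; bytes_per_element is a hypothetical helper, not part of HDF5 or the hdf5 package, and the trace by itself does not say whether 8 bytes is the element size the caller's buffer was actually allocated with.

```c
#include <stddef.h>

/* Hypothetical helper (not part of HDF5): given the total number of bytes a
 * single gather sequence copies and the number of elements it covers, return
 * the implied bytes per element, or 0 if the byte count is not an exact
 * multiple of the element count (or the element count is zero). */
size_t bytes_per_element(size_t total_bytes, size_t nelmts)
{
    if (nelmts == 0 || total_bytes % nelmts != 0)
        return 0;
    return total_bytes / nelmts;
}
```

With the values from the dbx session, bytes_per_element(24960, 3120) gives 8. Since the pointers in the trace are 32 bits wide, comparing that figure against the per-element size of the buffer passed in as buf would be a natural next check.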
What did the maintainer of this unmentioned contributed package (hdf5)
say when you asked him? [Hint: you *have* read the posting guide at
http://www.r-project.org/posting-guide.html and done as it asks?]

There is no evidence here that this is anything to do with R itself.

On Fri, 13 Apr 2007, Tai-Wei (David) Lin wrote:

> Hi R Developers,
>
> Greg is helping me with debugging R on Solaris 10 x64. Please let us
> know if you have any thoughts or tips that can help us debug this.
>
> [...]
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Hi David,

Tai-Wei (David) Lin wrote:
> [...]
>     /* Loop, while sequences left to process */
>     for(curr_seq=0; curr_seq<nseq; curr_seq++) {
>         /* Get the number of bytes in sequence */
>         curr_len=len[curr_seq];
>
>         HDmemcpy(tgath_buf,buf+off[curr_seq],curr_len);
>
>         /* Advance offset in gather buffer */
>         tgath_buf+=curr_len;
>     } /* end for */
> ...

What's the initial size of tgath_buf? You need to make sure that you
are not stepping out of it, i.e. that sum(len[i], for 0<=i<nseq) is not
greater than its initial size. That's for the writing side.

Same on the reading side: you need to make sure that
buf+off[i]+len[i]-1 is a safe place to be for any 0<=i<nseq.

Otherwise, expect bad things to happen. And they are generally not
reproducible in a consistent way. So even if this code never crashes
on other systems, it doesn't mean that it is not broken.

Cheers,
H.

> [...]
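
The two conditions in the reply above (the sum of the len[i] must fit in the destination, and every off[i]+len[i] must stay inside the source) can be written down as a checked version of the gather loop. This is only a sketch under assumptions: checked_gather and its dst_size and src_size parameters are hypothetical, since the HDF5 loop quoted in this thread does not carry the buffer sizes with it, and a caller would have to supply them from wherever the buffers were allocated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of HDF5: copy nseq (offset, length)
 * sequences from src into consecutive positions of dst, but only after
 * verifying that no sequence reads past src_size or writes past dst_size.
 * Returns 0 on success and -1 (copying nothing) if any check fails. */
int checked_gather(unsigned char *dst, size_t dst_size,
                   const unsigned char *src, size_t src_size,
                   const size_t *off, const size_t *len, size_t nseq)
{
    size_t total = 0;   /* bytes the gather would write so far */
    size_t i;

    /* Pass 1: validate every sequence before touching dst. */
    for (i = 0; i < nseq; i++) {
        /* Reading side: off[i] + len[i] must not exceed src_size.
         * Written as two comparisons to avoid overflow in the sum. */
        if (off[i] > src_size || len[i] > src_size - off[i])
            return -1;
        /* Writing side: running total of len[i] must fit in dst. */
        if (len[i] > dst_size - total)
            return -1;
        total += len[i];
    }

    /* Pass 2: the same copy the original loop performs. */
    total = 0;
    for (i = 0; i < nseq; i++) {
        memcpy(dst + total, src + off[i], len[i]);
        total += len[i];
    }
    return 0;
}
```

Called with the sizes the buffers were really allocated with, a return value of -1 at the point of the crash would indicate that len[0] or off[0] disagrees with one of the buffers, which is the situation the reply warns about.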