I am running a multithreaded test program that calls the
Berkeley DB library. This test runs on many platforms.
Lately I have been trying it under CentOS on a hyperthreaded
Intel x86_64 machine.
I get a variety of failures, most of which appear to be caused
by bad addresses. For instance:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1178638688 (LWP 7957)]
0x0000002a956225af in __db_cursor_int (dbp=0x2f2f388, txn=0x0,
dbtype=DB_BTREE, root=0, is_opd=0, lockerid=0, dbcp=0x46408c20)
at ../dist/../db/db_am.c:68
68 if (dbtype == dbc->dbtype) {
(gdb) p dbc
$6 = (DBC *) 0x2f3ae5800000000
Note all the trailing zeros: the low 32 bits of the pointer are
zero. If we shift the address down by 32 bits:
(gdb) p *(DBC*) 0x2f3ae58
$7 = {dbp = 0x2f2f388, txn = 0x0,
  links = {tqe_next = 0x2f47ef8, tqe_prev = 0x2f3d4c8},
  rskey = 0x2f3ae90, rkey = 0x2f3aeb0, rdata = 0x2f3aed0,
  my_rskey = {data = 0x0, size = 0, ulen = 0, dlen = 0, doff = 0, flags = 0},
  my_rkey = {data = 0x0, size = 0, ulen = 0, dlen = 0, doff = 0, flags = 0},
  my_rdata = {data = 0x0, size = 0, ulen = 0, dlen = 0, doff = 0, flags = 0},
  lref = 0x2f2a290, locker = 19,
  lock_dbt = {data = 0x2f3af20, size = 28, ulen = 0, dlen = 0, doff = 0, flags = 0},
  lock = {pgno = 925,
    fileid = "X@\203\000\000?\000\000?\224E\205?\036\000\000\000\000\000",
    type = 3},
  mylock = {off = 0, ndx = 0, gen = 0, mode = DB_LOCK_NG},
  cl_id = 0, dbtype = DB_BTREE, internal = 0x2f39228,
  c_close = 0x2a956388a3 <__db_c_close_pp>,
  c_count = 0x2a95638a63 <__db_c_count_pp>,
  c_del = 0x2a95638ba7 <__db_c_del_pp>,
  c_dup = 0x2a95638ea3 <__db_c_dup_pp>,
  c_get = 0x2a95638fce <__db_c_get_pp>,
  c_pget = 0x2a95639746 <__db_c_pget_pp>,
  c_put = 0x2a956399f6 <__db_c_put_pp>,
  c_am_bulk = 0x2a9557fdd2 <__bam_bulk>,
  c_am_close = 0x2a9557dcc4 <__bam_c_close>,
  c_am_del = 0x2a9557ebd3 <__bam_c_del>,
  c_am_destroy = 0x2a9557e50e <__bam_c_destroy>,
  c_am_get = 0x2a9557f10a <__bam_c_get>,
  c_am_put = 0x2a95582040 <__bam_c_put>,
  c_am_writelock = 0x2a9558314f <__bam_c_writelock>, flags = 288}
This is the right address.
Sometimes this part of the code executes correctly and the
failure occurs elsewhere, often with an address that again
appears to have been shifted up by 32 bits.
Any ideas?