Hi,

My current code is tripping the following assertion:

lib/libzpool/build-kernel/arc.c:736: arc_change_state: Assertion
`new_state->size + to_delta >= new_state->lsize (0x2a60000 >= 0x2a64000)`
failed.

gdb info:

Program terminated with signal 6, Aborted.
#0  0x00002afcd767847b in raise () from /lib/libc.so.6
(gdb) bt
#0  0x00002afcd767847b in raise () from /lib/libc.so.6
#1  0x00002afcd7679da0 in abort () from /lib/libc.so.6
#2  0x0000000000454dff in arc_change_state (new_state=0x591aa0, ab=0x2aaabe2930c0, hash_lock=<value optimized out>) at lib/libzpool/build-kernel/arc.c:735
#3  0x0000000000457f32 in arc_access (buf=0x2aaabe2930c0, hash_lock=0x592c30) at lib/libzpool/build-kernel/arc.c:1637
#4  0x0000000000458ff9 in arc_read_done (zio=0x2aaabcfa4ed0) at lib/libzpool/build-kernel/arc.c:1850
#5  0x000000000044fb9f in zio_done (zio=0x2aaabcfa4ed0) at lib/libzpool/build-kernel/zio.c:868
#6  0x00000000004527f0 in zio_vdev_io_assess (zio=0x2aaabcfa4ed0) at lib/libzpool/build-kernel/zio.c:1491
#7  0x0000000000466ecf in taskq_thread (arg=<value optimized out>) at lib/libsolkerncompat/taskq.c:160
#8  0x00002afcd74273ca in start_thread () from /lib/libpthread.so.0
#9  0x00002afcd771555d in clone () from /lib/libc.so.6
#10 0x0000000000000000 in ?? ()
(gdb) frame 2
#2  0x0000000000454dff in arc_change_state (new_state=0x591aa0, ab=0x2aaabe2930c0, hash_lock=<value optimized out>) at lib/libzpool/build-kernel/arc.c:735
735             ASSERT3U(new_state->size + to_delta, >=,
(gdb) print new_state->size
$1 = 44695552
(gdb) print to_delta
$2 = 131072
(gdb) print new_state->lsize
$3 = 44449792

My code is synced to the ON Mercurial repository onnv_56 tag, with some minor changes in arc.c (diff attached).

Do you have any idea what this might be?

Thanks.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: zfs-fuse-arc.diff
Type: text/x-diff
Size: 2559 bytes
Desc: not available
URL: <http://mail.opensolaris.org/pipermail/zfs-code/attachments/20070123/05ca882a/attachment.bin>

On Tuesday 23 January 2007 19:01, Ricardo Correia wrote:
> My current code is tripping the following assertion:
> lib/libzpool/build-kernel/arc.c:736: arc_change_state: Assertion
> `new_state->size + to_delta >= new_state->lsize (0x2a60000 >= 0x2a64000)`
> failed.

(snip)

> (gdb) print new_state->size
> $1 = 44695552
> (gdb) print to_delta
> $2 = 131072
> (gdb) print new_state->lsize
> $3 = 44449792

I've just noticed how the "new_state->size + to_delta" value from the crash dump is different from the assertion.

I guess it might be a race condition, since I've just implemented multithreaded operation, but I think I'm missing some locks that are done by the Solaris VFS.

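The mismatch can be checked by hand. The sketch below only redoes the arithmetic on the numbers quoted in this thread; the macro and function names are made up for illustration and do not exist in arc.c.

```c
#include <assert.h>
#include <stdint.h>

/* Values printed by the failed assertion. */
#define ASSERT_LHS UINT64_C(0x2a60000) /* new_state->size + to_delta */
#define ASSERT_RHS UINT64_C(0x2a64000) /* new_state->lsize */

/* Values read from the core dump with gdb. */
#define CORE_SIZE  UINT64_C(44695552)  /* new_state->size */
#define CORE_DELTA UINT64_C(131072)    /* to_delta */
#define CORE_LSIZE UINT64_C(44449792)  /* new_state->lsize */

/* Left-hand side of the assertion, recomputed from the core dump. */
static uint64_t core_lhs(void)
{
    return CORE_SIZE + CORE_DELTA;
}

/* How far the left-hand side moved between the assertion and the dump. */
static uint64_t lhs_drift(void)
{
    return core_lhs() - ASSERT_LHS;
}
```

With these numbers, lsize in the dump equals the right-hand side of the assertion (0x2a64000), while the recomputed left-hand side is 0x2ac0000 instead of 0x2a60000. The difference is 0x60000 (393216 bytes), exactly three times to_delta. That would be consistent with other threads adding to size after the assertion fired, although the dump alone does not prove it.
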
On Jan 23, 2007, at 11:28 AM, Ricardo Correia wrote:
> On Tuesday 23 January 2007 19:01, Ricardo Correia wrote:
>> My current code is tripping the following assertion:
>> lib/libzpool/build-kernel/arc.c:736: arc_change_state: Assertion
>> `new_state->size + to_delta >= new_state->lsize (0x2a60000 >=
>> 0x2a64000)`
>> failed.
>
> (snip)
>
>> (gdb) print new_state->size
>> $1 = 44695552
>> (gdb) print to_delta
>> $2 = 131072
>> (gdb) print new_state->lsize
>> $3 = 44449792
>
> I've just noticed how the "new_state->size + to_delta" value from
> the crash dump is different from the assertion.
>
> I guess it might be a race condition, since I've just implemented
> multithreaded operation, but I think I'm missing some locks that
> are done by the Solaris VFS.

Right, I would verify your locks are working correctly (especially make sure atomic_add_64() is truly atomic). Note, these locks are in the ARC, so they are not in the VFS.

eric

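One way to act on that suggestion is a small stress test. This is a minimal sketch, assuming a POSIX threads environment and a GCC or Clang compiler. The harness names (add64_fn, stress_add, builtin_add_64) are hypothetical. builtin_add_64() is only a stand-in built on a compiler builtin, and the ported atomic_add_64() would be passed in its place.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Same shape as the Solaris atomic_add_64() prototype. */
typedef void (*add64_fn)(volatile uint64_t *target, int64_t delta);

/* Stand-in implementation; swap in the routine under test. */
static void builtin_add_64(volatile uint64_t *target, int64_t delta)
{
    __atomic_fetch_add(target, (uint64_t)delta, __ATOMIC_SEQ_CST);
}

typedef struct {
    add64_fn add;               /* routine under test */
    volatile uint64_t *counter; /* shared by all workers */
    int iters;                  /* increments per worker */
} worker_arg_t;

static void *worker(void *p)
{
    worker_arg_t *w = p;
    for (int i = 0; i < w->iters; i++)
        w->add(w->counter, 1);
    return NULL;
}

/*
 * Runs nthreads workers (capped at 64) that each add 1 to a shared
 * counter iters times. Returns the final counter value, or UINT64_MAX
 * if not every thread could be started.
 */
static uint64_t stress_add(add64_fn add, int nthreads, int iters)
{
    pthread_t tids[64];
    volatile uint64_t counter = 0;
    worker_arg_t arg = { add, &counter, iters };
    int started = 0;

    if (nthreads > 64)
        nthreads = 64;
    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return started == nthreads ? counter : UINT64_MAX;
}
```

A result other than nthreads * iters shows that updates were lost. A matching result is only evidence, not proof, that the routine is atomic, so it is worth running with several thread counts on a multi-core machine.
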
On Wednesday 24 January 2007 00:04, eric kustarz wrote:
> Right, i would verify your locks are working correctly (especially
> make sure atomic_add_64() is truly atomic). Note, these locks are in
> the ARC - so they are not in the VFS.

Yes, atomic_add_64() should be truly atomic, since I've taken that (assembly) code from OpenSolaris :)

Although I have to ask. The atomic_add_64() itself is atomic, but couldn't the ab->b_state->lsize value change between the atomic_add_64() and the ASSERT3U()?

Unless the mutex is protecting this value. But then why would atomic_add_64() be needed? Now I'm confused. As you can probably see already, I have no clue about that piece of code.. :)

The locks I was referring to were the VOP_RWLOCK() locks in the VFS read() and write() syscalls, possibly some others as well that I still haven't implemented. I have to do a code review to see what's missing.

Thanks.

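A generic sketch may help with the question of how a mutex and an atomic add can coexist. It assumes one counter is only modified with a mutex held, while the other is updated from paths that do not take that mutex and therefore needs atomic operations. Whether arc.c divides the work this way is an assumption to check against the source. The demo_* names are hypothetical and this is not the ARC's code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cache state, not the arc.c structure. */
typedef struct {
    _Atomic uint64_t size; /* updated lock-free from any thread */
    uint64_t lsize;        /* only touched with mtx held */
    pthread_mutex_t mtx;
} demo_state_t;

static void demo_init(demo_state_t *s)
{
    atomic_init(&s->size, 0);
    s->lsize = 0;
    pthread_mutex_init(&s->mtx, NULL);
}

/* Accounts for a new buffer; no mutex needed, hence the atomic add. */
static void demo_grow(demo_state_t *s, uint64_t delta)
{
    atomic_fetch_add(&s->size, delta);
}

/*
 * Adds delta bytes to the list total and reports whether size >= lsize
 * held at that moment. The mutex keeps lsize stable during the check,
 * but size can still be changed concurrently by demo_grow().
 */
static bool demo_list_add(demo_state_t *s, uint64_t delta)
{
    pthread_mutex_lock(&s->mtx);
    s->lsize += delta;
    bool ok = atomic_load(&s->size) >= s->lsize;
    pthread_mutex_unlock(&s->mtx);
    return ok;
}
```

In this pattern the mutex protects only lsize. The check stays valid only if every code path orders its updates so that size never drops below lsize, which is the kind of ordering a missing or broken lock could violate.
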
Ricardo Correia wrote:
> On Wednesday 24 January 2007 00:04, eric kustarz wrote:
>> Right, i would verify your locks are working correctly (especially
>> make sure atomic_add_64() is truly atomic). Note, these locks are in
>> the ARC - so they are not in the VFS.
>
> Yes, atomic_add_64() should be truly atomic, since I've taken that (assembly)
> code from OpenSolaris :)
>
> Although I have to ask. The atomic_add_64() itself is atomic, but couldn't the
> ab->b_state->lsize value change between the atomic_add_64() and the
> ASSERT3U()?
>
> Unless the mutex is protecting this value. But then why would atomic_add_64()
> be needed? Now I'm confused. As you can probably see already, I have no clue
> about that piece of code.. :)
>
> The locks I was referring to were the VOP_RWLOCK() locks in the VFS read() and
> write() syscalls, possibly some others as well that I still haven't
> implemented. I have to do a code review to see what's missing.
>

Don't worry about VOP_RWLOCK/VOP_RWUNLOCK(), ZFS doesn't implement those vops. ZFS/ZPL uses its own locks to guarantee the proper POSIX semantics.

-Mark