Hi,

I've recently been playing about with ext3 0.0.6b and lvm 0.9.1 beta7 and am now able to trigger an "Attempt to refile free buffer" assertion.

This seems to "only" occur when using ext3 on the root filesystem. Possibly that is related to the fact that the lvm utility I'm using to reproduce this problem is modifying data in /etc.

The easiest reproduction case I've come across to generate the assertion is to load up lvm (insmod if necessary) and then run:

    while /bin/true; do vgscan -v; done

Then wait a while and it will generate the assertion. You can speed it up a bit by running another instance of the above command. :)

Again, it doesn't seem to generate the problem when using ext2 on the root filesystem, even if I have ext3 in use on separate filesystems. Also, you do not need to have an LVM device actively mounted to generate this. In my case I have no active lvm devices up and running, just lvm-mod insmoded.

Using ext3 0.0.6b and LVM 0.9.1beta6 I can't generate the assertion.

Any ideas or thoughts most welcome. Anybody else seen any goofiness like this?

Assertion below:

ksymoops 2.3.4 on i686 2.2.18pre11.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.2.18pre11/ (default)
     -m /boot/System.map-2.2.18pre11 (specified)

Warning (compare_maps): ksyms_base symbol module_list_R__ver_module_list not found in System.map.
Ignoring ksyms_base entry

Unable to handle kernel NULL pointer dereference at virtual address 00000000
current->tss.cr3 = 043b4000, %cr3 = 043b4000
*pde = 043ae067
Oops: 0002
CPU:    0
EIP:    0010:[<c012c25f>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010296
eax: 0000001e   ebx: c572a7e0   ecx: 00000012   edx: 00000029
esi: c572a7e0   edi: c7f9cc80   ebp: 0000000a   esp: c43b3e3c
ds: 0018   es: 0018   ss: 0018
Process vgscan (pid: 7851, process nr: 58, stackpage=c43b3000)
Stack: c7f9cc80 c015f9bb c572a7e0 c572a7e0 c3d52200 c015fbae c7f9cc80 c3d52200
       c7f9cc80 c7f9ccb0 c7f9cccc 00000280 00000246 0000027f c539a4a0 c015f86d
       c7f9cc80 00000280 c5919540 00000280 c7f9cc80 00000000 c015c98b c7f9cc80
Call Trace: [<c015f9bb>] [<c015fbae>] [<c015f86d>] [<c015c98b>] [<c015caa3>] [<c0152986>] [<c0120176>]
       [<c0139638>] [<c0129067>] [<c01329bc>] [<c0129e40>] [<c012a08e>] [<c010a034>]
Code: c6 05 00 00 00 00 00 e9 a6 00 00 00 90 80 7e 29 01 76 0a bb

>>EIP; c012c25f <refile_buffer+17/d0>   <====
Trace; c015f9bb <cleanup_transaction+12b/158>
Trace; c015fbae <log_do_checkpoint+1c6/220>
Trace; c015f86d <log_wait_for_space+8d/b0>
Trace; c015c98b <start_this_handle+307/3a8>
Trace; c015caa3 <journal_start+77/a0>
Trace; c0152986 <ext3_notify_change+14a/360>
Trace; c0120176 <do_generic_file_read+8d2/8e0>
Trace; c0139638 <notify_change+40/64>
Trace; c0129067 <do_truncate+5f/ac>
Trace; c01329bc <open_namei+474/4d4>
Trace; c0129e40 <filp_open+44/f8>
Trace; c012a08e <sys_open+36/94>
Trace; c010a034 <system_call+34/38>
Code;  c012c25f <refile_buffer+17/d0>
00000000 <_EIP>:
Code;  c012c25f <refile_buffer+17/d0>   <====
   0:   c6 05 00 00 00 00 00      movb   $0x0,0x0   <====
Code;  c012c266 <refile_buffer+1e/d0>
   7:   e9 a6 00 00 00            jmp    b2 <_EIP+0xb2> c012c311 <refile_buffer+c9/d0>
Code;  c012c26b <refile_buffer+23/d0>
   c:   90                        nop
Code;  c012c26c <refile_buffer+24/d0>
   d:   80 7e 29 01               cmpb   $0x1,0x29(%esi)
Code;  c012c270 <refile_buffer+28/d0>
  11:   76 0a                     jbe    1d <_EIP+0x1d> c012c27c <refile_buffer+34/d0>
Code;  c012c272 <refile_buffer+2a/d0>
  13:   bb 00 00 00 00            mov    $0x0,%ebx

1 warning issued.  Results may not be reliable.
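
The "run another instance" trick from the report can be scripted. The following is only a minimal sketch of that idea: `stress_loops` is a hypothetical helper name, not part of LVM or ext3, and the iteration count is an assumption added so that the loops terminate (the one-liner in the report runs forever).

```shell
#!/bin/sh
# stress_loops N COUNT CMD [ARGS...]
# Hypothetical helper: start N background loops, each running CMD COUNT
# times, then wait for all of them to finish.
stress_loops() {
    n=$1
    count=$2
    shift 2
    i=0
    while [ "$i" -lt "$n" ]; do
        (
            j=0
            while [ "$j" -lt "$count" ]; do
                "$@"            # run the command under test once
                j=$((j + 1))
            done
        ) &
        i=$((i + 1))
    done
    wait                        # block until every background loop exits
}
```

For example, `stress_loops 2 1000 vgscan -v` would run two concurrent loops of 1000 scans each; substituting `pvscan` should exercise the same path without rewriting the LVM configuration.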
Jay writes:
> I've recently been playing about with recent ext3 0.0.6b and lvm 0.9.1
> beta7 and am now able to trigger an "Attempt to refile free buffer"
> assertion.
>
> This seems to "only" occur when using ext3 on the root filesystem.
> Possibly that is related to the fact that the lvm utility I'm using to
> reproduce this problem is modifying data in /etc.

Yes, I had this same problem with LVM 0.9.1b7 and ext3 0.0.6b.

> The easiest reproduction case I've come across to generate the assertion is
> to load up lvm (insmod if necessary) and then run:
>
> while /bin/true; do vgscan -v; done

The same is true even if you only do pvscan (this has no chance to blow up your LVM configuration). The reason is that LVM calls invalidate_buffers on all of the devices (I believe), but I haven't tracked down all of the reasons it is happening. In __invalidate_buffers, Stephen asked to add "&& bh->b_jlist == BJ_None" to the checks for put_last_free(), but this only reduced the assertions and did not remove them entirely.

> Again, it doesn't seem to generate the problem when using ext2 on the root
> filesystem even if I have ext3 in use on separate filesystems. Also, you
> do not need to have an LVM device actively mounted to generate this. In
> my case I have no active lvm devices up and running, just lvm-mod
> insmoded.

This is more than what I figured out. Initially, I thought it had to do with the LVM devices themselves (on which I was running ext3), but after putting in debugging I also see that the buffers belong to the root device. In my case, I have data journaling on root. Is this the case for you?

Cheers, Andreas
-- 
Andreas Dilger  \ "If a man ate a pound of pasta and a pound of antipasto,
                 \  would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/               -- Dogbert