Jan Kirchhoff
2006-Jun-12 13:03 UTC
[Ocfs2-users] kernel BUG at /usr/src/ocfs2-1.2.1/fs/ocfs2/file.c:494!
Hi,

First of all, I'm new to ocfs2 and drbd. I set up two identical servers (Athlon64, 1GB RAM, GB-Ethernet) with Debian Etch, compiled my own kernel (2.6.16.20), then compiled the drbd-modules and ocfs2 (modules and tools) from source. The process of getting everything up and running was very easy. I have one big 140GB partition that is synced with drbd (in c-mode) and has an ocfs2 filesystem on it. The servers will be webservers, so the data is the whole document-root (mostly pdfs for download) and CGIs.

I rsynced 31GB of data from another server onto this partition last week and did some simple testing, and everything looked good. Today, though, I typed the URL of one of the servers into my browser and got nothing back but an Apache error after a 3-minute timeout of the CGI script. The same happened with the second system :( There has been no traffic/load on the servers other than my testing through the browser.

dmesg shows me the following (same error on both systems):

XFS mounting filesystem sda6
Ending clean XFS mount for filesystem: sda6
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
OCFS2 Node Manager 1.2.1 Tue Jun 6 13:24:21 CEST 2006 (build d647396d7a65bfeeaad84fa736d4dd1c)
OCFS2 DLM 1.2.1 Tue Jun 6 13:24:21 CEST 2006 (build 70adfba8f7c9ce44dac2d47ec99bb7d2)
OCFS2 DLMFS 1.2.1 Tue Jun 6 13:24:21 CEST 2006 (build 70adfba8f7c9ce44dac2d47ec99bb7d2)
OCFS2 User DLM kernel interface loaded
eth0: no IPv6 routers present
drbd0: disk( Diskless -> Attaching )
drbd0: drbd_bm_resize called with capacity == 351555584
drbd0: resync bitmap: bits=43944448 words=1373264
drbd0: size = 167 GB (175777792 KB)
drbd0: reading of bitmap took 152 jiffies
drbd0: recounting of set bits took additional 8 jiffies
drbd0: 0 KB marked out-of-sync by on disk bit-map.
drbd0: Found 6 transactions (276 active extents) in activity log.
drbd0: disk( Attaching -> Consistent )
drbd0: Writing meta data super block now.
drbd1: disk( Diskless -> Attaching )
drbd1: drbd_bm_resize called with capacity == 9767080
drbd1: resync bitmap: bits=1220885 words=38154
drbd1: size = 4769 MB (4883540 KB)
drbd1: reading of bitmap took 8 jiffies
drbd1: recounting of set bits took additional 0 jiffies
drbd1: 0 KB marked out-of-sync by on disk bit-map.
drbd1: Found 6 transactions (103 active extents) in activity log.
drbd1: disk( Attaching -> Consistent )
drbd1: Writing meta data super block now.
drbd0: conn( StandAlone -> Unconnected )
drbd0: conn( Unconnected -> WFConnection )
drbd1: conn( StandAlone -> Unconnected )
drbd1: conn( Unconnected -> WFConnection )
drbd0: conn( WFConnection -> WFReportParams )
drbd0: Handshake successful: DRBD Network Protocol version 80
drbd0: Peer authenticated usind 20 bytes of 'sha1' HMAC
drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) disk( Consistent -> Outdated ) pdsk( DUnknown -> UpToDate )
drbd0: Writing meta data super block now.
drbd1: conn( WFConnection -> WFReportParams )
drbd1: Handshake successful: DRBD Network Protocol version 80
drbd1: Peer authenticated usind 20 bytes of 'sha1' HMAC
drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) disk( Consistent -> Outdated ) pdsk( DUnknown -> UpToDate )
drbd1: Writing meta data super block now.
drbd1: conn( WFBitMapT -> WFSyncUUID )
drbd1: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
drbd1: Began resync as SyncTarget (will sync 32 KB [8 bits set]).
drbd1: Writing meta data super block now.
drbd0: conn( WFBitMapT -> WFSyncUUID )
drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
drbd0: Began resync as SyncTarget (will sync 32 KB [8 bits set]).
drbd0: Writing meta data super block now.
drbd1: Resync done (total 1 sec; paused 0 sec; 32 K/sec)
drbd1: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
drbd1: Writing meta data super block now.
drbd0: Resync done (total 1 sec; paused 0 sec; 32 K/sec)
drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
drbd0: Writing meta data super block now.
drbd0: peer( Secondary -> Primary )
drbd1: peer( Secondary -> Primary )
drbd0: role( Secondary -> Primary )
drbd0: Writing meta data super block now.
drbd1: role( Secondary -> Primary )
drbd1: Writing meta data super block now.
o2net: accepted connection from node portal2 (num 1) at 192.168.0.82:7777
OCFS2 1.2.1 Tue Jun 6 13:24:15 CEST 2006 (build bd2f25ba0af9677db3572e3ccd92f739)
ocfs2_dlm: Nodes in domain ("8B6DD64326394C308A4E2B2259162C78"): 0 1
kjournald starting. Commit interval 5 seconds
ocfs2: Mounting device (147,0) on (node 0, slot 1)
(16918,0):ocfs2_truncate_file:494 ERROR: bug expression: le64_to_cpu(fe->i_size) != i_size_read(inode)
(16918,0):ocfs2_truncate_file:494 ERROR: Inode 42363033, inode i_size = 1129 != di i_size = 1120, i_flags = 0x1
------------[ cut here ]------------
kernel BUG at /usr/src/ocfs2-1.2.1/fs/ocfs2/file.c:494!
invalid opcode: 0000 [#1]
SMP
Modules linked in: ocfs2 sha1 ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs ipv6 ext3 jbd dm_mod drbd sr_mod sbp2 ide_generic ide_disk ide_cd cdrom eth1394 mousedev tsdev psmouse ehci_hcd ohci_hcd amd74xx generic parport_pc parport evdev serio_raw usbcore ohci1394 ieee1394 nvnet ide_core rtc floppy pcspkr snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc
CPU: 0
EIP: 0060:[<f95374f6>] Tainted: P VLI
EFLAGS: 00210286 (2.6.16.20ll-wbsrv #1)
EIP is at ocfs2_setattr+0x6a8/0x12cf [ocfs2]
eax: 00000073 ebx: 00000000 ecx: ffffffff edx: ffffff23
esi: 00000469 edi: 00000000 ebp: d3875000 esp: e1c31eb8
ds: 007b es: 007b ss: 0068
Process ix.cgi (pid: 16918, threadinfo=e1c30000 task=f26d5a90)
Stack: <0>00000000 00000000 00000000 c0cc5e08 f491e800 c0cc5f7c 00000460 00000000
       00000460 00000000 00000000 00000000 00000000 d75695e8 d75695e8 00000000
       c0cc5e08 00002008 e1c31f38 c0162405 e904584c e1c31f38 01222222 448d600f
Call Trace:
 [<c0162405>] notify_change+0x13f/0x2da
 [<c014a7c9>] do_truncate+0x59/0x72
 [<c014a902>] do_sys_ftruncate+0x120/0x13f
 [<c014a947>] sys_ftruncate64+0x13/0x15
 [<c010278d>] syscall_call+0x7/0xb
Code: fd ff ff ff b1 f8 fd ff ff 68 ee 01 00 00 68 37 90 55 f9 ff 70 10 8b 00 ff b0 9c 00 00 00 68 36 dc 55 f9 e8 e6 10 be c6 83 c4 30 <0f> 0b ee 01 ed d4 55 f9 8b 4d 24 39 4c 24 08 8b 55 20 0f 82 c3
BUG: ix.cgi/16918, lock held at task exit time!
[c0cc5e7c] {inode_init_once}
.. held by: ix.cgi:16918 [f26d5a90, 116]
... acquired at: do_truncate+0x50/0x72

portal1:~# uname -a
Linux portal1 2.6.16.20ll-wbsrv #1 SMP Tue Jun 6 12:33:55 CEST 2006 i686 GNU/Linux
portal1:~# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 47
model name : AMD Athlon(tm) 64 Processor 3500+
stepping : 2
cpu MHz : 2210.343
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm ts fid vid ttp tm stc
bogomips : 4429.36

Can anybody help me with that?

thanks
Jan
John Hess
2006-Jun-20 17:58 UTC
[Ocfs2-users] kernel BUG at /usr/src/ocfs2-1.2.1/fs/ocfs2/file.c:494!
Jan Kirchhoff <kirchy <at> gmx.de> writes:

> Hi,
>
> First of all, I'm new to ocfs2 and drbd....
> (16918,0):ocfs2_truncate_file:494 ERROR: bug expression: le64_to_cpu(fe->i_size) != i_size_read(inode)
> (16918,0):ocfs2_truncate_file:494 ERROR: Inode 42363033, inode i_size = 1129 != di i_size = 1120, i_flags = 0x1
> ------------[ cut here ]------------
> kernel BUG at /usr/src/ocfs2-1.2.1/fs/ocfs2/file.c:494!
> invalid opcode: 0000 [#1]...
>
> Can anybody help me with that?
>
> thanks
> Jan

We are getting the exact same error, but have been unable to resolve it so far. Did you ever get this problem solved? Does anyone else have any tips?

Thanks,
John
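
For anyone comparing reports of this BUG across nodes, here is a minimal sketch (not part of ocfs2 or its tools) that pulls the inode number and the two mismatching sizes out of saved dmesg text. The regular expression is an assumption based only on the message format quoted in this thread, and the names MISMATCH, parse_mismatches and SAMPLE are made up for illustration.

```python
import re

# Assumed format: the ocfs2 1.2.1 size-mismatch line as quoted in this thread.
# Other ocfs2 versions may word the message differently.
MISMATCH = re.compile(
    r"ocfs2_truncate_file:(?P<src_line>\d+) ERROR: Inode (?P<inode>\d+),\s*"
    r"inode i_size = (?P<vfs_size>\d+) != di i_size = (?P<disk_size>\d+),\s*"
    r"i_flags = (?P<flags>0x[0-9a-fA-F]+)"
)


def parse_mismatches(log_text):
    """Return one dict per size-mismatch line found in log_text."""
    results = []
    for m in MISMATCH.finditer(log_text):
        vfs_size = int(m.group("vfs_size"))    # in-memory size (i_size_read)
        disk_size = int(m.group("disk_size"))  # size recorded in the on-disk inode
        results.append({
            "inode": int(m.group("inode")),
            "vfs_size": vfs_size,
            "disk_size": disk_size,
            "delta": vfs_size - disk_size,
            "flags": int(m.group("flags"), 16),
        })
    return results


# The error line from the original report, used as sample input.
SAMPLE = (
    "(16918,0):ocfs2_truncate_file:494 ERROR: Inode 42363033, "
    "inode i_size = 1129 != di i_size = 1120, i_flags = 0x1"
)

if __name__ == "__main__":
    for entry in parse_mismatches(SAMPLE):
        print(entry)
```

Running the same function over the dmesg output of both nodes shows whether they trip on the same inode and the same pair of sizes.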