Moving forward, I created an MDT/MGS on a disk partition:

bio-ppc-45:~# /vol/lustre-1.5.95/sbin/mkfs.lustre --fsname=lus1 --mdt --mgs --reformat /dev/sda2

   Permanent disk data:
Target:     lus1-MDTffff
Index:      unassigned
Lustre FS:  lus1
Mount type: ldiskfs
Flags:      0x75
              (MDT MGS needs_index first_time update )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters:

device size = 10240MB
formatting backing filesystem ldiskfs on /dev/sda2
        target name  lus1-MDTffff
        4k blocks    0
        options      -J size=400 -i 4096 -I 512 -q -O dir_index -F
mkfs_cmd = mkfs.ext2 -j -b 4096 -L lus1-MDTffff -J size=400 -i 4096 -I 512 -q -O dir_index -F /dev/sda2
Writing CONFIGS/mountdata

bio-ppc-45:~# mount -t lustre /dev/sda2 /mnt/lustre/mds

Mounted no problem. From console:

LDISKFS FS on sda2, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
Lustre: OBD class driver, info@clusterfs.com
Lustre Version: 1.5.95
Build Version: 1.5.95-19700101000000-PRISTINE-.tmp.x.linux-2.6.18-2.6.18
Lustre: Added LNI 192.5.200.69@tcp [8/256]
Lustre: Accept secure, port 988
Lustre: Lustre Client File System; info@clusterfs.com
Lustre: mount data:
Lustre: device: /dev/sda2
Lustre: flags: 0
kjournald starting. Commit interval 5 seconds
LDISKFS FS on sda2, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
Lustre: disk data:
Lustre: server: lus1-MDTffff
Lustre: uuid:
Lustre: fs: lus1
Lustre: index: ffff
Lustre: config: 1
Lustre: flags: 0x75
Lustre: diskfs: ldiskfs
Lustre: options: errors=remount-ro,iopen_nopriv,user_xattr
Lustre: params:
Lustre: comment:
kjournald starting. Commit interval 5 seconds
LDISKFS FS on sda2, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
LDISKFS-fs: mballoc enabled
Lustre: MGS MGS started
Lustre: disk data:
Lustre: server: lus1-MDT0000
Lustre: uuid:
Lustre: fs: lus1
Lustre: index: 0000
Lustre: config: 2
Lustre: flags: 0x5
Lustre: diskfs: ldiskfs
Lustre: options: errors=remount-ro,iopen_nopriv,user_xattr
Lustre: params:
Lustre: comment:
Lustre: Enabling user_xattr
Lustre: lus1-MDT0000: new disk, initializing
Lustre: MDT lus1-MDT0000 now serving dev (lus1-MDT0000/3a216daa-81dc-4683-aefc-4e8660383993) with recovery enabled
Lustre: 0 UP mgs MGS MGS 5
Lustre: 1 UP mgc MGC192.5.200.69@tcp f1842a6f-8434-1fc0-f598-ce739bcf44ee 6
Lustre: 2 UP mdt MDS MDS_uuid 3
Lustre: 3 UP lov lus1-mdtlov lus1-mdtlov_UUID 4
Lustre: 4 UP mds lus1-MDT0000 lus1-MDT0000_UUID 3
Lustre: mount /dev/sda2 complete

The problem comes when trying to bring an OST online:

bio-ppc-45:~# /vol/lustre-1.5.95/sbin/mkfs.lustre --fsname=lus1 --ost --mgsnode=bio-ppc-45@tcp0 --reformat /dev/sda3

   Permanent disk data:
Target:     lus1-OSTffff
Index:      unassigned
Lustre FS:  lus1
Mount type: ldiskfs
Flags:      0x72
              (OST needs_index first_time update )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=192.5.200.69@tcp

device size = 102400MB
formatting backing filesystem ldiskfs on /dev/sda3
        target name  lus1-OSTffff
        4k blocks    0
        options      -J size=400 -i 16384 -I 256 -q -O dir_index -F
mkfs_cmd = mkfs.ext2 -j -b 4096 -L lus1-OSTffff -J size=400 -i 16384 -I 256 -q -O dir_index -F /dev/sda3
Writing CONFIGS/mountdata

bio-ppc-45:~# mount -t lustre /dev/sda3 /mnt/lustre/osd1
[CRASH]

From the console:

LDISKFS FS on sda3, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
Lustre: mount data:
Lustre: device: /dev/sda3
Lustre: flags: 0
kjournald starting. Commit interval 5 seconds
LDISKFS FS on sda3, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
Lustre: disk data:
Lustre: server: lus1-OSTffff
Lustre: uuid:
Lustre: fs: lus1
Lustre: index: ffff
Lustre: config: 1
Lustre: flags: 0x72
Lustre: diskfs: ldiskfs
Lustre: options: errors=remount-ro,extents,mballoc
Lustre: params: mgsnode=192.5.200.69@tcp
Lustre: comment:
kjournald starting. Commit interval 5 seconds
LDISKFS FS on sda3, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
Lustre: disk data:
Lustre: server: lus1-OST0000
Lustre: uuid:
Lustre: fs: lus1
Lustre: index: 0000
Lustre: config: 2
Lustre: flags: 0x2
Lustre: diskfs: ldiskfs
Lustre: options: errors=remount-ro,extents,mballoc
Lustre: params: mgsnode=192.5.200.69@tcp
Lustre: comment:
BUG: soft lockup detected on CPU#1!
Call Trace:
[C000000076991DC0] [C00000000000EBD0] .show_stack+0x68/0x1b0 (unreliable)
[C000000076991E60] [C000000000077484] .softlockup_tick+0xf0/0x13c
[C000000076991F10] [C000000000056114] .run_local_timers+0x1c/0x30
[C000000076991F90] [C00000000001B6C4] .timer_interrupt+0xa4/0x464
[C000000076992070] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .generic_find_next_zero_le_bit+0xb8/0x21c
    LR = .ldiskfs_mb_init_cache+0x590/0x7a4 [ldiskfs]
[C000000076992360] [D0000000009D55D0] .ldiskfs_mb_init_cache+0x174/0x7a4 [ldiskfs] (unreliable)
[C0000000769924D0] [D0000000009D5DEC] .ldiskfs_mb_load_buddy+0x1ec/0x358 [ldiskfs]
[C000000076992580] [D0000000009D71D4] .ldiskfs_mb_new_blocks+0x514/0x14c4 [ldiskfs]
[C000000076992720] [D0000000009D8448] .ldiskfs_new_block+0x4c/0x5c [ldiskfs]
[C0000000769927B0] [D0000000009D3660] .ldiskfs_ext_get_block+0x318/0x480 [ldiskfs]
[C0000000769928D0] [D0000000009BEEE0] .ldiskfs_getblk+0xa8/0x2dc [ldiskfs]
[C0000000769929D0] [D0000000009C04C4] .ldiskfs_bread+0x18/0xb8 [ldiskfs]
[C000000076992A60] [D000000000995960] .fsfilt_ldiskfs_write_record+0x18c/0x51c [fsfilt_ldiskfs]
[C000000076992B70] [D0000000006AD44C] .llog_lvfs_write_blob+0x124/0x57c [obdclass]
[C000000076992C60] [D0000000006AF4D8] .llog_lvfs_write_rec+0xae0/0xe48 [obdclass]
[C000000076992D60] [D000000000957634] .mgc_copy_handler+0x364/0x860 [mgc]
[C000000076992E60] [D0000000006A665C] .llog_process+0xd54/0x12f0 [obdclass]
[C000000076992F70] [D000000000952A88] .mgc_copy_llog+0xc18/0xe10 [mgc]
[C000000076993050] [D0000000009538E4] .mgc_process_log+0xc64/0x144c [mgc]
[C000000076993240] [D000000000958DC0] .mgc_process_config+0x1290/0x198c [mgc]
[C000000076993330] [D0000000006F5A6C] .lustre_process_log+0x1028/0x15dc [obdclass]
[C000000076993460] [D0000000006F712C] .server_start_targets+0x110c/0x208c [obdclass]
[C000000076993570] [D0000000006F8A50] .server_fill_super+0x9a4/0x1174 [obdclass]
[C000000076993640] [D0000000006FB36C] .lustre_fill_super+0x214c/0x2374 [obdclass]
[C000000076993730] [C0000000000B07B4] .get_sb_nodev+0x88/0xf8
[C0000000769937D0] [D0000000006E5DA4] .lustre_get_sb+0x20/0x38 [obdclass]
[C000000076993850] [C0000000000B032C] .vfs_kern_mount+0x80/0xe8
[C0000000769938F0] [C0000000000B03F0] .do_kern_mount+0x4c/0x80
[C000000076993990] [C0000000000CE580] .do_mount+0x6d8/0x784
[C000000076993D70] [C0000000000CE6EC] .sys_mount+0xc0/0x140
[C000000076993E30] [C00000000000871C] syscall_exit+0x0/0x40
BUG: soft lockup detected on CPU#0!
Call Trace:
[C00000000FFBF0A0] [C00000000000EBD0] .show_stack+0x68/0x1b0 (unreliable)
[C00000000FFBF140] [C000000000077484] .softlockup_tick+0xf0/0x13c
[C00000000FFBF1F0] [C000000000056114] .run_local_timers+0x1c/0x30
[C00000000FFBF270] [C00000000001B6C4] .timer_interrupt+0xa4/0x464
[C00000000FFBF350] [C0000000000034EC] decrementer_common+0xec/0x100
--- Exception: 901 at .lock_kernel+0x64/0x8c
    LR = .nfs_permission+0xd8/0x274
[C00000000FFBF640] [C00000000FFBF6E0] 0xc00000000ffbf6e0 (unreliable)
[C00000000FFBF700] [C0000000000B9EC0] .permission+0xb0/0xe8
[C00000000FFBF780] [C0000000000BB848] .__link_path_walk+0x174/0x15f0
[C00000000FFBF910] [C0000000000BCD60] .link_path_walk+0x9c/0x184
[C00000000FFBFA80] [C0000000000BD34C] .do_path_lookup+0x320/0x368
[C00000000FFBFB30] [C0000000000BE0DC] .__user_walk_fd_it+0x60/0xa0
[C00000000FFBFBD0] [C0000000000B46D0] .vfs_stat_fd+0x70/0xc8
[C00000000FFBFD30] [C0000000000B484C] .sys_newstat+0x2c/0x60
[C00000000FFBFE30] [C00000000000871C] syscall_exit+0x0/0x40

This is, btw, running on a netbooted machine with root on NFS (I bring it up as I see nfs_permission in the one callback there).

Disk layout:

bio-ppc-45:~# mac-fdisk -l
/dev/sda
        #                    type name              length   base      ( size )  system
/dev/sda1     Apple_partition_map Apple                 63 @ 1         ( 31.5k)  Partition map
/dev/sda2         Apple_UNIX_SVR2 lustre-mds      20971520 @ 64        ( 10.0G)  Linux native
/dev/sda3         Apple_UNIX_SVR2 lustre-osd1    209715200 @ 20971584  (100.0G)  Linux native
/dev/sda4              Apple_Free Extra          550735984 @ 230686784 (262.6G)  Free space

Block size=512, Number of Blocks=781422768
DeviceType=0x0, DeviceId=0x0

Thanks,
--bob
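
As a sanity check on the disk layout, the mac-fdisk numbers above can be verified with a little arithmetic. The following is only a sketch: the partition tuples are copied from the listing, and the helper names (`check_layout`, `size_mib`) are made up for illustration. It confirms that each partition starts where the previous one ends, that the table ends at the reported block count, and that the sizes match what mkfs.lustre printed as "device size".

```python
# Partition table as printed by mac-fdisk in the message above:
# (device, length in 512-byte blocks, base block).
BLOCK_SIZE = 512
TOTAL_BLOCKS = 781422768

PARTITIONS = [
    ("/dev/sda1", 63, 1),
    ("/dev/sda2", 20971520, 64),
    ("/dev/sda3", 209715200, 20971584),
    ("/dev/sda4", 550735984, 230686784),
]


def check_layout(partitions, total_blocks):
    """Return a list of problems: gaps, overlaps, or a wrong end block."""
    problems = []
    expected_base = partitions[0][2]
    for name, length, base in partitions:
        if base != expected_base:
            problems.append(f"{name}: base {base}, expected {expected_base}")
        expected_base = base + length
    if expected_base != total_blocks:
        problems.append(f"table ends at {expected_base}, disk has {total_blocks}")
    return problems


def size_mib(length_blocks, block_size=BLOCK_SIZE):
    """Partition size in MiB (integer division)."""
    return length_blocks * block_size // (1024 * 1024)


if __name__ == "__main__":
    for name, length, _base in PARTITIONS:
        print(f"{name}: {size_mib(length)} MiB")
    print(check_layout(PARTITIONS, TOTAL_BLOCKS) or "layout is contiguous")
```

With the numbers from the listing, sda2 and sda3 come out to 10240 MiB and 102400 MiB, the same values mkfs.lustre reported, so the partition table itself looks consistent.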
Oh, and one other datapoint - it worked fine when I created mgs/mgt and ost as loopback filesystems on a single machine.

On Nov 3, 2006, at 2:43 PM, Robert Olson wrote:
> Moving forward, I created an MDT/MGS on a disk partition:
> [...]
Hm, I have to take that back. When I tried it on this machine using loopback devices, I got the same crash.

--bob

On Nov 3, 2006, at 3:09 PM, Robert Olson wrote:
> oh, and one other datapoint - it worked fine when I created mgs/mgt
> and ost as loopback filesystems on a single machine.
On Fri, 3 Nov 2006, Robert Olson wrote:
> BUG: soft lockup detected on CPU#1!
> Call Trace:
> [C000000076991DC0] [C00000000000EBD0] .show_stack+0x68/0x1b0 (unreliable)
> [C000000076991E60] [C000000000077484] .softlockup_tick+0xf0/0x13c
> [C000000076991F10] [C000000000056114] .run_local_timers+0x1c/0x30
> [C000000076991F90] [C00000000001B6C4] .timer_interrupt+0xa4/0x464
> [C000000076992070] [C0000000000034EC] decrementer_common+0xec/0x100
> --- Exception: 901 at .generic_find_next_zero_le_bit+0xb8/0x21c
>     LR = .ldiskfs_mb_init_cache+0x590/0x7a4 [ldiskfs]
> [C000000076992360] [D0000000009D55D0] .ldiskfs_mb_init_cache+0x174/0x7a4 [ldiskfs] (unreliable)
> [C0000000769924D0] [D0000000009D5DEC] .ldiskfs_mb_load_buddy+0x1ec/0x358 [ldiskfs]

You may need patches from this bug:
https://bugzilla.lustre.org/show_bug.cgi?id=10634

HTH

--
Jean-Marc Saffroy - jean-marc.saffroy@ext.bull.net
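
For anyone comparing traces like the one quoted here, a small script can pull out the function and module of each frame, which makes it easier to see which module was running when the lockup was detected. This is a rough sketch, not a general oops parser: the regular expression is an assumption based only on the frame format shown in this thread, it expects each frame on a single unwrapped line, and `parse_frames` and `first_module_frame` are hypothetical helper names.

```python
import re
from collections import Counter

# Matches frames in the format seen in this thread, e.g.
#   [C000000076992360] [D0000000009D55D0] .ldiskfs_mb_init_cache+0x174/0x7a4 [ldiskfs]
# Assumption: one frame per line, module name (if any) on the same line.
FRAME_RE = re.compile(
    r"\[[0-9A-Fa-f]+\][ \t]+\[[0-9A-Fa-f]+\][ \t]+"
    r"\.?(?P<func>[A-Za-z_]\w*)\+0x[0-9A-Fa-f]+/0x[0-9A-Fa-f]+"
    r"(?:[ \t]+\[(?P<module>[a-z_][a-z0-9_]*)\])?"
)

SAMPLE = """\
[C000000076991DC0] [C00000000000EBD0] .show_stack+0x68/0x1b0 (unreliable)
[C000000076991E60] [C000000000077484] .softlockup_tick+0xf0/0x13c
[C000000076992070] [C0000000000034EC] decrementer_common+0xec/0x100
[C000000076992360] [D0000000009D55D0] .ldiskfs_mb_init_cache+0x174/0x7a4 [ldiskfs] (unreliable)
[C0000000769924D0] [D0000000009D5DEC] .ldiskfs_mb_load_buddy+0x1ec/0x358 [ldiskfs]
"""


def parse_frames(text):
    """Return (function, module) pairs; frames without a module tag are 'kernel'."""
    return [
        (m.group("func"), m.group("module") or "kernel")
        for m in FRAME_RE.finditer(text)
    ]


def first_module_frame(text):
    """Return the innermost frame that belongs to a loadable module, or None."""
    for func, module in parse_frames(text):
        if module != "kernel":
            return func, module
    return None


if __name__ == "__main__":
    frames = parse_frames(SAMPLE)
    print(first_module_frame(SAMPLE))
    print(Counter(module for _func, module in frames))
```

On the sample frames, the first module frame is `ldiskfs_mb_init_cache` in `ldiskfs`, which agrees with the LR value shown in the exception line of the trace.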
Hm, this is tricky. It looks like these patches include:

- the addition of the multiblock allocator, already patched in my kernel (from one of the lustre patches, I believe)
- the use of EXT3_MOUNT_MBALLOC
- ext2_find_next_le_bit and its use
- a change to the signature of ext3_free_blocks
- the addition of the buddy allocator

I have the feeling the patches for these all exist in the lustre kernel patches dir; I'm just missing some. Still exploring.

Thanks,
--bob

On Nov 3, 2006, at 5:39 PM, Jean-Marc Saffroy wrote:
> On Fri, 3 Nov 2006, Robert Olson wrote:
>
>> BUG: soft lockup detected on CPU#1!
>> [...]
>
> You may need patches from this bug:
> https://bugzilla.lustre.org/show_bug.cgi?id=10634
>
> HTH
>
> --
> Jean-Marc Saffroy - jean-marc.saffroy@ext.bull.net
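
One way to hunt for the missing pieces is to scan a directory of patch files for the symbols named in this message. The sketch below assumes the patches are plain-text files ending in `.patch` in a single directory; the directory is passed on the command line because the actual location of the kernel patches is not given in this thread, and `find_symbols` is a made-up helper name.

```python
import sys
from pathlib import Path

# Symbols mentioned in the message above.
SYMBOLS = ["EXT3_MOUNT_MBALLOC", "ext2_find_next_le_bit", "ext3_free_blocks"]


def find_symbols(patch_dir, symbols=SYMBOLS):
    """Map each symbol to the sorted names of the *.patch files mentioning it."""
    hits = {symbol: [] for symbol in symbols}
    for path in sorted(Path(patch_dir).glob("*.patch")):
        # errors="replace" keeps odd encodings from aborting the scan.
        text = path.read_text(errors="replace")
        for symbol in symbols:
            if symbol in text:
                hits[symbol].append(path.name)
    return hits


if __name__ == "__main__":
    # Usage: python find_symbols.py <directory containing .patch files>
    for symbol, files in find_symbols(sys.argv[1]).items():
        print(f"{symbol}: {', '.join(files) or 'not found'}")
```

A symbol reported as "not found" would only mean that no patch in that directory mentions it, not that the running kernel lacks it; checking the applied series against the kernel tree is still needed.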