Charland, Denis
2011-Oct-25 21:12 UTC
[Lustre-discuss] Lustre 1.8.5 on Fedora 12 and kernel 2.6.32.19
Hi,

I'm trying to build Lustre 1.8.5 on Fedora 12 running kernel 2.6.32.19. I first built a patched kernel by applying the patches listed in the 2.6-sles11.series file (based on the Lustre Test Matrix, SLES 11 uses kernel 2.6.32.19). The kernel was patched and the RPMs were built without any problem. I installed the kernel RPMs and rebooted into the new kernel. Then I built Lustre 1.8.5 against the patched kernel source tree; the Lustre RPMs also built without any problem, and I installed them. I created the MGS, MDT and OST file systems on the same system. Now, when I mount those file systems, I get the following messages in the dmesg output:

Lustre: OBD class driver, http://www.lustre.org/
Lustre: Lustre Version: 1.8.5
Lustre: Build Version: 1.8.5-20111012163457-PRISTINE-2.6.32.19-163.lustre_1.8.5.fc12.x86_64
Lustre: Added LNI 172.17.15.254@tcp [8/256/0/180]
Lustre: Accept secure, port 988
Lustre: Lustre Client File System; http://www.lustre.org/
init dynlocks cache
LDISKFS-fs (sda5): barriers disabled
LDISKFS-fs (sda5): mounted filesystem with ordered data mode
LDISKFS-fs (sda5): barriers disabled
LDISKFS-fs (sda5): mounted filesystem with ordered data mode
Lustre: MGS MGS started
Lustre: MGC172.17.15.254@tcp: Reactivating import
LDISKFS-fs (sda7): barriers disabled
LDISKFS-fs (sda7): mounted filesystem with ordered data mode
LDISKFS-fs (sda7): barriers disabled
LDISKFS-fs (sda7): mounted filesystem with ordered data mode
ima_file_free: test-OST0000 open/free imbalance (r:0 w:0 o:0 f:0)
Pid: 1818, comm: ll_mgs_00 Not tainted 2.6.32.19-163.lustre_1.8.5.fc12.x86_64 #1
Call Trace:
 [<ffffffff81202568>] ima_file_free+0xa7/0x10c
 [<ffffffff8111cdc5>] __fput+0x13a/0x1dc
 [<ffffffff8111ce81>] fput+0x1a/0x1c
 [<ffffffff81119315>] filp_close+0x68/0x72
 [<ffffffffa022e322>] llog_lvfs_close+0x32/0x160 [obdclass]
 [<ffffffffa022713d>] llog_close+0x6d/0x260 [obdclass]
 [<ffffffffa030be15>] ? lustre_msg_buf+0x85/0x90 [ptlrpc]
 [<ffffffffa032c86a>] llog_origin_handle_read_header+0x6ea/0xa60 [ptlrpc]
 [<ffffffffa05c23de>] mgs_handle+0xa0e/0x18a0 [mgs]
 [<ffffffff81048171>] ? enqueue_task_fair+0x2a/0x6d
 [<ffffffffa030c744>] ? lustre_msg_get_conn_cnt+0x94/0x100 [ptlrpc]
 [<ffffffff81017d69>] ? read_tsc+0x9/0x1b
 [<ffffffffa0317c5b>] ? ptlrpc_update_export_timer+0x4b/0x3f0 [ptlrpc]
 [<ffffffffa030b8d4>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
 [<ffffffffa0319403>] ptlrpc_server_handle_request+0x883/0xef0 [ptlrpc]
 [<ffffffff8103e7e4>] ? __wake_up_common+0x4e/0x84
 [<ffffffffa031df63>] ptlrpc_main+0x753/0x1240 [ptlrpc]
 [<ffffffff81012d6a>] child_rip+0xa/0x20
 [<ffffffffa031d810>] ? ptlrpc_main+0x0/0x1240 [ptlrpc]
 [<ffffffff81012d60>] ? child_rip+0x0/0x20
ima_file_free: test-OST0000 open/free imbalance (r:0 w:-1 o:-1 f:0)
Lustre: Filtering OBD driver; http://www.lustre.org/
Lustre: 1967:0:(filter.c:995:filter_init_server_data()) RECOVERY: service test-OST0000, 1 recoverable clients, 0 delayed clients, last_rcvd 8589934592
Lustre: test-OST0000: Now serving test-OST0000 on /dev/sda7 with recovery enabled
Lustre: test-OST0000: Will be in recovery for at least 5:00, or until 1 client reconnects
LDISKFS-fs (sda6): barriers disabled
LDISKFS-fs (sda6): mounted filesystem with ordered data mode
LDISKFS-fs (sda6): barriers disabled
LDISKFS-fs (sda6): mounted filesystem with ordered data mode
ima_file_free: test-MDT0000 open/free imbalance (r:0 w:0 o:0 f:0)
Pid: 1818, comm: ll_mgs_00 Not tainted 2.6.32.19-163.lustre_1.8.5.fc12.x86_64 #1
Call Trace:
 [<ffffffff81202568>] ima_file_free+0xa7/0x10c
 [<ffffffff8111cdc5>] __fput+0x13a/0x1dc
 [<ffffffff8111ce81>] fput+0x1a/0x1c
 [<ffffffff81119315>] filp_close+0x68/0x72
 [<ffffffffa022e322>] llog_lvfs_close+0x32/0x160 [obdclass]
 [<ffffffffa022713d>] llog_close+0x6d/0x260 [obdclass]
 [<ffffffffa030be15>] ? lustre_msg_buf+0x85/0x90 [ptlrpc]
 [<ffffffffa032c86a>] llog_origin_handle_read_header+0x6ea/0xa60 [ptlrpc]
 [<ffffffffa05c23de>] mgs_handle+0xa0e/0x18a0 [mgs]
 [<ffffffff81048171>] ? enqueue_task_fair+0x2a/0x6d
 [<ffffffffa030c744>] ? lustre_msg_get_conn_cnt+0x94/0x100 [ptlrpc]
 [<ffffffff81017d69>] ? read_tsc+0x9/0x1b
 [<ffffffffa0317c5b>] ? ptlrpc_update_export_timer+0x4b/0x3f0 [ptlrpc]
 [<ffffffffa030b8d4>] ? lustre_msg_get_opc+0x94/0x100 [ptlrpc]
 [<ffffffffa0319403>] ptlrpc_server_handle_request+0x883/0xef0 [ptlrpc]
 [<ffffffff8103e7e4>] ? __wake_up_common+0x4e/0x84
 [<ffffffffa031df63>] ptlrpc_main+0x753/0x1240 [ptlrpc]
 [<ffffffff81012d6a>] child_rip+0xa/0x20
 [<ffffffffa031d810>] ? ptlrpc_main+0x0/0x1240 [ptlrpc]
 [<ffffffff81012d60>] ? child_rip+0x0/0x20
ima_file_free: test-MDT0000 open/free imbalance (r:0 w:-1 o:-1 f:0)
Lustre: Enabling user_xattr
Lustre: test-MDT0000: Now serving test-MDT0000 on /dev/sda6 with recovery enabled
Lustre: 2059:0:(lproc_mds.c:271:lprocfs_wr_group_upcall()) test-MDT0000: group upcall set to /usr/sbin/l_getgroups
Lustre: test-MDT0000.mdt: set parameter group_upcall=/usr/sbin/l_getgroups
Lustre: 2059:0:(mds_lov.c:1155:mds_notify()) MDS test-MDT0000: add target test-OST0000_UUID
LustreError: 1837:0:(ldlm_lib.c:885:target_handle_connect()) test-OST0000: NID 0@lo (test-mdtlov_UUID) reconnected with 1 conn_cnt; cookies not random?
LustreError: 1837:0:(ldlm_lib.c:1919:target_send_reply_msg()) @@@ processing error (-114) req@ffff8800553bc400 x1383674394181646/t0 o8-><?>@<?>:0/0 lens 368/264 e 0 to 0 dl 1319574904 ref 1 fl Interpret:/0/0 rc -114/0
LustreError: 11-0: an error occurred while communicating with 0@lo. The ost_connect operation failed with -114
Lustre: 1786:0:(import.c:517:import_select_connection()) test-OST0000-osc: tried all connections, increasing latency to 2s
Lustre: test-OST0000: Recovery period over after 0:01, of 1 clients 1 recovered and 0 were evicted.
Lustre: test-OST0000: sending delayed replies to recovered clients
Lustre: 1785:0:(quota_master.c:1716:mds_quota_recovery()) Only 0/1 OSTs are active, abort quota recovery
Lustre: test-OST0000-osc: Connection restored to service test-OST0000 using nid 0@lo.
Lustre: test-OST0000: received MDS connection from 0@lo
Lustre: MDS test-MDT0000: test-OST0000_UUID now active, resetting orphans

Any idea why I get the "comm: ll_mgs_00 Not tainted..." messages with the Call Traces?

Denis Charland
UNIX Systems Administrator
National Research Council Canada
Charland, Denis
2011-Oct-26 20:28 UTC
[Lustre-discuss] Lustre 1.8.5 on Fedora 12 and kernel 2.6.32.19
After searching on the web, I found that these messages are related to a problem with IMA file free imbalance. Here's what I found:

Gitweb: http://git.kernel.org/linus/1df9f0a73178718969ae47d813b8e7aab2cf073c
Commit: 1df9f0a73178718969ae47d813b8e7aab2cf073c
Parent: f4bd857bc8ed997c25ec06b56ef8064aafa6d4f3
Author: Mimi Zohar <zohar at linux.vnet.ibm.com>
AuthorDate: Wed Feb 4 09:07:02 2009 -0500
Committer: James Morris <jmorris at namei.org>
CommitDate: Fri Feb 6 09:05:33 2009 +1100

    Integrity: IMA file free imbalance

    The number of calls to ima_path_check()/ima_file_free() should be balanced. An extra call to fput() indicates the file could have been accessed without first being measured. Although f_count is incremented/decremented in places other than fget/fput, like fget_light/fput_light and get_file, the current task must already hold a file refcnt. The call to __fput() is delayed until the refcnt becomes 0, resulting in ima_file_free() flagging any changes.

    - add hook to increment opencount for IPC shared memory (SYSV), shmat files, and /dev/zero
    - moved NULL iint test in opencount_get()

    Signed-off-by: Mimi Zohar <zohar at us.ibm.com>
    Acked-by: Serge Hallyn <serue at us.ibm.com>
    Signed-off-by: James Morris <jmorris at namei.org>

IMA is enabled by default in the Fedora 12 kernel. It can be disabled by removing all lines beginning with CONFIG_IMA from the kernel configuration file. Is it OK to disable IMA? Note that IMA is disabled in the kernel configuration file for the SLES11 2.6.32 kernel supplied with Lustre 1.8.5.

Denis Charland
UNIX Systems Administrator
National Research Council Canada
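As a concrete sketch of the disabling approach described above (removing every line beginning with CONFIG_IMA before rebuilding), a single sed invocation does it. The file name and option values here are made up for illustration; a real Fedora 12 .config is of course much larger:

```shell
# Create a small stand-in for a kernel .config (hypothetical contents).
cat > sample.config <<'EOF'
CONFIG_IMA=y
CONFIG_IMA_MEASURE_PCR_IDX=10
CONFIG_AUDIT=y
EOF

# Delete every line that begins with CONFIG_IMA, as suggested above.
sed -i '/^CONFIG_IMA/d' sample.config

# Only the non-IMA options remain.
cat sample.config
```

On the real kernel tree you would run the same sed command against the .config used to build the patched kernel, then rebuild the RPMs. Whether disabling IMA is acceptable is the open question here; the fact that the SLES11 config shipped with Lustre 1.8.5 has it off at least suggests it is a tested configuration.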
On an OSS with two OSTs, is a RAID-1 set with two 400MB partitions a bad choice for the external journals, or should each journal be located on a separate RAID-1 set with one 400MB partition?

Denis Charland
UNIX Systems Administrator
National Research Council Canada
It looks like the following two mkfs.lustre options are the same: --mountfsoptions="stripe=" and --mkfsoptions="-E stripe-width=".

From the documentation:

  --mountfsoptions="stripe=<stripe_width_blocks>"  where  <stripe_width_blocks> = <stripe_width> / 4K

We know that <stripe_width> = <chunksize> * <number of data disks>, so

  <stripe_width_blocks> = <chunksize> * <number of data disks> / 4K

We also know that <stride-size> = <chunksize> / 4K, which gives <chunksize> = <stride-size> * 4K, so

  <stripe_width_blocks> = <stride-size> * 4K * <number of data disks> / 4K = <stride-size> * <number of data disks>

Finally we have:

  mkfs.lustre . . . --mountfsoptions="stripe=<stride-size>*<number of data disks>"

And from the mkfs.lustre and mke2fs man pages, we have:

  mkfs.lustre . . . --mkfsoptions "-E stripe-width=<stride-size>*<number of data disks>"

Are these two options redundant, so that only one of them should be specified?

Denis Charland
UNIX Systems Administrator
National Research Council Canada
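To sanity-check the identity derived above, here is a small worked example with made-up numbers (a 64 KiB RAID chunk size, 4 data disks, and 4 KiB filesystem blocks; none of these values come from the original post):

```shell
chunk_kb=64      # RAID chunk size in KiB (hypothetical)
data_disks=4     # number of data disks (hypothetical)
block_kb=4       # ldiskfs block size in KiB

# stride-size = chunksize / 4K
stride=$(( chunk_kb / block_kb ))

# stripe_width_blocks = stride-size * number of data disks
stripe_width=$(( stride * data_disks ))

echo "stride=$stride stripe_width_blocks=$stripe_width"
```

With these numbers both forms reduce to the same value (stride 16, stripe width 64 blocks), i.e. --mountfsoptions="stripe=64" and --mkfsoptions "-E stripe-width=64", which is why the two options look redundant.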
> On an OSS with two OSTs, is a RAID-1 set with two 400MB partitions a bad choice for the external
> journals or should each journal be located on a separate RAID-1 set with one 400MB partition?

Can anyone give me advice on that?

Actually, there will be four OSTs located in two racks of twelve 2TB disks each (2 * (4+1) plus two hot spares per rack, ext4 with external journal), with one RAID controller per disk rack. The server box has room for eight disks (one RAID-1 for the OS, one RAID-1 for the MDT). That leaves four disks available for the external journals: one RAID-1 with four 400MB partitions, or two RAID-1 sets with two 400MB partitions each?

Denis Charland
UNIX Systems Administrator
National Research Council Canada
Charland, Denis
2011-Dec-20 21:50 UTC
[Lustre-discuss] Lustre 1.8.7 kernel patches for SLES11
Any good reason why sd_iostats-2.6.32-vanilla.patch has been removed from lustre/kernel_patches/series/2.6-sles11.series in Lustre 1.8.7?

Denis Charland
UNIX Systems Administrator
National Research Council Canada
Charland, Denis
2011-Dec-21 16:46 UTC
[Lustre-discuss] Lustre 1.8.7 kernel patches for SLES11
> Any good reason why sd_iostats-2.6.32-vanilla.patch has been removed
> from lustre/kernel_patches/series/2.6-sles11.series in Lustre 1.8.7?

I found that it has been removed as part of "b=23988 Remove sd iostats patch from sles11 patch series". I'm using this patch series to patch kernel 2.6.32.19-163 in Fedora 12. Should I avoid applying this patch when building the patched kernel? Does this patch apply to SCSI disks only, or does it apply to other types of disks (SAS/SATA) too?

Denis Charland
UNIX Systems Administrator
National Research Council Canada
Kevin Van Maren
2011-Dec-21 17:01 UTC
[Lustre-discuss] Lustre 1.8.7 kernel patches for SLES11
I don't know why it would have been removed. I find the sd_iostats very useful.

It provides stats for any "sd" disk. So it applies if you are using SCSI or SAS, or SATA in SCSI-emulation mode (i.e., not if your drives show up as IDE /dev/hd*, but yes if they show up as /dev/sd*).

Kevin

On Dec 21, 2011, at 9:46 AM, Charland, Denis wrote:

> Any good reason why sd_iostats-2.6.32-vanilla.patch has been removed
> from lustre/kernel_patches/series/2.6-sles11.series in Lustre 1.8.7?
>
> I found that it has been removed as part of "b=23988 Remove sd iostats patch from sles11 patch series". I'm using this patch series to patch kernel 2.6.32.19-163 in Fedora 12. Should I avoid applying this patch when building the patched kernel? Does this patch apply to SCSI disks only, or does it apply to other types of disks (SAS/SATA) too?
>
> Denis Charland
> UNIX Systems Administrator
> National Research Council Canada
Charland, Denis
2011-Dec-22 16:18 UTC
[Lustre-discuss] Lustre 1.8.7 kernel patches for SLES11
> I don't know why it would have been removed. I find the sd_iostats very useful.
>
> It provides stats for any "sd" disk. So it applies if you are using SCSI or SAS, or SATA in SCSI-emulation mode (i.e., not if your drives show up as IDE /dev/hd*, but yes if they show up as /dev/sd*).

Here's the link to the bug report: https://bugzilla.lustre.org/show_bug.cgi?id=23988

It looks like this is Cray-related only. Since I'm not using this functionality, I'll remove this patch from my own patch file just in case. Anyway, newer kernels have a tool called blktrace that can be used to collect the same kind of statistics through ftrace (see comment #42 and up).

Cheers,
Denis