Hi,

I am a new user of the Lustre file system. I am trying to run it over an InfiniBand fabric (ofed 1.2.xxx), using Lustre version 1.6.4.1. I have created the MGS/MDT on the same node and two OSTs on different nodes; all looks fine so far.

Now I am trying to mount the file system on the client node, but I am getting connection refused from the MGS server.

This is my command on the client node:

mount -t lustre 11.4.3.241@o2ib:/datafs /mnt/testfs

This is the dmesg on the MGS node:

Lustre: datafs-MDT0000: temporarily refusing client connection from 11.4.3.242@o2ib
Lustre: Skipped 19 previous similar messages
LustreError: 6047:0:(ldlm_lib.c:1442:target_send_reply_msg()) @@@ processing error (-11) req@ffff810219855850 x6/t0 o38-><?>@<?>:-1 lens 240/0 ref 0 fl Interpret:/0/0 rc -11/0
LustreError: 6047:0:(ldlm_lib.c:1442:target_send_reply_msg()) Skipped 19 previous similar messages

Is anyone familiar with this problem? Can someone help?

Best Regards,
Alberto.

Hi!

Could you post the IPs of your MDT/MGS, OSSs and client?

Also, on the MDT/MGS, I'm wondering if the MDT is in recovery. To check the recovery status, run

cat /proc/fs/lustre/mds/datafs-MDT0000/recovery_status

and post what you get back.

You can also do an LNET ping to the various points in your system. The syntax is "lctl ping nid", so to ping 11.4.3.241@o2ib you would run "lctl ping 11.4.3.241@o2ib". Please post what you get back from that as well.

-Aaron

On Dec 29, 2007, at 7:29 PM, Albert Ozilov wrote:
> Now I am trying to mount the client node but getting connection
> refused from the MGS server,

Aaron Knister
Associate Systems Analyst
Center for Ocean-Land-Atmosphere Studies
(301) 595-7000
aaron at iges.org
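
The connectivity check suggested above can be wrapped in a small shell function. This is only a sketch: the function name `check_nids` and the `LCTL` override variable are made up for illustration, and it assumes that "lctl ping" exits with a non-zero status when the peer does not answer.

```shell
# Ping each NID given as an argument with "lctl ping" and report the result.
# check_nids and LCTL are hypothetical names, not part of Lustre.
# LCTL can be set to another command for a dry run; it defaults to lctl.
# Assumption: "lctl ping" returns non-zero when the NID is unreachable.
check_nids() {
    failed=0
    for nid in "$@"; do
        if ${LCTL:-lctl} ping "$nid" >/dev/null 2>&1; then
            echo "ok   $nid"
        else
            echo "FAIL $nid"
            failed=1
        fi
    done
    # Non-zero return if at least one NID did not answer.
    return "$failed"
}
```

On the client in this thread it could be called as `check_nids 11.4.3.241@o2ib 11.4.3.243@o2ib 11.4.3.244@o2ib`. A clean result only shows that LNET is reachable; it says nothing about the recovery state of the MDT, which still has to be read from recovery_status.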

Aha! I think I've found the problem. It looks like the filesystem name on the OSTs is different from the one on the MDT you are trying to use. On the MDT you have datafs, and on the OSTs it looks like you have spfs. I'm guessing you want the name "datafs", so you need to use tunefs.lustre to change the fs name on the OSTs. Here's how:

1. Unmount the OSTs.
2. Run "tunefs.lustre --fsname="datafs" /device/of/ost" for each of your OSTs.
3. Mount everything back up and post the output of "lctl dl" on your OSSs and MDSs.

-Aaron

P.S. "lctl dl" is a quicker way of running "cat /proc/fs/lustre/devices".

On Dec 30, 2007, at 4:00 AM, Albert Ozilov wrote:
> Hi Aaron,
>
> Thanks in advance for your help.
>
> Here is some info from my setup:
>
> MDT/MGS server IP is 11.4.3.241
> Client IP is 11.4.3.242.
> The OSTs' IPs are 11.4.3.243 and 11.4.3.244.
> ==========================================
> [root@sw241 ~]# cat /proc/fs/lustre/mds/datafs-MDT0000/recovery_status
> status: INACTIVE
> ==========================================
> cat /proc/fs/lustre/devices:
>
> [root@sw241 ~]# cat /proc/fs/lustre/devices
>   0 UP mgs MGS MGS 9
>   1 UP mgc MGC11.4.3.241@o2ib a4bcf80c-5a47-b79d-326a-6520a237ce4a 5
>   2 UP mdt MDS MDS_uuid 3
>   3 UP lov datafs-mdtlov datafs-mdtlov_UUID 4
>   4 UP mds datafs-MDT0000 datafs-MDT0000_UUID 3
> [root@sw241 ~]#
> ==========================================
> [root@sw243 ~]# cat /proc/fs/lustre/devices
>   0 UP mgc MGC11.4.3.241@o2ib c44415f9-2bd0-420f-99a3-8792c190d45e 5
>   1 UP ost OSS OSS_uuid 3
>   2 UP obdfilter spfs-OST0000 spfs-OST0000_UUID 3
> [root@sw243 ~]#
> ==========================================
> [root@sw244 ~]# cat /proc/fs/lustre/devices
>   0 UP mgc MGC11.4.3.241@o2ib d4639175-68e7-9482-2edd-e5ee29e3af2a 5
>   1 UP ost OSS OSS_uuid 3
>   2 UP obdfilter spfs-OST0001 spfs-OST0001_UUID 3
> [root@sw244 ~]#
> ==========================================
> lctl ping:
>
> [root@sw242 ~]# lctl ping 11.4.3.241@o2ib
> 12345-0@lo
> 12345-11.4.3.241@o2ib
>
> [root@sw242 ~]# lctl ping 11.4.3.243@o2ib
> 12345-0@lo
> 12345-11.4.3.243@o2ib
>
> [root@sw242 ~]# lctl ping 11.4.3.244@o2ib
> 12345-0@lo
> 12345-11.4.3.244@o2ib
> ==========================================
> [root@sw241 ~]# lctl ping 11.4.3.242@o2ib
> 12345-0@lo
> 12345-11.4.3.242@o2ib
>
> [root@sw241 ~]# lctl ping 11.4.3.243@o2ib
> 12345-0@lo
> 12345-11.4.3.243@o2ib
>
> [root@sw241 ~]# lctl ping 11.4.3.244@o2ib
> 12345-0@lo
> 12345-11.4.3.244@o2ib
> ==========================================
>
> Your help is appreciated.
>
> Best Regards,
> Alberto.
>
> From: Aaron Knister [mailto:aaron at iges.org]
> Sent: Sunday, December 30, 2007 4:05 AM
> To: Albert Ozilov
> Cc: lustre-discuss at clusterfs.com
> Subject: Re: [Lustre-discuss] Can someone help?
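
As a rough illustration of the diagnosis and the three steps in this message, the shell sketch below extracts the filesystem names from a device listing and prints (without running) the rename commands for one OST. The function names, the device path and the mount point are hypothetical. It assumes that target names appear in the fourth column in the form <fsname>-MDT0000 or <fsname>-OST0001, as in the listings quoted in this thread, and that the OST is remounted with "mount -t lustre".

```shell
# Print the distinct filesystem names found in "lctl dl" style output on stdin.
# Assumption: column 4 holds names such as datafs-MDT0000 or spfs-OST0001.
fsnames_in_devices() {
    awk '$4 ~ /-(MDT|OST)[0-9a-fA-F]+$/ {
             sub(/-(MDT|OST)[0-9a-fA-F]+$/, "", $4)   # strip the target suffix
             print $4
         }' | sort -u
}

# Print the commands for the three steps above; nothing is executed.
# $1 = desired fsname, $2 = OST block device, $3 = OST mount point
# (the device and mount point are placeholders to be replaced).
print_rename_plan() {
    echo "umount $3"
    echo "tunefs.lustre --fsname=$1 $2"
    echo "mount -t lustre $2 $3"
}
```

Running `lctl dl | fsnames_in_devices` on each server should then print a single, identical name everywhere; with the listings quoted in this thread, the MDS would print datafs and the two OSSs would print spfs. `print_rename_plan datafs /device/of/ost /mnt/ost` only echoes the commands so that they can be reviewed before being run by hand.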

Aaron,

Thanks a lot. It's working like a tiger now :)

Best Regards,
Alberto.

________________________________
From: Aaron Knister [mailto:aaron at iges.org]
Sent: Sunday, December 30, 2007 4:16 PM
To: Albert Ozilov
Cc: Lustre-discuss
Subject: Re: [Lustre-discuss] Can someone help?