Matt Hollingsworth
2007-Feb-14 22:18 UTC
[Lustre-discuss] Strange Problem When Mounting Lustre Filesystem
Hello,

I'm sorry if this is a double post, but the message bounced back to me, and I'm not sure whether it went through or not.

I have spent the last couple of months designing the file system for a scientific cluster that I am helping administer. After a good deal of testing, we are finally ready to put our setup to use. It is a slightly unusual setup, so I'll start off by explaining what we are doing.

We rebuilt the kernel from the kernel-source-2.6.9-42.0.2.EL_lustre.1.4.7.1.x86_64.rpm package in order to slim down the features and to add root-over-NFS support, and then built lustre-1.4.8 against that kernel. We have one head node that is the MDS as well as the boot server (it exports the root file system and runs tftpd). All of the other nodes boot off that server; those other (slave) nodes are the OSSs. I use this script to generate the config file:

#############################
# cms-lustre-config.sh
#############################
#!/bin/bash

rm cluster-production.xml

#-----------------
# Create the nodes
#-----------------
lmc -m cluster-production.xml --add node --node osg1
lmc -m cluster-production.xml --add net --node osg1 --nid 10.0.0.243@tcp0 --nettype lnet
lmc -m cluster-production.xml --add node --node node253
lmc -m cluster-production.xml --add net --node node253 --nid 10.0.0.253@tcp0 --nettype lnet
lmc -m cluster-production.xml --add node --node node252
lmc -m cluster-production.xml --add net --node node252 --nid 10.0.0.252@tcp0 --nettype lnet
lmc -m cluster-production.xml --add node --node node251
lmc -m cluster-production.xml --add net --node node251 --nid 10.0.0.251@tcp0 --nettype lnet
lmc -m cluster-production.xml --add node --node node250
lmc -m cluster-production.xml --add net --node node250 --nid 10.0.0.250@tcp0 --nettype lnet
lmc -m cluster-production.xml --add node --node node249
lmc -m cluster-production.xml --add net --node node249 --nid 10.0.0.249@tcp0 --nettype lnet
lmc -m cluster-production.xml --add node --node client
lmc -m cluster-production.xml --add net --node client --nid '*' --nettype lnet

#--------------
# Configure MDS
#--------------
lmc -m cluster-production.xml --add mds --node osg1 --mds cms-mds --fstype ldiskfs --dev /dev/sdb

#---------------
# Configure OSTs
#---------------
lmc -m cluster-production.xml --add lov --lov cms-lov --mds cms-mds --stripe_sz 1048576 --stripe_cnt 0 --stripe_pattern 0

# Head node (currently unused)
#lmc -m cluster-production.xml --add ost --node osg1 --lov cms-lov --ost node001-ost --fstype ldiskfs --dev /dev/sdc

# Compute nodes
# node253
lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sda --fstype ldiskfs --dev /dev/sda
lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sdb --fstype ldiskfs --dev /dev/sdb
lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sdc --fstype ldiskfs --dev /dev/sdc
lmc -m cluster-production.xml --add ost --node node253 --lov cms-lov --ost node253-ost-sdd --fstype ldiskfs --dev /dev/sdd

# node252
lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sda --fstype ldiskfs --dev /dev/sda
lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sdb --fstype ldiskfs --dev /dev/sdb
lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sdc --fstype ldiskfs --dev /dev/sdc
lmc -m cluster-production.xml --add ost --node node252 --lov cms-lov --ost node252-ost-sdd --fstype ldiskfs --dev /dev/sdd

# node251
lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sda --fstype ldiskfs --dev /dev/sda
lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sdb --fstype ldiskfs --dev /dev/sdb
lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sdc --fstype ldiskfs --dev /dev/sdc
lmc -m cluster-production.xml --add ost --node node251 --lov cms-lov --ost node251-ost-sdd --fstype ldiskfs --dev /dev/sdd

# node250
lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sda --fstype ldiskfs --dev /dev/sda
lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sdb --fstype ldiskfs --dev /dev/sdb
lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sdc --fstype ldiskfs --dev /dev/sdc
lmc -m cluster-production.xml --add ost --node node250 --lov cms-lov --ost node250-ost-sdd --fstype ldiskfs --dev /dev/sdd

# node249
lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sda --fstype ldiskfs --dev /dev/sda
lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sdb --fstype ldiskfs --dev /dev/sdb
lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sdc --fstype ldiskfs --dev /dev/sdc
lmc -m cluster-production.xml --add ost --node node249 --lov cms-lov --ost node249-ost-sdd --fstype ldiskfs --dev /dev/sdd

#------------------
# Configure client
#------------------
lmc -m cluster-production.xml --add mtpt --node client --path /mnt/cms-lustre --mds cms-mds --lov cms-lov

cp cluster-production.xml /cluster-images/rootfs-SL4-x86_64/root/lustre-config/
#############################
# end
#############################

Now I run

lconf --reformat --node <node name> cluster-production.xml

on each node and wait a while for everything to format. It all completes without error.

I then run

mount.lustre 10.0.0.243:/cms-mds/client /mnt/cms-lustre

on the head node (osg1). That also works fine. I've run a number of tests, and it works well (really well, in fact).

The problem occurs when I attempt to mount the file system on the slave nodes. When I run the same command there, I get the following:

[root@node253 ~]# mount.lustre 10.0.0.243:/cms-mds/client /var/writable/cms-lustre/
mount.lustre: mount(10.0.0.243@tcp0:/cms-mds/client, /var/writable/cms-lustre/) failed: No such device
mds nid 0: 10.0.0.243@tcp
mds name:  cms-mds
profile:   client
options:
retry:     0
Are the lustre modules loaded?
Check /etc/modprobe.conf and /proc/filesystems
[root@node253 ~]#

and this pops up in the error log:

Feb 14 04:50:07 localhost kernel: LustreError: 6053:0:(genops.c:224:class_newdev()) OBD: unknown type: osc
Feb 14 04:50:07 localhost kernel: LustreError: 6053:0:(obd_config.c:102:class_attach()) Cannot create device OSC_osg1.<mydomain>_node253-ost-sda_MNT_client-000001011f659c00 of type osc : -19
Feb 14 04:50:07 localhost kernel: LustreError: mdc_dev: The configuration 'client' could not be read from the MDS 'cms-mds'. This may be the result of communication errors between the client and the MDS, or if the MDS is not running.
Feb 14 04:50:07 localhost kernel: LustreError: 6053:0:(llite_lib.c:936:lustre_fill_super()) Unable to process log: client

Any idea what's going on here?

Thanks a bunch for the help.

-Matt
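The mount helper's own hint ("Are the lustre modules loaded?") points at the first thing to verify on a failing node. A minimal check sequence, assuming the stock Lustre 1.4 client module names (libcfs, lnet, osc, mdc, lov, lustre); exact names can differ per build:

# Is the 'lustre' filesystem type registered with the running kernel?
grep lustre /proc/filesystems

# Which pieces of the Lustre stack are currently loaded?
lsmod | egrep 'lustre|llite|lov|mdc|osc|lnet|libcfs'

# Try loading the client stack by hand; modprobe resolves dependencies,
# provided depmod -a has been run for this kernel.
modprobe lustre || echo 'load failed; check dmesg for symbol mismatches'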
Aaron Knister
2007-Feb-15 09:14 UTC
[Lustre-discuss] Strange Problem When Mounting Lustre Filesystem
Try this:

lmc -m cluster-production.xml --add node --node client --nid '*'@tcp --nettype tcp

It might complain because the node "client" has already been added to your config file, but I'm not sure. Once you've run the above command, try remounting your Lustre fs.

-Aaron

Matt Hollingsworth wrote:
> [original message quoted in full; snipped]

--
"Computers are incredibly fast, accurate and stupid; humans are incredibly slow, inaccurate and brilliant; together they are powerful beyond imagination." --Albert Einstein

Aaron Knister
Center for Research on Environment and Water
4041 Powder Mill Road, Suite 302; Calverton MD 20705
Office: (240) 247-1456
Fax: (301) 595-9790
http://crew.iges.org
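Aaron's command rewrites the "client" node's network entry. Since lmc writes a plain XML file, one way to sanity-check the regenerated entry before remounting is a simple grep (element and attribute names vary by lmc version, so this is only a rough inspection aid):

# Eyeball the client node's entries in the regenerated config.
grep -B1 -A3 'client' cluster-production.xml | less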
Nathaniel Rutman
2007-Feb-15 09:34 UTC
[Lustre-discuss] Strange Problem When Mounting Lustre Filesystem
Matt Hollingsworth wrote:
> [root@node253 ~]# mount.lustre 10.0.0.243:/cms-mds/client /var/writable/cms-lustre/
> mount.lustre: mount(10.0.0.243@tcp0:/cms-mds/client, /var/writable/cms-lustre/) failed: No such device
> mds nid 0: 10.0.0.243@tcp
> mds name:  cms-mds
> profile:   client
> options:
> retry:     0
> Are the lustre modules loaded?
^^^^^^^^^^^ Hint #1
> Check /etc/modprobe.conf and /proc/filesystems
> [root@node253 ~]#
>
> and this pops up in the error log:
>
> Feb 14 04:50:07 localhost kernel: LustreError: 6053:0:(genops.c:224:class_newdev()) OBD: unknown type: osc
^^^^^^^^^^^ Hint #2
> Feb 14 04:50:07 localhost kernel: LustreError: 6053:0:(obd_config.c:102:class_attach()) Cannot create device OSC_osg1.<mydomain>_node253-ost-sda_MNT_client-000001011f659c00 of type osc : -19
> Feb 14 04:50:07 localhost kernel: LustreError: mdc_dev: The configuration 'client' could not be read from the MDS 'cms-mds'. This may be the result of communication errors between the client and the MDS, or if the MDS is not running.
> Feb 14 04:50:07 localhost kernel: LustreError: 6053:0:(llite_lib.c:936:lustre_fill_super()) Unable to process log: client
>
> Any idea what's going on here?

The 'osc' module must be loaded on the MDS and clients. It seems that it's not, at least on this client. You should be able to modprobe it by hand and make sure it loads - see if there's maybe a symbol mismatch.
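Nathaniel's advice, spelled out as a rough sketch (the grep pattern for symbol errors is illustrative, not an exact kernel message):

# Load the missing module by hand on the failing client.
modprobe osc && echo 'osc loaded'

# If it fails, look for unresolved-symbol messages, which usually mean
# the modules were built against a different kernel than is running.
dmesg | tail -30 | grep -i symbol

# Compare the running kernel against the module trees actually installed.
uname -r
ls /lib/modules/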
Matt Hollingsworth
2007-Feb-15 10:30 UTC
[Lustre-discuss] Strange Problem When Mounting Lustre Filesystem
Aaron,

Thanks for the reply. That didn't seem to work... it still gives me the whole "No such device" error. Same thing in /var/log/messages too.

Thanks!

-Matt

-----Original Message-----
From: Aaron Knister [mailto:aaron@cola.iges.org]
Sent: Thursday, February 15, 2007 11:14 AM
To: Matt Hollingsworth
Cc: lustre-discuss@clusterfs.com
Subject: Re: [Lustre-discuss] Strange Problem When Mounting Lustre Filesystem

[quoted message snipped]
Matt Hollingsworth
2007-Feb-15 10:35 UTC
[Lustre-discuss] Strange Problem When Mounting Lustre Filesystem
Nathaniel,

Thank you!! That was it... I'm glad it was so simple :). Thanks again for the file system; it's perfect for our setup.

-Matt

-----Original Message-----
From: Nathaniel Rutman [mailto:nathan@clusterfs.com]
Sent: Thursday, February 15, 2007 11:34 AM
To: Matt Hollingsworth
Cc: lustre-discuss@clusterfs.com
Subject: Re: [Lustre-discuss] Strange Problem When Mounting Lustre Filesystem

[quoted message snipped]
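For anyone landing here with the same symptom: once the by-hand modprobe works, the usual next step is making the module load survive a reboot. A sketch under common Lustre 1.4 assumptions (tcp0 on eth0 is a placeholder; substitute your actual interface):

# /etc/modprobe.conf -- standard lnet network option
options lnet networks=tcp0(eth0)

# e.g. in /etc/rc.local or an init script, before any Lustre mounts:
modprobe lustre   # pulls in libcfs, lnet, osc, mdc, lov, etc. via dependencies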