On Mon, 2004-05-03 at 15:05, nima@amy.udd.htu.se wrote:
> This is copied to all nodes and the ip value is changed depending on the
> hosts listed in /etc/hosts, since nodes recognize other nodes by
> different ips due to the fully meshed network configuration, and the
> client part is commented out depending on which node I run this script
> on. After running the script I use:
>   lconf --reformat --gdb --node <node_name> ng.xml
> The result is that all OST/client nodes fail with the message:
>   mount failed: /home/scratch/lustre : mount: wrong fs type, bad option,
>   bad superblock on ng, or too many mounted file systems
>
> The node running the MDS exits with:
>   MDSDEV: mds1 mds1_UUID /home/scratch/mds ext3 10000000 yes
>   ! /usr/sbin/lctl (110): error: setup: Connection timed out

You need to take steps to ensure that the OSTs get started first, then
the MDS, then the clients. I believe that Lustre 1.2.x client mounts
will retry if they don't succeed right away, but with 1.0.x you'll have
to be more clever.

You have two choices:

1. You can start the OSTs with "lconf --maxlevel 40", then start the
   MDS, then finish with "lconf --minlevel 50" to start the client.

2. Remove all of the "--add mtpt" lines, and replace them with:

   lmc -m ng.xml --add net --node client --nid '*' --nettype tcp
   lmc -m ng.xml --add mtpt --node client --path /home/scratch/lustre --mds mds1 --lov lov1

   Then start your OSTs and MDS as usual. On the client nodes, run:

   mount -t lustre ng2:/mds1/client /home/scratch/lustre

See section 1.6 of https://wiki.clusterfs.com/lustre/LustreHowto for
more information, including how to set up your /etc/modules.conf.

Please let us know if that doesn't help!

-Phil
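For a four-node setup like the one quoted above, option 1 might be scripted roughly as follows. This is only a sketch of the ordering, not tested commands: the ssh fan-out, the node names (ng1-ng4 as in this thread), and running the sequence from a single admin host are all assumptions.

```shell
#!/bin/sh
# Sketch of option 1: staged startup so OSTs come up before the MDS,
# and client mounts come up last. Node roles follow this thread:
# OSTs on ng1/ng3/ng4, MDS on ng2, clients on all four nodes.
CONF=ng.xml

# 1. Start only the OST services (device levels up to 40) on the OST nodes.
for node in ng1 ng3 ng4; do
    ssh "$node" "lconf --reformat --maxlevel 40 --node $node $CONF"
done

# 2. Start the MDS once the OSTs are running.
ssh ng2 "lconf --reformat --node ng2 $CONF"

# 3. Finish by starting the client mounts (levels 50 and up) everywhere.
for node in ng1 ng2 ng3 ng4; do
    ssh "$node" "lconf --minlevel 50 --node $node $CONF"
done
```

The point is simply the ordering: each OST must be reachable before the MDS tries to connect to it, and clients must not attempt the mount until both are up.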
On May 03, 2004 21:05 +0200, nima@amy.udd.htu.se wrote:
> $lmc -m ng.xml --add lov --lov lov1 --mds mds1 --stripe_sz 4096
> --stripe_cnt 0 --stripe_pattern 0

You shouldn't use such a small stripe size; it will not give very good
performance. For 1.0 we recommend at least 64kB, and for 1.2 we
recommend at least 512kB or 1MB for best performance.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/
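Applied to the script quoted in this thread, the lov line with a 1MB stripe would look roughly like the sketch below. It assumes --stripe_sz is given in bytes, as the original value of 4096 suggests; check your lmc documentation before relying on that.

```shell
# Sketch: same lov definition as in the original script, but with the
# stripe size raised from 4096 bytes to 1MB (1048576 bytes), per the
# recommendation above for Lustre 1.2.
lmc -m ng.xml --add lov --lov lov1 --mds mds1 \
    --stripe_sz 1048576 --stripe_cnt 0 --stripe_pattern 0
```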
Thank you, guys. I used lwizard to generate a script and then edited the
file. Letting the OSTs start first and changing the routing table fixed
the problem with this new config, and the stripe size is now set to 64k.
All nodes could mount the lov with no problems at all.

Best regards,
/Nima
Dear list members,

I'm new to this list and to Lustre. A few weeks ago I started working on
a project at my university, which is about testing I/O-intensive cluster
applications on different file systems to measure their performance.

The cluster used in this project consists of 4 nodes: ng1, ng2, ng3 and
ng4. Applications run on ng1 and the other nodes are used as computing
nodes. Each node is equipped with 3 nics beside the embedded nic on the
motherboard (used only on the management node for the internet
connection), and these nodes form a fully meshed gigabit ethernet LAN.

Let me briefly explain my goal in running Lustre; please tell me if I'm
doing something wrong. Using Lustre, I want the application's output
files to be striped onto a volume, e.g. /home/scratch/lustre, consisting
of 3 OSTs (ng1, ng3, ng4), with the MDS on ng2. Since all of these nodes
are going to be computing nodes, the client will run on all 4.

I started with a very simple script to generate an xml file, as follows:

---------------------------------------------------------------
#!/bin/sh

lmc=/usr/sbin/lmc
lconf=/usr/sbin/lconf

#create nodes
$lmc -o ng.xml --add net --node ng1 --nid $ng1_ip --nettype tcp
$lmc -m ng.xml --add net --node ng2 --nid $ng2_ip --nettype tcp
$lmc -m ng.xml --add net --node ng3 --nid $ng3_ip --nettype tcp
$lmc -m ng.xml --add net --node ng4 --nid $ng4_ip --nettype tcp

#configure MDS
$lmc -m ng.xml --format --add mds --node ng2 --mds mds1 --fstype ext3 --dev /home/scratch/mds --size=10000000

#configure OST
$lmc -m ng.xml --add lov --lov lov1 --mds mds1 --stripe_sz 4096 --stripe_cnt 0 --stripe_pattern 0
$lmc -m ng.xml --add ost --node ng1 --lov lov1 --ost ost1 --fstype ext3 --dev /home/scratch/ost1 --size=10000000
$lmc -m ng.xml --add ost --node ng3 --lov lov1 --ost ost2 --fstype ext3 --dev /home/scratch/ost2 --size=10000000
$lmc -m ng.xml --add ost --node ng4 --lov lov1 --ost ost3 --fstype ext3 --dev /home/scratch/ost3 --size=10000000

#configure clients
$lmc -m ng.xml --add mtpt --node ng1 --path /home/scratch/lustre --mds mds1 --lov lov1
#$lmc -m ng.xml --add mtpt --node ng2 --path /home/scratch/lustre --mds mds1 --lov lov1
#$lmc -m ng.xml --add mtpt --node ng3 --path /home/scratch/lustre --mds mds1 --lov lov1
#$lmc -m ng.xml --add mtpt --node ng4 --path /home/scratch/lustre --mds mds1 --lov lov1
-----------------------------------------------------------------------

This script is copied to all nodes, and the ip values are changed
depending on the hosts listed in /etc/hosts, since nodes recognize other
nodes by different ips due to the fully meshed network configuration;
the client part is commented out depending on which node I run the
script on. After running the script I use:

lconf --reformat --gdb --node <node_name> ng.xml

The result is that all OST/client nodes fail with the message:

mount failed: /home/scratch/lustre : mount: wrong fs type, bad option,
bad superblock on ng, or too many mounted file systems

The node running the MDS exits with:

MDSDEV: mds1 mds1_UUID /home/scratch/mds ext3 10000000 yes
! /usr/sbin/lctl (110): error: setup: Connection timed out

Please help me understand how to correct these errors and successfully
run Lustre to achieve the goal.

Thanks in advance,
Yours /Nima