Daire Byrne wrote:
> First may I congratulate all the developers on a very impressive piece of
> software. I've been keeping track of Lustre for over 6 months and with
> the 1.0 release have started testing it in earnest.
Thank you! There are many advanced file system features which Lustre
does not yet have, but we are happy to hear that it is meeting some of
your needs.
> However I have a couple of problems that I haven't quite been able to
> figure out in my little head. Firstly the /etc/init.d/lustre startup
> script often complains about an "unknown host entry", i.e. Lustre doesn't
> know what "node" is trying to start up. I know it's a configuration
> problem as I don't get this problem if I use the lwizard script. If I
> manually edit the init script to specify which node is starting, it's all
> okay. To create the config xml I do something like this:
>
> -----------------
> # create nodes
> lmc -o $config --add net --node node1 --nid mds1 --nettype tcp
> lmc -m $config --add net --node node2 --nid ost1 --nettype tcp
> lmc -m $config --add net --node node3 --nid ost2 --nettype tcp
> lmc -m $config --add net --node node4 --nid llc1 --nettype tcp
> -----------------
> I wonder what I'm missing to allow the init script to figure out what
> node it is based on its hostname.
In general, unless you supply a specific node configuration (with the
"lconf --node foo" option), lconf will try to find a configuration by
comparing the hostname of the machine to the --node names.
If the hostname of your machine is "mds1", then running
lmc ... --node mds1 --nid mds1 ...
should provide better results. If I'm wrong, and your hostname is
node1, then we will have to debug further.
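So, for example, assuming your machines really are named mds1, ost1, ost2,
and llc1 (substitute your actual hostnames), the node section of your
script would become:

lmc -o $config --add net --node mds1 --nid mds1 --nettype tcp
lmc -m $config --add net --node ost1 --nid ost1 --nettype tcp
lmc -m $config --add net --node ost2 --nid ost2 --nettype tcp
lmc -m $config --add net --node llc1 --nid llc1 --nettype tcp

and lconf (which is what the init script calls) will then match the local
hostname against those --node names automatically.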
> # configure mds server
> lmc -m $config --add mds --node node1 --mds mds1 --fstype=ext3 --dev $MDSDEV || exit 20
As an aside, there is a bug in Lustre 1.0.1 which confuses lmc if you
name your MDS service (--mds foo) the same as your MDS hostname. You
can call it anything else, it's just an administrative handle, as long
as you use the same name when configuring your LOV and clients.
This bug is filed as issue #2103, and will be fixed in Lustre 1.0.2.
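In the meantime you can work around it by picking any other service name.
For example (the name "mds-l1" is arbitrary; just use the same name in
your --add lov and --add mtpt lines):

lmc -m $config --add mds --node node1 --mds mds-l1 --fstype=ext3 --dev $MDSDEV || exit 20

and the 0-config mount target would then become mds1:/mds-l1/<client-profile>.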
> # create client config
> lmc -m $config --add mtpt --node node4 --path /mnt/lustre --mds mds1 --lov lov1 || exit 40
>
> I'm a little confused by the 0-config setup. Does this mean that I can use
> a single --nid '*' option for a client and then mount the Lustre
> filesystem on multiple clients? Or just the one client at a time?
> Obviously needing an entry for every possible client in our company would
> be unworkable. It seems the lwizard script just creates one client (with
> the '*' option?).
If you run
lmc ... --node client --nid '*' ...
for example, then you can start any client with
lconf --node client foo.xml
When we refer to 0-config, we are referring to the code which lets you
just run "mount" on the clients, and not use lconf at all -- which you
allude to below.
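Concretely, the wildcard client is just one more node in the config --
something like this (the profile name "client" is only an example):

lmc -m $config --add net --node client --nid '*' --nettype tcp
lmc -m $config --add mtpt --node client --path /mnt/lustre --mds mds1 --lov lov1 || exit 40

Every machine that should act as a client then runs the same
"lconf --node client config.xml", or skips lconf entirely and uses the
0-config mount.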
> In the howto it details the mount command options. What exactly is the
> profile? "mdshost:/mdsname/client-profile". So in my example
above I would
> do:
>
> mount -t lustre -o nettype=tcp mds1:/mds1/client-profile /mnt/lustre
>
> I've tried many things for "client-profile" but I always get a "bad
> superblock or wrong mount options" type message. I have the correct
> modules.conf entries.
The key to debugging most Lustre events lies in the kernel messages
found in "dmesg" and /var/log/messages. Without those messages from
the
client (and perhaps the MDS), I will have a hard time helping.
This might be caused by the bug to which I referred above. You may not
be doing anything wrong. In your example, with:
lmc -m $config --add mtpt --node node4 --path /mnt/lustre --mds mds1 --lov lov1 || exit 40
I would have expected mds1:/mds1/node4 to work -- bug 2103 notwithstanding.
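In other words, with your configuration the mount command should look like
this, where the last path component is the --node name of the client
profile (not the mount point or the LOV name):

mount -t lustre -o nettype=tcp mds1:/mds1/node4 /mnt/lustre

If that still fails, the dmesg output from the client is the first thing
to look at.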
> And finally about the active-active OST server failover - does this mean
> that if I have a raid array that writes at 200M/s connected to two GigE
> servers that I can expect both to write at 100M/s simultaneously (not to
> the same files)? A way to improve performance whilst reducing the cost of
> buying raid arrays?
Assuming that all of the hardware in the stack is capable, yes, you can
connect one fast raid array to multiple OST nodes. On the NCSA Tungsten
cluster, for example, one DataDirect hardware raid device provides
backend storage for eight OST nodes.
In your example, by striping a single file across both OSTs, you can in
fact write at 200 MB/s simultaneously to one file, from multiple clients.
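The striping itself is set when you add the LOV to your config. A sketch,
with the flag names as I remember them from 1.0's lmc (please check them
against "lmc --help"; the stripe size is only an example, and $OSTDEV1 and
$OSTDEV2 stand in for your real devices):

lmc -m $config --add lov --lov lov1 --mds mds1 --stripe_sz 1048576 --stripe_cnt 2 --stripe_pattern 0
lmc -m $config --add ost --node node2 --lov lov1 --fstype ext3 --dev $OSTDEV1
lmc -m $config --add ost --node node3 --lov lov1 --fstype ext3 --dev $OSTDEV2

A stripe count of 2 spreads each file across both OSTs, which is what gives
you the aggregate 200 MB/s to a single file.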
Hope this helps--
-Phil