Phil Schwan
2006-May-19 07:36 UTC
[Lustre-discuss] Would Lustre benefit me in this situation?
Brent M. Clements wrote:

> Hypothetically, here is the situation:
>
> We would like to build a Linux cluster that has all of its compute
> nodes and master nodes attached to a fibre channel SAN. That SAN has one
> SAN storage device with over 25TB of data. The SAN storage device is
> pretty high end.
>
> Would Lustre even benefit me in this situation, since the SAN would give me
> the ability to do shared filesystems, and all of the nodes are directly
> connected to the storage device via the SAN? Or does Lustre give me value
> add if I add it as another layer in this configuration?

I think the best answer is that it depends on your cluster size and workload. In many SAN file systems, blocks or inodes are locked individually by the client nodes as they are updated. Some of these blocks and inodes see very high contention, and these are precisely the bottlenecks that Lustre was designed to avoid. This is not a big deal on 5 nodes; it is a very big deal on 1000 nodes.

Lustre may or may not add value in terms of recovery, management, or performance. That depends on the cluster size, the workload, and which file systems it's being compared against. How POSIX-compliant do you require your file system to be? Are single points of failure OK? Do you need to be able to add capacity easily? What metadata and I/O performance do you require? These are the kinds of questions I think you will have to answer.

I'm curious to hear what you decide!

-Phil
Kumaran Rajaram wrote:

> I was also able to successfully build and configure Lustre. However,
> when trying to install Lustre (Client, MDS, and OST on single node), I get
> the following error:
...
>   728  lmc -o single.xml --add node --node lustre0
>   729  lmc -m single.xml --add net --node lustre0 --nid 10.0.1.45 --nettype tcp
>   730  lmc -m single.xml --add mds --node lustre0 --mds mds1 --fstype ext3 --dev /tmp/mds1 --size 50000
>   731  lmc -m single.xml --add ost --node lustre0 --ost ost1 --fstype ext3 --dev /tmp/ost1 --size 100000
>   732  lmc -m single.xml --add mtpt --node lustre0 --path /mnt/lustre --mds mds1 --ost ost1
>
> Any help to resolve this would be greatly appreciated.

Lustre 1.0.x requires an LOV, even for a single-OST configuration. One of the improvements made in Lustre 1.0.3 is that lmc will automatically add the LOV for you if you forget, in a single-OST configuration.

-Phil
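For a 1.0.2 configuration like the one quoted above, the usual fix is to describe an LOV in the config and hang the OST and mount point off it. A rough sketch, reusing the node and target names from the quoted commands; the stripe values are only illustrative and the exact lmc option spellings may vary slightly between 1.0.x releases:

    # add an LOV attached to mds1 (stripe size/count below are example values)
    lmc -m single.xml --add lov --lov lov1 --mds mds1 \
        --stripe_sz 65536 --stripe_cnt 0 --stripe_pattern 0
    # the OST is added to the LOV rather than referenced directly
    lmc -m single.xml --add ost --node lustre0 --lov lov1 --ost ost1 \
        --fstype ext3 --dev /tmp/ost1 --size 100000
    # the client mount point then references the LOV instead of the bare OST
    lmc -m single.xml --add mtpt --node lustre0 --path /mnt/lustre --mds mds1 --lov lov1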
Peter Braam
2006-May-19 07:36 UTC
[Lustre-discuss] Would Lustre benefit me in this situation?
On Tue, Jan 20, 2004 at 12:34:46AM -0500, Phil Schwan wrote:

> Brent M. Clements wrote:
> > Hypothetically, here is the situation:
> >
> > We would like to build a Linux cluster that has all of its compute
> > nodes and master nodes attached to a fibre channel SAN. That SAN has one
> > SAN storage device with over 25TB of data. The SAN storage device is
> > pretty high end.
> >
> > Would Lustre even benefit me in this situation, since the SAN would give me
> > the ability to do shared filesystems, and all of the nodes are directly
> > connected to the storage device via the SAN? Or does Lustre give me value
> > add if I add it as another layer in this configuration?

In the longer term we are planning to allow Lustre to use SANs for data transport to/from OSTs (so it would bypass the OST server for the data portion, but still use it for metadata stored on the OST, such as allocation data). We call this driver the SAN OST, and a prototype was built a year ago. However, this is a pretty complicated project due to recovery issues.

I would not really recommend extending the SAN to the clients, as one can probably get comparable or better performance with two GigE links at a fraction of the price. Putting all the servers (OSSs and MDSs) on a SAN can be beneficial for failover configurations.

- Peter -
Hi,

I am trying to install Lustre 1.0.2 on a test cluster here and started off with a single-node installation. The kernel build and installation went smoothly; the system info is as follows:

    lustre0root/usr/src/lustre-1.0.2/tests$uname -a
    Linux lustre0 2.4.20-28.9_lustre.1.0.2custom #2 SMP Tue Jan 20 22:06:47 CST 2004 i686 i686 i386 GNU/Linux

I was also able to successfully build and configure Lustre. However, when trying to install Lustre (client, MDS, and OST on a single node), I get the following error:

    lustre0root/usr/src/lustre-1.0.2/tests$lconf --reformat --gdb single.xml
    loading module: portals srcdir None devdir libcfs
    loading module: ksocknal srcdir None devdir knals/socknal
    loading module: lvfs srcdir None devdir lvfs
    loading module: obdclass srcdir None devdir obdclass
    loading module: ptlrpc srcdir None devdir ptlrpc
    loading module: ost srcdir None devdir ost
    loading module: fsfilt_ext3 srcdir None devdir lvfs
    loading module: obdfilter srcdir None devdir obdfilter
    loading module: mdc srcdir None devdir mdc
    loading module: osc srcdir None devdir osc
    loading module: lov srcdir None devdir lov
    loading module: mds srcdir None devdir mds
    loading module: llite srcdir None devdir llite
    The GDB module script is in /tmp/ogdb-lustre0
    NETWORK: NET_lustre0_tcp NET_lustre0_tcp_UUID tcp 10.0.1.45 988
    OSD: ost1 ost1_UUID obdfilter /tmp/ost1 100000 ext3 no 0
    MDSDEV: mds1 mds1_UUID /tmp/mds1 ext3 no
    recording clients for filesystem: FS_fsname_UUID
    Recording log mds1 on mds1
    OSC: OSC_lustre0_ost1_mds1 c734f_mds1_b7cf96504a ost1_UUID
    End recording log mds1 on mds1
    Recording log mds1-clean on mds1
    OSC: OSC_lustre0_ost1_mds1 c734f_mds1_b7cf96504a
    End recording log mds1-clean on mds1
    MDSDEV: mds1 mds1_UUID /tmp/mds1 ext3 50000 no
    ! /usr/sbin/lctl (22): error: setup: Invalid argument

I have attached the config xml file. The installation of previous Lustre versions had been successful for a variety of configurations, and I am not sure what I am missing. The /var/log/messages content is as follows:

    Jan 21 00:00:10 lustre0 kernel: Lustre: 2310:(fsfilt_ext3.c:812:fsfilt_ext3_setup()) Enabling PDIROPS
    Jan 21 00:00:10 lustre0 acceptor[2007]: Accepted host: lustre0.mpi-softtech.com snd: 16777216 rcv 16777216 nagle: disabled
    Jan 21 00:00:10 lustre0 last message repeated 2 times
    Jan 21 00:00:10 lustre0 kernel: Lustre: 1972:(socknal_cb.c:1534:ksocknal_process_receive()) [c5402000] EOF from 0xa00012d ip 0a00012d:1029
    Jan 21 00:00:10 lustre0 kernel: LustreError: 2314:(obd_config.c:285:class_cleanup()) Device 3 not setup

And lctl device_list shows:

    lustre0root/usr/src/lustre-1.0.2/tests$lctl device_list
      0 UP obdfilter ost1 ost1_UUID 2
      1 UP ost OSS OSS_UUID 2
      2 UP mdt MDT MDT_UUID 2

I tried the llmount.sh script, but the installation hangs because I have 127.0.0.1 in /etc/hosts, so I instead created single.xml with the real IP address.
    728  lmc -o single.xml --add node --node lustre0
    729  lmc -m single.xml --add net --node lustre0 --nid 10.0.1.45 --nettype tcp
    730  lmc -m single.xml --add mds --node lustre0 --mds mds1 --fstype ext3 --dev /tmp/mds1 --size 50000
    731  lmc -m single.xml --add ost --node lustre0 --ost ost1 --fstype ext3 --dev /tmp/ost1 --size 100000
    732  lmc -m single.xml --add mtpt --node lustre0 --path /mnt/lustre --mds mds1 --ost ost1

Any help to resolve this would be greatly appreciated.

Thanks,
-Kums

[Attachment: single.xml]
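If a failed setup leaves devices behind (as in the lctl device_list output above), it is usually worth tearing the partial configuration down before retrying with a corrected single.xml. A minimal sketch, assuming lconf from the same 1.0.x release; exact option behaviour may vary slightly between releases:

    # stop and unload whatever the failed run left configured
    lconf --cleanup --force single.xml
    # regenerate single.xml (including an LOV, as noted in the earlier reply),
    # then reformat and start again
    lconf --reformat single.xml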
Brent M. Clements
2006-May-19 07:36 UTC
[Lustre-discuss] Would Lustre benefit me in this situation?
Hypothetically, here is the situation:

We would like to build a Linux cluster that has all of its compute nodes and master nodes attached to a fibre channel SAN. That SAN has one SAN storage device with over 25TB of data. The SAN storage device is pretty high end.

Would Lustre even benefit me in this situation, since the SAN would give me the ability to do shared filesystems, and all of the nodes are directly connected to the storage device via the SAN? Or does Lustre give me value add if I add it as another layer in this configuration?

We are trying to figure out in what cluster I/O scenarios Lustre benefits us.

Thanks,
Brent Clements

P.S. Lustre 1.0 is very, very nice! Way to go!!!

Brent Clements
Linux Technology Specialist
Information Technology
Rice University