Sridharan Ramaswamy (srramasw)
2006-Dec-07 13:26 UTC
[Lustre-discuss] Lustre client as root filesystem
Hi,

Has anyone used a Lustre client as the root filesystem for diskless nodes?

I saw this old mail on using Lustre as a root filesystem:
https://mail.clusterfs.com/pipermail/lustre-discuss/2005-May/000702.html

Is the HOWTO document mentioned there available someplace? It seems this is supposed to work, with a few known problems (maybe those are fixed in recent releases).

Appreciate any help on this.

thanks,
Sridharan
------------------------------
Software Engineer
Cisco Systems, Inc
sridhar <at> cisco <dot> com
------------------------------
From: "Sridharan Ramaswamy \(srramasw\)" <srramasw@cisco.com> Date: Thu, 7 Dec 2006 12:25:51 -0800 Hi, Has anyone used Lustre client as root filesystem for diskless nodes? We have a test system in which diskless clients use lustre for their rootfs. I saw this old mail on using Lustre as root filesystem, https://mail.clusterfs.com/pipermail/lustre-discuss/2005-May/000702.html Is the HOWTO document mentioned in there available someplace? Seems it is supposed to work w/ few known problem(may be there are fixed in recent releases) I don''t know of such a howto, but what Andreas says there is the basic recipe. It''s slightly easier in 1.6 (because there are a few less extra pieces which must be included in the initrd) but the same basic ideas apply; put all the modules (all the ones you need, anyway) in your initrd, get them loaded from the init script, mount the fs in question, pivot root, and off you go. There are some other things you''ll probably want to pay attention to, for instance it''s a bad idea if all your clients share /var/log (!) but that kind of thing is the same as if you were netbooting using nfs or anything else; all easily handled from your init script.
hi,

We are also interested in using Lustre for this, so if anyone does try this out it would be very useful to have some notes/how-to for it.

Thanks
Rene

On 12/7/06 2:37 PM, "John R. Dunning" <jrd@sicortex.com> wrote:

> We have a test system in which diskless clients use Lustre for their rootfs.
> [...]
> put all the modules (all the ones you need, anyway) in your initrd, get them
> loaded from the init script, mount the fs in question, pivot_root, and off
> you go.

--
Rene Salmon
Tulane University
Center for Computational Science
http://www.ccs.tulane.edu
rsalmon@tulane.edu
Tel 504-862-8393
Fax 504-862-8392
From: Rene Salmon <rsalmon@tulane.edu>
Date: Thu, 07 Dec 2006 15:09:19 -0600

> hi,
>
> We are also interested in using Lustre for this, so if anyone does try this
> out it would be very useful to have some notes/how-to for it.

Would it be sufficient for your purposes to see an inventory of the initrd, and the init script I'm using?
Sridharan Ramaswamy (srramasw)
2006-Dec-07 14:40 UTC
[Lustre-discuss] Lustre client as root filesystem
Hi John,

Thanks for the rough outline. While I'm still brushing up on the basic initrd sequence, here are a few questions...

> > https://mail.clusterfs.com/pipermail/lustre-discuss/2005-May/000702.html
> > Is the HOWTO document mentioned there available someplace? It seems this
> > is supposed to work, with a few known problems (maybe those are fixed in
> > recent releases).
>
> I don't know of such a howto, but what Andreas says there is the basic
> recipe. It's slightly easier in 1.6 (because there are a few less extra
> pieces which must be included in the initrd) but the same basic ideas apply:
> put all the modules (all the ones you need, anyway) in your initrd,

By this you mean to include the Lustre modules that typically get loaded during lconf, right?

> get them loaded from the init script, mount the fs in question,
> pivot_root, and off you go.

I was not aware of this "pivot_root" until now. This is perfect!!

Again, if you have this basic recipe described in more detail, that would be great.

> There are some other things you'll probably want to pay attention to; for
> instance, it's a bad idea if all your clients share /var/log (!), but that
> kind of thing is the same as if you were netbooting using NFS or anything
> else; all easily handled from your init script.

Oh sure. I too was thinking that nodes shouldn't share dirs like /etc, /var/log, etc.

thanks,
Sridhar
------------------------------
Software Engineer
Cisco Systems, Inc
sridhar <at> cisco <dot> com
------------------------------
Hi Sridhar,

We set up Lustre as a root file system on a 28-node PPC 970 blade cluster (using CHRP instead of PXE) at the University of Colorado. I got it almost working with PXE boot as well, but ran into an issue with it (hang? can't remember) and got sidetracked. It worked rather well. Basic administration tasks such as logging in and accessing the root file system were quite slow, but our tests on a few distributed applications showed very little performance impact on running programs.

Here's a paper I presented on it at LCI 2006. We would love to hear your experiences setting it up, and hope this is useful. Let me know if we can be of any help here.

http://www.linuxclustersinstitute.org/Linux-HPC-Revolution/Archive/PDF06/35-Cope_J_final.pdf

-Adam

"Sridharan Ramaswamy \(srramasw\)" <srramasw@cisco.com> wrote:

> Has anyone used a Lustre client as the root filesystem for diskless nodes?
> [...]
Yes! That would be very helpful and greatly appreciated.

Thanks
Rene

On 12/7/06 3:38 PM, "John R. Dunning" <jrd@sicortex.com> wrote:

> Would it be sufficient for your purposes to see an inventory of the initrd,
> and the init script I'm using?
Sridharan Ramaswamy (srramasw)
2006-Dec-07 14:53 UTC
[Lustre-discuss] Lustre client as root filesystem
Thanks Adam. Sounds interesting. Sure, I'll keep this list posted when I try this out. John's offer to share his init scripts is great! Hopefully it will get me going.

We have both 1.4.7.3 and 1.6b installed in our evaluation systems. I'll probably use the latter to try this out.

- Sridhar

> -----Original Message-----
> From: lustre-discuss-bounces@clusterfs.com
> [mailto:lustre-discuss-bounces@clusterfs.com] On Behalf Of Adam Boggs
> Sent: Thursday, December 07, 2006 1:41 PM
> To: lustre-discuss@clusterfs.com
> Subject: Re: [Lustre-discuss] Lustre client as root filesystem
>
> We set up Lustre as a root file system on a 28-node PPC 970 blade
> cluster (using CHRP instead of PXE) at the University of Colorado.
> [...]
From: Rene Salmon <rsalmon@tulane.edu> Date: Thu, 07 Dec 2006 15:42:14 -0600 Yes! that would be very helpful and greatly appreciated. Here you go. We use pxeboot, which is documented elsewhere. I would think you could use this recipe for just about anything that allows for booting a kernel with an initrd. We have multiple rootfs images in the lustrefs, this particular version of the script is set up to mount number 3, thus the magic with the mount bind. Sorry for the terseness, I''m hip deep in another firedrill and don''t have time for lots of detail now. If there''s a specific question, fling it at me and I''ll answer if I can. root@gs105 rootfs # gunzip pxeinitrd.img.gz root@gs105 rootfs # mount -o loop pxeinitrd.img /mnt/initrd root@gs105 rootfs # ls -lR /mnt/initrd /mnt/initrd: total 45 drwxr-xr-x 2 root root 2048 Oct 10 16:25 bin drwxr-xr-x 2 root root 1024 Dec 18 2003 dev drwxr-xr-x 2 root root 1024 Oct 10 16:09 etc drwxr-xr-x 3 root root 1024 Oct 10 15:07 lib drwxr-xr-x 2 root root 1024 Oct 10 15:37 lib64 -rwxr-xr-x 1 root root 3727 Nov 10 11:27 linuxrc -rwxr-xr-x 1 root root 3871 Feb 20 2006 linuxrc.lustre-working -rwxr-xr-x 1 root root 3670 Feb 17 2006 linuxrc.nfs -rwxr-xr-x 1 root root 3727 Nov 10 07:54 linuxrc~ drwx------ 2 root root 12288 Mar 16 2006 lost+found drwxr-xr-x 2 root root 1024 Dec 17 2003 proc lrwxrwxrwx 1 root root 3 Mar 16 2006 sbin -> bin -rwxr-xr-x 1 root root 6800 Feb 20 2006 scx-client.sh drwxr-xr-x 2 root root 1024 Dec 17 2003 sysroot drwxr-xr-x 2 root root 1024 Dec 18 2003 tmp drwxr-xr-x 2 root root 1024 Dec 18 2003 usr /mnt/initrd/bin: total 2111 lrwxrwxrwx 1 root root 7 Mar 16 2006 [ -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 ash -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 basename -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 bunzip2 -> busybox -rwxr-xr-x 1 root root 1057048 Mar 17 2006 busybox -rwxr-xr-x 1 root root 819544 Dec 18 2003 busybox.1.00-pre4 lrwxrwxrwx 1 root root 7 Mar 16 2006 bzcat -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 cat -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 chgrp -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 chmod -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 chown -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 chroot -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 clear -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 cp -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 cut -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 date -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 dd -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 df -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 dirname -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 du -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 echo -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 egrep -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 env -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 false -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 fgrep -> busybox -rwxr-xr-x 1 root root 52348 Oct 25 2003 find lrwxrwxrwx 1 root root 7 Mar 16 2006 free -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 freeramdisk -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 grep -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 gunzip -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 halt -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 head -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 hostname -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 id -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 ifconfig -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 init -> busybox lrwxrwxrwx 1 root 
root 7 Mar 16 2006 insmod -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 ip -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 ipcalc -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 iproute -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 kill -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 killall -> busybox -rwxr-xr-x 1 root root 33765 Mar 17 2006 llmount lrwxrwxrwx 1 root root 7 Mar 16 2006 ln -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 losetup -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 ls -> busybox -rwxr-xr-x 1 root root 36224 Dec 8 2002 lspci lrwxrwxrwx 1 root root 7 Mar 16 2006 mesg -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 mkdir -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 mknod -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 mkswap -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 mktemp -> busybox -rwxr-xr-x 1 root root 74 Feb 20 2006 modprobe lrwxrwxrwx 1 root root 7 Mar 16 2006 more -> busybox -rws--x--x 1 root root 89289 Sep 28 06:27 mount -rwxr-xr-x 1 root root 39558 Sep 28 15:56 mount.lustre lrwxrwxrwx 1 root root 7 Mar 16 2006 mv -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 nslookup -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 pidof -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 ping -> busybox -rwxr-xr-x 1 root root 9139 Sep 28 06:27 pivot_root lrwxrwxrwx 1 root root 7 Mar 16 2006 poweroff -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 ps -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 pwd -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 readlink -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 reboot -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 reset -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 rm -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 rmdir -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 route -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 sed -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 sh -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 sleep -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 sort -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 sync -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 tail -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 tar -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 test -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 tftp -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 touch -> busybox -rwxr-xr-x 1 root root 97 Dec 18 2003 trim lrwxrwxrwx 1 root root 7 Mar 16 2006 true -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 tty -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 udhcpc -> busybox -rwxr-xr-x 1 root root 2349 Dec 18 2003 udhcpc.script lrwxrwxrwx 1 root root 7 Mar 16 2006 umount -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 uname -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 uniq -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 wget -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 which -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 whoami -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 yes -> busybox lrwxrwxrwx 1 root root 7 Mar 16 2006 zcat -> busybox /mnt/initrd/dev: total 0 crw-r--r-- 1 root root 5, 1 Dec 21 2003 console crw-r--r-- 1 root root 1, 3 Dec 21 2003 null brw-r--r-- 1 root root 1, 1 Dec 21 2003 ram crw-r--r-- 1 root root 4, 0 Dec 21 2003 systty crw-r--r-- 1 root root 4, 1 Dec 21 2003 tty1 crw-r--r-- 1 root root 4, 2 Dec 21 2003 tty2 crw-r--r-- 1 root root 4, 3 Dec 21 2003 tty3 crw-r--r-- 1 root root 4, 4 Dec 21 2003 tty4 crw-r--r-- 1 root root 1, 9 Dec 21 2003 urandom /mnt/initrd/etc: total 3 -rw-r--r-- 1 root root 6 Oct 10 16:09 ld.so.conf -rw-r--r-- 1 root root 1142 Dec 25 2002 
nsswitch.conf /mnt/initrd/lib: total 1892 -rwxr-xr-x 1 root root 107716 Oct 27 2003 ld-2.3.2.so lrwxrwxrwx 1 root root 11 Mar 16 2006 ld-linux.so.2 -> ld-2.3.2.so -rwxr-xr-x 1 root root 1573232 Oct 27 2003 libc-2.3.2.so lrwxrwxrwx 1 root root 13 Mar 16 2006 libc.so.6 -> libc-2.3.2.so -rwxr-xr-x 1 root root 93028 Oct 27 2003 libnsl-2.3.2.so lrwxrwxrwx 1 root root 15 Mar 16 2006 libnsl.so.1 -> libnsl-2.3.2.so -rwxr-xr-x 1 root root 18316 Oct 27 2003 libnss_dns-2.3.2.so lrwxrwxrwx 1 root root 19 Mar 16 2006 libnss_dns.so.2 -> libnss_dns-2.3.2.so -rwxr-xr-x 1 root root 51152 Oct 27 2003 libnss_files-2.3.2.so lrwxrwxrwx 1 root root 21 Mar 16 2006 libnss_files.so.2 -> libnss_files-2.3.2.so -rwxr-xr-x 1 root root 78048 Oct 27 2003 libresolv-2.3.2.so lrwxrwxrwx 1 root root 18 Mar 16 2006 libresolv.so.2 -> libresolv-2.3.2.so drwxr-xr-x 3 root root 1024 Oct 17 11:43 modules /mnt/initrd/lib/modules: total 1 drwxr-xr-x 3 root root 1024 Oct 17 04:28 2.6.15-sc-lustre-1.6b5-devo /mnt/initrd/lib/modules/2.6.15-sc-lustre-1.6b5-devo: total 42 lrwxrwxrwx 1 root root 42 Oct 17 11:43 build -> /usr/src/linux-2.6.15-sc-lustre-1.6b5-devo drwxr-xr-x 4 root root 1024 Oct 17 09:23 kernel -rw-r--r-- 1 root root 45 Oct 17 04:28 modules.alias -rw-r--r-- 1 root root 69 Oct 17 04:28 modules.ccwmap -rw-r--r-- 1 root root 6400 Oct 17 04:28 modules.dep -rw-r--r-- 1 root root 73 Oct 17 04:28 modules.ieee1394map -rw-r--r-- 1 root root 132 Oct 17 04:28 modules.inputmap -rw-r--r-- 1 root root 81 Oct 17 04:28 modules.isapnpmap -rw-r--r-- 1 root root 99 Oct 17 04:28 modules.pcimap -rw-r--r-- 1 root root 26120 Oct 17 04:28 modules.symbols -rw-r--r-- 1 root root 189 Oct 17 04:28 modules.usbmap lrwxrwxrwx 1 root root 42 Oct 17 11:43 source -> /usr/src/linux-2.6.15-sc-lustre-1.6b5-devo /mnt/initrd/lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel: total 2 drwxr-xr-x 3 root root 1024 Oct 17 09:23 fs drwxr-xr-x 4 root root 1024 Oct 17 09:23 net /mnt/initrd/lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs: total 1 drwxr-xr-x 2 root root 1024 Oct 17 09:23 lustre /mnt/initrd/lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre: total 34163 -rw-r--r-- 1 root root 358862 Oct 17 09:23 fsfilt_ldiskfs.ko -rw-r--r-- 1 root root 2727190 Oct 17 09:23 ldiskfs.ko -rw-r--r-- 1 root root 2595268 Oct 17 09:23 lov.ko -rw-r--r-- 1 root root 1334772 Oct 17 09:23 lquota.ko -rw-r--r-- 1 root root 4083678 Oct 17 09:23 lustre.ko -rw-r--r-- 1 root root 772918 Oct 17 09:23 lvfs.ko -rw-r--r-- 1 root root 1287666 Oct 17 09:23 mdc.ko -rw-r--r-- 1 root root 3458326 Oct 17 09:23 mds.ko -rw-r--r-- 1 root root 336344 Oct 17 09:23 mgc.ko -rw-r--r-- 1 root root 1219402 Oct 17 09:23 mgs.ko -rw-r--r-- 1 root root 5309336 Oct 17 09:23 obdclass.ko -rw-r--r-- 1 root root 815425 Oct 17 09:23 obdecho.ko -rw-r--r-- 1 root root 1775738 Oct 17 09:23 obdfilter.ko -rw-r--r-- 1 root root 974242 Oct 17 09:23 osc.ko -rw-r--r-- 1 root root 634238 Oct 17 09:23 ost.ko -rw-r--r-- 1 root root 7130717 Oct 17 09:23 ptlrpc.ko /mnt/initrd/lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/net: total 2 drwxr-xr-x 2 root root 1024 Oct 17 09:14 bridge drwxr-xr-x 2 root root 1024 Oct 17 09:23 lustre /mnt/initrd/lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/net/bridge: total 59 -rw-r--r-- 1 root root 59143 Oct 17 09:14 bridge.ko /mnt/initrd/lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/net/lustre: total 5316 -rw-r--r-- 1 root root 902540 Oct 17 09:23 ksocklnd.ko -rw-r--r-- 1 root root 2198509 Oct 17 09:23 libcfs.ko -rw-r--r-- 1 root root 2315387 Oct 17 09:23 lnet.ko /mnt/initrd/lib64: total 5676 
-rwxr-xr-x 1 root root 113681 Feb 20 2006 ld-2.3.5.so -rwxr-xr-x 1 root root 134718 Sep 28 04:52 ld-2.4.so lrwxrwxrwx 1 root root 9 Oct 10 15:37 ld-linux-x86-64.so.2 -> ld-2.4.so lrwxrwxrwx 1 root root 13 Oct 10 15:37 libblkid.so -> libblkid.so.1 lrwxrwxrwx 1 root root 15 Oct 10 15:37 libblkid.so.1 -> libblkid.so.1.0 -rwxr-xr-x 1 root root 48415 Sep 28 06:26 libblkid.so.1.0 -rwxr-xr-x 1 root root 1256800 Feb 20 2006 libc-2.3.5.so -rwxr-xr-x 1 root root 1512019 Sep 28 04:52 libc-2.4.so lrwxrwxrwx 1 root root 11 Oct 10 15:37 libc.so.6 -> libc-2.4.so -rwxr-xr-x 1 root root 23248 Sep 1 2005 libcrypt-2.3.5.so -rwxr-xr-x 1 root root 29417 Sep 28 04:52 libcrypt-2.4.so lrwxrwxrwx 1 root root 15 Oct 10 15:37 libcrypt.so.1 -> libcrypt-2.4.so lrwxrwxrwx 1 root root 11 Mar 17 2006 libgpm.so -> libgpm.so.1 lrwxrwxrwx 1 root root 16 Mar 17 2006 libgpm.so.1 -> libgpm.so.1.19.0 -rwxr-xr-x 1 root root 24264 Jul 26 2005 libgpm.so.1.19.0 -rwxr-xr-x 1 root root 570832 Sep 1 2005 libm-2.3.5.so -rwxr-xr-x 1 root root 415072 Sep 28 04:52 libm-2.4.so lrwxrwxrwx 1 root root 11 Oct 10 15:37 libm.so.6 -> libm-2.4.so lrwxrwxrwx 1 root root 15 Oct 10 15:37 libncurses.so -> libncurses.so.5 lrwxrwxrwx 1 root root 17 Oct 10 15:37 libncurses.so.5 -> libncurses.so.5.5 -rwxr-xr-x 1 root root 375232 Jul 28 2005 libncurses.so.5.4 -rwxr-xr-x 1 root root 407583 Sep 28 04:55 libncurses.so.5.5 -rwxr-xr-x 1 root root 32008 Nov 18 2005 libnss_compat-2.3.5.so -rwxr-xr-x 1 root root 40650 Sep 28 04:52 libnss_compat-2.4.so lrwxrwxrwx 1 root root 20 Oct 10 15:37 libnss_compat.so.2 -> libnss_compat-2.4.so -rwxr-xr-x 1 root root 19176 Nov 18 2005 libnss_dns-2.3.5.so -rwxr-xr-x 1 root root 25049 Sep 28 04:52 libnss_dns-2.4.so lrwxrwxrwx 1 root root 17 Oct 10 15:37 libnss_dns.so.2 -> libnss_dns-2.4.so -rwxr-xr-x 1 root root 44392 Nov 18 2005 libnss_files-2.3.5.so -rwxr-xr-x 1 root root 55726 Sep 28 04:52 libnss_files-2.4.so lrwxrwxrwx 1 root root 19 Oct 10 15:37 libnss_files.so.2 -> libnss_files-2.4.so lrwxrwxrwx 1 root root 16 Oct 10 15:37 libreadline.so -> libreadline.so.5 lrwxrwxrwx 1 root root 18 Oct 10 15:37 libreadline.so.5 -> libreadline.so.5.1 -rwxr-xr-x 1 root root 234520 Jul 26 2005 libreadline.so.5.0 -rwxr-xr-x 1 root root 291035 Sep 28 05:42 libreadline.so.5.1 -rwxr-xr-x 1 root root 83472 Sep 28 04:52 libresolv-2.4.so lrwxrwxrwx 1 root root 16 Oct 10 15:37 libresolv.so.2 -> libresolv-2.4.so lrwxrwxrwx 1 root root 14 Oct 10 14:45 libuuid.so.1 -> libuuid.so.1.2 -rwxr-xr-x 1 root root 17049 Sep 28 06:26 libuuid.so.1.2 /mnt/initrd/lost+found: total 0 /mnt/initrd/proc: total 0 /mnt/initrd/sysroot: total 0 /mnt/initrd/tmp: total 0 /mnt/initrd/usr: total 0 lrwxrwxrwx 1 root root 6 Mar 16 2006 bin -> ../bin lrwxrwxrwx 1 root root 7 Mar 16 2006 sbin -> ../sbin root@gs105 rootfs # root@gs105 rootfs # cat /mnt/initrd/linuxrc #!/bin/sh # get the infrastructure for lustre set up. 
# these modules are coming out of the initrd
# we probably don't need all of these to be a client, figure out later
# supposedly we don't need to hand-load these modules any more
echo "Loading lustre modules"
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/net/lustre/libcfs.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/net/lustre/libcfs.ko
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/net/lustre/lnet.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/net/lustre/lnet.ko
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/net/lustre/ksocklnd.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/net/lustre/ksocklnd.ko
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/lvfs.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/lvfs.ko
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/obdclass.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/obdclass.ko
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/ptlrpc.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/ptlrpc.ko
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/osc.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/osc.ko
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/lov.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/lov.ko
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/mdc.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/mdc.ko
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/mgc.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/mgc.ko
echo " /sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/lustre.ko"
/sbin/insmod /lib/modules/2.6.15-sc-lustre-1.6b5-devo/kernel/fs/lustre/lustre.ko

echo "Mounting Lustre at /sysroot"
mount -n -t proc /proc /proc
#
# It looks like at least for now, we need to create enough in
# /etc/hosts to allow for referring to all the components of
# the lusterfs by name. Later, if we can figure out how to
# make dns work, this can go away.
#
echo -e "10.2.2.21 gsrv021" > /etc/hosts
echo -e "10.2.2.22 gsrv022" >> /etc/hosts
echo -e "10.2.2.23 gsrv023" >> /etc/hosts
echo -e "10.2.2.24 gsrv024" >> /etc/hosts
echo -e "10.2.2.27 gsrv027" >> /etc/hosts
echo -e "10.2.2.28 gsrv028" >> /etc/hosts
echo -e "10.2.2.29 gsrv029" >> /etc/hosts
echo -e "10.2.2.30 gsrv030" >> /etc/hosts
#
# now actually mount the fs
#
LD_LIBRARY_PATH=/lib64
export LD_LIBRARY_PATH
#
# NB! This must be invoked as /sbin/mount, not just mount. Not sure why...
#
/sbin/mount -t lustre gsrv022:/scx1 /sysroot
umount /proc

# when we thought we could use mount bind, we did it this way. need to
# try it again...
# [Later] it works after all.
echo "Creating mount binding"
mkdir -p /mnt
mount -o bind /sysroot/rootfs/3 /mnt

echo "Pivoting to new root"
# cd /sysroot
# mkdir -p initrd
# pivot_root . /sysroot/initrd
cd /mnt
mkdir -p initrd
pivot_root . /mnt/initrd
#
# run the script that straightens out /var etc
#
echo "Populating /var"
sh /var_populate.sh

echo "Starting init"
# something's still not right, we seem to need a sleep here????
sleep 1
exec chroot . sh -c 'exec /sbin/init' </dev/console >/dev/console 2>&1

root@gs105 rootfs #
From: "Sridharan Ramaswamy \(srramasw\)" <srramasw@cisco.com> Date: Thu, 7 Dec 2006 13:39:51 -0800 by this you mean to include Lustre modules that typically gets loaded during lconf, right? It''s been long enough since I used lconf that I can''t speak to that. The short answer is "any module which is needed in order to be a client should be copied into the initrd". > get them loaded from the init script, mount the fs in question, > pivot root, and off you go. I was not aware of this "pivot_root" until now. This is perfect!! Again, if you have this basic recipe described in more detailed, it will great. See the script I just posted. Mine''s got a bit of extra magic in it because we wanted to maintain several rootfs images, which meant we wanted them in subdirs. Sadly, there appears to be no way to mount a subdir of a lustrefs, only the top level (perhaps that could be taken as a feature request, hint, hint) so we have to play the mount bind game. It''s a hack, but it works. Oh sure. I too was thinking that nodes shouldn''t share dirs like /etc, /var/log, etc. Right. Think of it as a mostly read-only fs. There are a small number of parts which can''t be read-only, such as /tmp, /var, perhaps parts of /usr etc. Those parts must have something else mounted on them.
Sridharan Ramaswamy (srramasw)
2006-Dec-07 15:29 UTC
[Lustre-discuss] Lustre client as root filesystem
> > By this you mean to include the Lustre modules that typically get loaded
> > during lconf, right?
>
> It's been long enough since I used lconf that I can't speak to that. The
> short answer is "any module which is needed in order to be a client should
> be copied into the initrd".

Fair enough. Looks like I'd better move my eval systems to 1.6 :-)

> > > get them loaded from the init script, mount the fs in question,
> > > pivot_root, and off you go.
> >
> > I was not aware of this "pivot_root" until now. This is perfect!!
> >
> > Again, if you have this basic recipe described in more detail, that would
> > be great.
>
> See the script I just posted.
>
> Mine's got a bit of extra magic in it because we wanted to maintain several
> rootfs images, which meant we wanted them in subdirs. Sadly, there appears
> to be no way to mount a subdir of a lustrefs, only the top level (perhaps
> that could be taken as a feature request, hint, hint), so we have to play
> the mount bind game. It's a hack, but it works.

I was wondering the same, whether Lustre would allow mounting just a subdir of a lustrefs. I would agree to request that as a feature enhancement.

Thanks,
Sridhar

> > Oh sure. I too was thinking that nodes shouldn't share dirs like /etc,
> > /var/log, etc.
>
> Right.
>
> Think of it as a mostly read-only fs. There are a small number of parts
> which can't be read-only, such as /tmp, /var, perhaps parts of /usr, etc.
> Those parts must have something else mounted on them.
On Thursday 07 December 2006 20:25, Sridharan Ramaswamy (srramasw) wrote:

> Has anyone used a Lustre client as the root filesystem for diskless nodes?
> [...]

Just to give a "me too", we're running Lustre as a root filesystem for 130 nodes. It's a "mostly shared root". Though bear in mind, AFAIK such things are so not a ClusterFS-supported configuration.

Our setup is quite similar to other people's described in this thread: PXE-boot a (large) initrd (but you can get away with quite large initrds when tftping them :-) ). The initrd is made by letting the Red Hat tools build a normal initrd using mkinitrd --preload, then mutilating it by uncompressing it, rpm2cpio-installing certain rpms into it, and using a different init script. Note that RHEL4 includes a "system-config-netboot" package for autobuilding NFS-root clients - this was very helpful in working out what parts of the FS were safe to share and what parts should be per-host, and how various initscripts should be modified.

The initrd sets up networking, uses the client-side zero-configuration support of Lustre 1.4.7.x to mount Lustre without bloating the initrd too much, then pivot_roots to a bind-mounted subdir of the Lustre fs (similar to other users supporting multiple roots). When control is transferred to that root (i.e. the exec /sbin/init...), a script /etc/rc.bindmounts is called very early in the init process. This bind-mounts /perhost/<hostname> onto /perhost/currenthost, then bind-mounts the various host-specific parts under /perhost/currenthost (like, say, var/spool/) onto the corresponding live paths.

This makes admin mostly a breeze, though I have definite qualms about the client shutdown "process" I use* and the large numbers of locks shared-root clients are holding.

* Hard shutdown, basically.

As reported by others, for system-level stuff the fs is kind of slow (as system-level stuff means lots of small files and metadata changes I guess, which Lustre doesn't currently excel at), but it has little impact on user-level stuff. But pay close attention to cron jobs and such if you go this route - only _one_ node needs to update the locatedb or rpmdb or whatever, and you'll cause horrible load spikes at best, crashes at worst, if you mess cron jobs up; all nodes trying to "yum update" the same FS at the same time, for example, would be messy.
followup with some scripts. I'm kind of ashamed of some of them, follow them entirely at your own risk! This is a "Brute Force and Ignorance" method. I meant to write it up when I had refined it... just a little more... but might as well chime in now, John might want to compare notes.

"unity" is just what I started calling the shared root setup.

Assuming RHEL4 and Lustre 1.4.x (1.6.x will make this easier I'd guess):

-2. Be set up to PXE boot.

-1. Have a lustre filesystem set up.

0. Rsync (or whatever) an OS install to a subdir of it (outside the scope of this quick pseudo-howto). Hack its initscripts a bit, similar to system-config-netboot's hack job. Make rc.sysinit call rc.bindmounts to do a bunch of bind mounts to split out host-specific parts. The bit at the start is to prevent little accidents, aborting early on if the hostname isn't around:

-8<---rc.bindmounts-------------------
HOSTNAME=`hostname`
if [ ! -d "/perhost/$HOSTNAME" ]; then
echo "No /perhost/$HOSTNAME Directory"
sulogin
umount -a
/usr/local/sbin/lustre_flush_cache
mount -n -o remount,ro /
sync
sync
sync
reboot -f
fi

mount --bind "/lustre1/home" /home

mount --bind "/perhost/$HOSTNAME" /perhost/currenthost

mount --bind /perhost/currenthost/etc/rc.d/rc3.d/ /etc/rc.d/rc3.d/
mount --bind /perhost/currenthost/etc/rc.d/rc4.d/ /etc/rc.d/rc4.d/
mount --bind /perhost/currenthost/etc/rc.d/rc5.d/ /etc/rc.d/rc5.d/
mount --bind /perhost/currenthost/etc/xinetd.d/ /etc/xinetd.d/
mount --bind /perhost/currenthost/etc/xinetd.conf /etc/xinetd.conf
mount --bind /perhost/currenthost/etc/ntp/step-tickers /etc/ntp/step-tickers
mount --bind /perhost/currenthost/etc/adjtime /etc/adjtime
mount --bind /perhost/currenthost/etc/motd /etc/motd
mount --bind /perhost/currenthost/etc/crontab /etc/crontab
mount --bind /perhost/currenthost/etc/cups/certs/ /etc/cups/certs/
mount --bind /perhost/currenthost/etc/sysconfig/hwconf /etc/sysconfig/hwconf
mount --bind /perhost/currenthost/var/tmp/ /var/tmp/
mount --bind /perhost/currenthost/var/lib/nfs/ /var/lib/nfs/
mount --bind /perhost/currenthost/var/lib/random-seed /var/lib/random-seed
mount --bind /perhost/currenthost/var/lock/ /var/lock/
mount --bind /perhost/currenthost/var/run/ /var/run/
mount --bind /perhost/currenthost/var/spool/ /var/spool/
mount --bind /perhost/currenthost/var/log/ /var/log/
mount --bind /perhost/currenthost/var/lib/logrotate.status /var/lib/logrotate.status
mount --bind /perhost/currenthost/etc/sysconfig/network /etc/sysconfig/network
mount --bind /perhost/currenthost/etc/sysconfig/network-scripts/ifcfg-eth0 /etc/sysconfig/network-scripts/ifcfg-eth0
mount --bind /perhost/currenthost/etc/sysconfig/network-scripts/ifcfg-eth1 /etc/sysconfig/network-scripts/ifcfg-eth1
-------------------------------

1. Make an initrd using RH tools. tg3, libata and ata_piix are hardware-specific, just our nics and hdds.

-8<---unity_mkinitrd.basic----
#!/bin/bash
mkinitrd --preload tg3 \
 --preload ksocklnd \
 --preload ptlrpc \
 --preload lov \
 --preload osc \
 --preload llite \
 --preload libata \
 --preload ata_piix \
 initrd.basic.img 2.6.9-42.0.2.EL_lustre.1.4.7.1smp
------------------------------

2. cpio-decompress the initrd and mutilate it.

2.1. Split its bin and sbin.

2.2. Make some extra directories in it - dhcp may fail without a /tmp, and lustre itself may need a mountpoint.

2.3. rpm2cpio-decompress a bunch of redhat rpms, and rsync 'em to the initrd. As we're tftp booting, we can make an enormous initrd if we want (see also: warewulf, and this is the main "brute force" part of the comment above...). Might want to strip out manual pages and stuff in /usr/share a bit to keep the size below the 16 MByte threshold where the boot process may get upset, though.

for i in \
initrd.basic \
initrd.extradirs \
bash \
coreutils \
dhclient \
glibc \
libacl \
libattr \
libselinux \
libtermcap \
lustre \
ncurses \
net-tools \
readline \
util-linux \
mktemp \
iproute \
initscripts \
procps \
sed \
gawk \
grep \
pcre \
; do

echo $i
rsync -a $i/ initrd/

done

----------------------------------

2.4. Replace the initrd's init with something else... Yes, I was actually lazy enough to just stick bash in the initrd and script in that instead of busybox's reduced shell! After you've debugged a bit, you might want to replace the "exit 1s" with reboots, so that clients cycle and try again if something is transiently wrong, instead of dumping you at a shell prompt within the initrd. I think the /dev and /proc umount shenanigans at the end of this script are wrong somehow, but do mostly work. You should also watch out for /.oldroot/ being left accessible to users when boot is complete - make sure you're not opening security holes...

-8<---unityrc---------------------
#!/bin/bash

PATH=/sbin:/usr/sbin:/bin:/usr/bin
export PATH

mount -t proc /proc /proc
echo Mounted /proc filesystem
echo Mounting sysfs
mount -t sysfs none /sys
echo Creating /dev
mount -o mode=0755 -t tmpfs none /dev
mknod /dev/console c 5 1
mknod /dev/null c 1 3
mknod /dev/zero c 1 5
mkdir /dev/pts
mkdir /dev/shm
echo Starting udev
/sbin/udevstart
echo -n "/sbin/hotplug" > /proc/sys/kernel/hotplug
echo "Loading tg3.ko module"
insmod /lib/tg3.ko

echo "Bringing up Network"

echo "loopback"
ifconfig lo 127.0.0.1 netmask 255.0.0.0 up

hostname '(none)'

echo "DHCP: querying"
dhclient -pf /tmp/dhclient.pid -lf /tmp/dhclient.leases eth1 >/tmp/dhclient.out 2>&1
if [ $? -ne 0 ]; then
echo "ERROR! dhclient failed!"
exec /bin/bash
exit 1
fi
if [ `hostname` == '(none)' ]; then
echo "ERROR! did not get a hostname - assuming that there's Trouble."
exec /bin/bash
exit 1
fi

echo "DHCP: kill -9 dhclient, (hopefully) leaving interface up"
kill -9 $(</tmp/dhclient.pid)

echo "Lustre Init"

echo "Loading libcfs.ko module"
insmod /lib/libcfs.ko
echo "Loading lnet.ko module"
insmod /lib/lnet.ko
echo "Loading ksocklnd.ko module"
insmod /lib/ksocklnd.ko
echo "Loading lvfs.ko module"
insmod /lib/lvfs.ko
echo "Loading obdclass.ko module"
insmod /lib/obdclass.ko
echo "Loading ptlrpc.ko module"
insmod /lib/ptlrpc.ko
echo "Loading lov.ko module"
insmod /lib/lov.ko
echo "Loading osc.ko module"
insmod /lib/osc.ko
echo "Loading mdc.ko module"
insmod /lib/mdc.ko
echo "Loading llite.ko module"
insmod /lib/llite.ko

echo "Loading scsi_mod.ko module"
insmod /lib/scsi_mod.ko
echo "Loading sd_mod.ko module"
insmod /lib/sd_mod.ko
echo "Loading libata.ko module"
insmod /lib/libata.ko
echo "Loading ata_piix.ko module"
insmod /lib/ata_piix.ko
/sbin/udevstart

echo Mounting Lustre root filesystem
mount -t lustre -o user_xattr,acl <YOUR_LUSTRE_SERVER>:/mds1/client /lustre1
if [ $? -ne 0 ]; then
echo "ERROR! Lustre mount failed!"
exec /bin/bash
exit 1
fi

grep lustre1 /proc/mounts
if [ $? -ne 0 ]; then
echo "ERROR! Lustre mount not found!"
exec /bin/bash
exit 1
fi

mount --bind /lustre1/centos4a /sysroot
mount --bind /lustre1 /sysroot/lustre1
umount /lustre1

mount -t tmpfs --bind /dev /sysroot/dev

umount /sys
umount /proc
umount /dev
echo Switching to new root
cd /sysroot
pivot_root /sysroot .oldroot
umount /.oldroot/dev
umount /.oldroot/proc
exec /sbin/init
----------------------------------

2.5. Re-compress the initrd.

3. Attempt to pxe-boot clients with your new setup.

4. Attempt to debug the mess you've just made and try again :-)
On Mon, 11 Dec 2006, David Golden wrote:

> Just to give a "me too", we're running Lustre as a root filesystem
> for 130 nodes. It's a "mostly shared root". Though bear in mind,
> AFAIK such things are so not a ClusterFS-supported configuration.

By "not supported" I guess you mean only that CFS did not automate or document the process? Because it seems to me to be a very normal use of Lustre, and I don't see any technical reason why this would not be in the scope of regular support contracts.

Anyone from CFS care to comment? :)

--
Jean-Marc Saffroy - jean-marc.saffroy@ext.bull.net
> replace the "exit 1s" with reboots,er, the exec /bin/bashs, even. the exit 1''s won''t be reached (both were in to allow commenting out one or the other.)
On Monday 11 December 2006 14:27, Jean-Marc Saffroy wrote:

> By "not supported" I guess you mean only that CFS did not automate or
> document the process?

Well, undocumented => unsupported was what I was thinking, alright. I didn't actually ask CFS at all, though: in our case I did it without direct consultation with CFS (we're not paying customers right now anyway) or with the distro supplier (since we're not paying customers of theirs either), so there are bound to be areas where they'd do it quite differently (read: better...) at least.
Sridharan Ramaswamy (srramasw)
2006-Dec-11 12:56 UTC
[Lustre-discuss] Lustre client as root filesystem
Thanks David for the script and pointers. I'm a week or so away from trying this out on my box(es). Will post my results back.

It seems most of the examples use PXE; has anyone tried this using U-Boot? I hope it has a similar capability to pick up a custom initrd image with Lustre modules and mount/pivot_root to a mounted subdir.

- Sridhar

> -----Original Message-----
> From: lustre-discuss-bounces@clusterfs.com
> [mailto:lustre-discuss-bounces@clusterfs.com] On Behalf Of David Golden
> Sent: Monday, December 11, 2006 4:55 AM
> To: lustre-discuss@clusterfs.com
> Subject: Re: [Lustre-discuss] Lustre client as root filesystem
>
> followup with some scripts. I'm kind of ashamed of some of them, follow
> them entirely at your own risk!
> [...]
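On the U-Boot question: a hedged sketch of the kind of sequence this would involve follows. It is not from this thread; the load addresses, file names, IP addresses, and console arguments are placeholders and board-specific, and the gzipped initrd would first need to be wrapped for U-Boot with mkimage.

# On the build host: wrap the initrd so U-Boot can load it
# (architecture, compression, and names here are examples only).
mkimage -A ppc -O linux -T ramdisk -C gzip -n "lustre-initrd" \
        -d pxeinitrd.img.gz uInitrd

# At the U-Boot prompt: fetch kernel and initrd over TFTP, then boot both.
setenv serverip 10.2.2.1
setenv ipaddr 10.2.2.101
tftpboot 0x400000 uImage
tftpboot 0x800000 uInitrd
setenv bootargs console=ttyS0,115200 root=/dev/ram0 rw
bootm 0x400000 0x800000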
On Monday 11 December 2006 19:56, Sridharan Ramaswamy (srramasw) wrote:

> Thanks David for the script and pointers.

Note that some of the bind mounts are optional; it's not quite a "minimal set" of stuff you need. E.g. I used per-host sysvinit runlevels (/etc/rc.d/*) to run different services on different hosts, and the host-specific networking stuff arose because only a small subset of our nodes have InfiniBand, but if you have an entirely uniform cluster you mightn't have to bother...
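For a partly uniform cluster, one way to handle this (a sketch only, not from David's setup; the ifcfg-ib0 name is just an example of a non-uniform piece) is to make the per-host bind mounts conditional on the files actually existing under /perhost/currenthost:

# Only bind-mount host-specific pieces that exist for this host.
for f in etc/sysconfig/network-scripts/ifcfg-ib0 \
         etc/rc.d/rc3.d etc/rc.d/rc4.d etc/rc.d/rc5.d; do
    if [ -e "/perhost/currenthost/$f" ]; then
        mount --bind "/perhost/currenthost/$f" "/$f"
    fi
done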
I've only been half following this thread, but I just had a thought. Has anyone attempted to use Lustre with UnionFS to create a root filesystem? Right now there are some diskless packages that mount a ramdisk and a read-only NFS mount, use unionfs to create a unified filesystem, and mount that over /. If you were to replace NFS with a read-only mount of Lustre (does that exist?), you could easily create a shared root filesystem over Lustre with full r/w privileges on each client node. (If this was mentioned before, re-read my first sentence.)

-----Original Message-----
From: lustre-discuss-bounces@clusterfs.com on behalf of David Golden
Sent: Tue 12/12/2006 5:48 AM
To: lustre-discuss@clusterfs.com
Subject: Re: [Lustre-discuss] Lustre client as root filesystem

> Note that some of the bind mounts are optional; it's not quite a "minimal
> set" of stuff you need.
> [...]
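A sketch of the stacking being suggested, using unionfs 1.x branch syntax; the paths and the Lustre mount are placeholders, and whether a read-only Lustre client mount behaves as needed here is exactly the open question in the post above.

# Lower branch: the shared OS image, mounted read-only from Lustre.
mount -t lustre -o ro mgsnode@tcp:/rootfs /mnt/lustre-ro

# Upper branch: a per-node tmpfs that absorbs all writes.
mount -t tmpfs none /mnt/rw

# Stack them: reads fall through to Lustre, writes stay in the tmpfs.
mount -t unionfs -o dirs=/mnt/rw=rw:/mnt/lustre-ro=ro unionfs /mnt/root

# /mnt/root could then be used as the pivot_root target for /.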