    From: Phil Schwan <phil@clusterfs.com>
    Date: Wed, 11 Jan 2006 11:56:08 -0500

    I don't suppose there's any talking you into just using a
    precompiled kernel?

I could probably be talked into it for evaluation purposes, but in the
fullness of time I really need to be able to do the compile-from-source
thing, for assorted reasons.

    It sounds like you're doing the right thing, and it's usually
    pretty straightforward -- patch and build the kernel, configure and
    "make" Lustre.

    What does the failure look like a little bit before and after it
    builds the bad cp command?

Elided output:

root@rodan # cd linux-2.6.9-5.0.5.EL...
root@rodan # make
...
root@rodan # cd ../lustre-1.4.5
root@rodan # ./configure ...
...
Type 'make' to build Lustre.
root@rodan # make
make all-recursive
...
make sources -C ldiskfs
make[4]: Entering directory `/local/jrd/hack/lustre-1.4.5/lustre/ldiskfs'
rm -rf linux-stage linux sources
mkdir -p linux-stage/fs/ext3 linux-stage/include/linux
cp linux-stage/fs/ext3
cp: missing destination file
Try `cp --help' for more information.
make[4]: *** [sources] Error 1
make[4]: Leaving directory `/local/jrd/hack/lustre-1.4.5/lustre/ldiskfs'
make[3]: *** [ldiskfs-sources] Error 2
make[3]: Leaving directory `/local/jrd/hack/lustre-1.4.5/lustre'
make[2]: *** [all-sources] Error 2
make[2]: Leaving directory `/local/jrd/hack/lustre-1.4.5'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/local/jrd/hack/lustre-1.4.5'
make: *** [all] Error 2
root@rodan #

Does that make any sense?

Thanks for the quick comeback!
Hi John--

On Jan 11, 2006, at 12:59, John R. Dunning wrote:
>
> ... but I tried it again with an absolute path, and it still loses,
> but differently, it now dies trying to do the quilt push.

Ah, that's good news.

Now I think I see your other problem -- we released 1.4.5 based on the
2.6.9-22 RHEL4 kernel, not the older 2.6.9-5.0.5. It's available under
ftp://ftp.clusterfs.com/pub/kernels/; I have reasonable confidence that
will fix you up.

-Phil
    From: cliff white <cliffw@clusterfs.com>
    Date: Wed, 11 Jan 2006 14:54:08 -0800

    I don't think you need the '--with-kernel-source-header=../kernel.h'
    option.

Probably not. I added that because there was a piece of the howto which
suggested that in some circumstances it would help, but it doesn't seem
to make any difference.

    You might try using the CFS version of quilt
    (ftp.lustre.org:/pub/quilt/)

That's an idea too.

    The method you are using is the normal one we use and is usually
    successful. (that doesn't help, i know)

Actually, it helps a lot! If somebody else is doing it this way, that
means there's an existence proof, and I just need to figure out what I'm
doing different than what you're doing.

What kernel are you using? Do you by chance have a build script that you
could send me?

Thanx!
    From: "John R. Dunning" <jrd@jrd.org>
    Date: Thu, 12 Jan 2006 11:04:54 -0500

    Thanks for the hints, will post back later once I've finished
    compiling all this stuff.

What with distractions, it took me all day to work through trying this.
It turns out that I can't build the kernel from source
(kernel-source-2.6.9-22.EL_lustre.1.4.5) because my machine is reiserfs,
and attempting to enable that in the config causes the compile to die in
reiserfs. A bit of investigation reveals that there seems to be at least
one definition (the return type of get_rkey) that's inconsistent between
the headers and the reiserfs code, resulting in compile errors. That
definition is not inconsistent in the base 2.6.9 sources, nor does it
seem to be in any of the patch sets that come with lustre, which leads
me to believe that it's part of the diffs from 2.6.9 to RHEL2.6.9.

So, as before, if anybody has hints on how to proceed, please let me
know. At this point I think my best bet is to try to fix up the reiserfs
defs and go on from there.
David Vasil wrote:
> John R. Dunning wrote:
>
>> I'm using a gcc 3.4 toolchain. It's running on an amd64 machine, but
>> it's hard to believe that has anything to do with what I'm seeing.
>
>
> I recently installed lustre on a cluster of Opterons with RHEL4. The
> easiest way to get the job done is to download the kernel source RPMs
> as well as the lustre source rpm from the clusterfs.com download section:

Sorry for the dual emails, I went to the ftp site you listed and noticed
that the rpms were not easy to find there. Here is where I downloaded
the RPMs from:

(It seems the policy for downloading has recently changed, you now must
supply an email address to get a download link)
Go here, fill out your email address:
http://www.clusterfs.com/download_form.php

Once you receive the link, traverse to the directory:
v1.4/latest
Which has distro specific RPMs. The RPMs and source RPMs I used were in
this directory:
rhel-2.6-x86_64

--
| David Vasil <dmvasil@ornl.gov>
| Oak Ridge National Laboratory NCCS Division
| High Performance Computing Systems Administrator
| Bldg: 5600-A115 Phone: (865)241-5562
Hi John--

On Jan 11, 2006, at 11:50, John R. Dunning wrote:
> FWIW, the place it always seems to fail is building lustre/ldiskfs, at
> the stage where it's attempting to copy assorted sources (from the
> kernel tree, I believe) into linux-stage/fs/ext3; it comes up with an
> empty list of things to copy, generates a malformed cp command, and
> dies.
>
> If anybody can point me at a foolproof recipe for compiling a kernel
> (any kernel) plus lustre from sources, I'd greatly appreciate it.

I don't suppose there's any talking you into just using a precompiled
kernel?

It sounds like you're doing the right thing, and it's usually pretty
straightforward -- patch and build the kernel, configure and "make"
Lustre.

What does the failure look like a little bit before and after it builds
the bad cp command?

-Phil
Hi John--

On Jan 11, 2006, at 12:43, John R. Dunning wrote:
>
> Elided output:
>
> root@rodan # cd linux-2.6.9-5.0.5.EL...
> root@rodan # make
> ...
> root@rodan # cd ../lustre-1.4.5
> root@rodan # ./configure ...

What does your full ./configure command line look like? In particular,
does it include --with-linux=/path/to/new/kernel?

The makefile builds that 'cp' command by excluding *.mod.c from the list
of kernel/fs/ext3/*.c -- so it looks like it's not finding any .c files
in your specified source tree.

If you can't find an obvious explanation for why that would be, let me
know, I'll scratch my head some more.

Thanks,

-Phil
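The check Phil describes can be approximated by hand before running the Lustre build. The following is only a rough sketch, not the actual makefile rule: list_ext3_sources is a hypothetical helper name, and it assumes the ext3 sources sit under fs/ext3 in the kernel tree passed to --with-linux. If it prints nothing, the copy step would have an empty source list, which matches the "cp: missing destination file" symptom.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): print the fs/ext3/*.c files in
# a kernel tree, skipping generated *.mod.c files, roughly as the ldiskfs
# build step is described to do.
list_ext3_sources() {
    tree=$1
    for f in "$tree"/fs/ext3/*.c; do
        [ -e "$f" ] || continue      # the glob matched nothing
        case $f in
            *.mod.c) ;;              # generated module stub, skip it
            *) printf '%s\n' "$f" ;;
        esac
    done
}

# Usage (the path is a placeholder):
#   list_ext3_sources /path/to/new/kernel
```

An empty result points at the tree location rather than at Lustre itself.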
    From: Phil Schwan <phil@clusterfs.com>
    Date: Wed, 11 Jan 2006 12:49:05 -0500

    Hi John--

    On Jan 11, 2006, at 12:43, John R. Dunning wrote:
    >
    > Elided output:
    >
    > root@rodan # cd linux-2.6.9-5.0.5.EL...
    > root@rodan # make
    > ...
    > root@rodan # cd ../lustre-1.4.5
    > root@rodan # ./configure ...

    What does your full ./configure command line look like? In
    particular, does it include --with-linux=/path/to/new/kernel?

Yes, sorry, if I had been thinking I would have realized that was
relevant.

./configure --with-linux=../linux-2.6.9-5.0.5.EL_lustre.1.4.5 --with-kernel-source-header=../kernel.h

    The makefile builds that 'cp' command by excluding *.mod.c from the
    list of kernel/fs/ext3/*.c -- so it looks like it's not finding any
    .c files in your specified source tree.

    If you can't find an obvious explanation for why that would be, let
    me know, I'll scratch my head some more.

Actually, thinking about it for 3 more seconds, I bet it's the fact that
I used a relative path. If it's down in a subdir trying to use that path
to find source files, it's going to lose.

... but I tried it again with an absolute path, and it still loses, but
differently, it now dies trying to do the quilt push. I suspect I've
mangled my source trees, time to zap it all and start fresh.

Thanks again!
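One way to avoid the relative-path trap is to resolve the kernel tree to an absolute path before handing it to configure. This is a minimal sketch; abs_dir is a made-up helper name, and the path in the usage comment is simply the one from the message above.

```shell
#!/bin/sh
# Hypothetical helper: print the absolute path of a directory, or fail
# with a non-zero status if the directory does not exist.
abs_dir() {
    (cd "$1" 2>/dev/null && pwd)
}

# Usage, run from the lustre source directory:
#   ./configure --with-linux="$(abs_dir ../linux-2.6.9-5.0.5.EL_lustre.1.4.5)"
```

Because the cd happens in a subshell, the caller's working directory is left unchanged.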
On Jan 11, 2006 13:01 -0500, Phil Schwan wrote:
> On Jan 11, 2006, at 12:59, John R. Dunning wrote:
> >
> >... but I tried it again with an absolute path, and it still loses,
> >but differently, it now dies trying to do the quilt push.
>
> Ah, that's good news. Now I think I see your other problem -- we
> released 1.4.5 based on the 2.6.9-22 RHEL4 kernel, not the older
> 2.6.9-5.0.5.

Actually, 1.4.5 had 2.6.9-5.0.5 I believe, and 1.4.5.1 had an update to
2.6.9-22EL. It might be the fact that the build process expects the
CFS-patched quilt. To work around this just delete the "if USE_QUILT"
section from the lustre/ldiskfs/autoMakefile.am file. In 1.4.6 we have a
--disable-quilt configure option.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
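Before deleting the section by hand, it may help to see exactly which lines it covers. The sketch below is an assumption-laden aid, not a description of the real file: show_quilt_block is a hypothetical helper, and it assumes the conditional starts with "if USE_QUILT" at the beginning of a line, ends at the next "endif", and is not nested inside another conditional.

```shell
#!/bin/sh
# Hypothetical helper: print, with line numbers, everything from the line
# starting "if USE_QUILT" through the next line starting "endif".
show_quilt_block() {
    # grep -n '' numbers every line as "N:text"; sed then selects the range.
    grep -n '' "$1" | sed -n '/^[0-9]*:if USE_QUILT/,/^[0-9]*:endif/p'
}

# Usage, from the top of the lustre source tree:
#   show_quilt_block lustre/ldiskfs/autoMakefile.am
```

If nothing is printed, the file does not match these assumptions and should be inspected directly.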
    From: Andreas Dilger <adilger@clusterfs.com>
    Date: Wed, 11 Jan 2006 12:08:20 -0700

    Actually, 1.4.5 had 2.6.9-5.0.5 I believe, and 1.4.5.1 had an update
    to 2.6.9-22EL. It might be the fact that the build process expects
    the CFS-patched quilt. To work around this just delete the
    "if USE_QUILT" section from the lustre/ldiskfs/autoMakefile.am file.
    In 1.4.6 we have a --disable-quilt configure option.

Don't think so. I tried the 2.6.9-22EL kernel sources, but it died the
same way the previous round did, trying to quilt push. Then I tried
getting rid of the "if USE_QUILT" section, per your suggestion, but that
also died, trying to execute the "Applying ext3 patches" op; looks like
fs/ext3/namei.c is out of sync, as the patches didn't apply clean.

I'll try going back to the earlier kernel and redoing the build with
USE_QUILT disabled.
    From: "John R. Dunning" <jrd@jrd.org>
    Date: Wed, 11 Jan 2006 14:21:22 -0500

    I'll try going back to the earlier kernel and redoing the build with
    USE_QUILT disabled.

Well, that works better, in that it gets farther, but it's still not
right. With linux 2.6.9-5.0.5.EL and lustre 1.4.5, commenting out the
part of lustre/ldiskfs/autoMakefile.am that invokes quilt, I get as far
as compiling lustre/ldiskfs, and get

  CC [M] /local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/ioctl.o
/local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/ioctl.c: In function `ldiskfs_ioctl':
/local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/ioctl.c:36: warning: implicit declaration of function `lookup_one_len'
/local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/ioctl.c:36: warning: assignment makes pointer from integer without a cast
  CC [M] /local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/namei.o
/local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/namei.c: In function `ldiskfs_new_inode_wantedi':
/local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/namei.c:1643: error: dereferencing pointer to incomplete type

I'll spend a little more time on this to see if I can find the missing
defs, but may have to fall back to using a precompiled kernel and
modules.

As before, if anybody can point me at a foolproof recipe for compiling
the whole works, that would be much appreciated.
John R. Dunning wrote:
>    From: "John R. Dunning" <jrd@jrd.org>
>    Date: Wed, 11 Jan 2006 14:21:22 -0500
>
>
>    I'll try going back to the earlier kernel and redoing the build with
>    USE_QUILT disabled.
>
> Well, that works better, in that it gets farther, but it's still not
> right. With linux 2.6.9-5.0.5.EL and lustre 1.4.5, commenting out the
> part of lustre/ldiskfs/autoMakefile.am that invokes quilt, I get as
> far as compiling lustre/ldiskfs, and get
>
>   CC [M] /local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/ioctl.o
> /local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/ioctl.c: In function `ldiskfs_ioctl':
> /local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/ioctl.c:36: warning: implicit declaration of function `lookup_one_len'
> /local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/ioctl.c:36: warning: assignment makes pointer from integer without a cast
>   CC [M] /local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/namei.o
> /local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/namei.c: In function `ldiskfs_new_inode_wantedi':
> /local/jrd/hack/lustre-1.4.5/lustre/ldiskfs/namei.c:1643: error: dereferencing pointer to incomplete type
>
> I'll spend a little more time on this to see if I can find the missing
> defs, but may have to fall back to using a precompiled kernel and
> modules.
>
> As before, if anybody can point me at a foolproof recipe for compiling
> the whole works, that would be much appreciated.

I don't think you need the '--with-kernel-source-header=../kernel.h'
option.
You might try using the CFS version of quilt (ftp.lustre.org:/pub/quilt/)
The method you are using is the normal one we use and is usually
successful. (that doesn't help, i know)
cliffw

>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss@lists.clusterfs.com
> https://lists.clusterfs.com/mailman/listinfo/lustre-discuss
John R. Dunning wrote:
>    From: cliff white <cliffw@clusterfs.com>
>    Date: Wed, 11 Jan 2006 14:54:08 -0800
>
>
>    I don't think you need the '--with-kernel-source-header=../kernel.h'
>    option.
>
> Probably not. I added that because there was a piece of the howto
> which suggested that in some circumstances it would help, but it
> doesn't seem to make any difference.
>
>    You might try using the CFS version of quilt (ftp.lustre.org:/pub/quilt/)
>
> That's an idea too.
>
>    The method you are using is the normal one we use and is usually
>    successful. (that doesn't help, i know)
>
> Actually, it helps a lot! If somebody else is doing it this way, that
> means there's an existence proof, and I just need to figure out what
> I'm doing different than what you're doing.
>
> What kernel are you using? Do you by chance have a build script that
> you could send me?

I take the kernel-source and lustre-source rpms from the web site.
Usually, I first build, install and boot on the kernel.
If you want to be fancy, the kernel tree must be configured and
$ make include/asm
$ make include/linux/version.h
$ make SUBDIRS=scripts

Which is the bare minimum. Usually I have time, so I avoid that
and run a normal kernel compile.

Then, (really) i do just what you are doing, except i usually
create RPMs:
./configure --with-linux=<path to compiled kernel tree>
make rpms

Other things to consider:
The only other head-pain-maker I know about is caused by a few
versions of GCC. We do just fine with GCC v3.3, 3.4, I am aware
of people claiming success with v3.2, but > 3.4 or < 2.96, I know
of no goodness. If you have a newer gcc, you can use gcc-compat,
usually that's invoked by exporting CC=gcc33
( i don't think that matches to your errors, but worth mentioning )

Hope this helps.
cliffw

>
> Thanx!
>
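The compiler caveat above can be turned into a quick pre-flight check. This is only a sketch: gcc_version_named_ok is a hypothetical helper, and the accepted list is nothing more than the versions named as working in this message (3.3 and 3.4, plus 3.2 as reported by others); it says nothing authoritative about any other version.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for gcc versions named as working in
# the message above (3.2.x, 3.3.x, 3.4.x).
gcc_version_named_ok() {
    case $1 in
        3.2|3.2.*|3.3|3.3.*|3.4|3.4.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   gcc_version_named_ok "$(gcc -dumpversion)" \
#       || echo "gcc version not among those named as working" >&2
```

On a machine with a newer default compiler, the same check can be pointed at whichever compatibility compiler CC is exported to.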
    From: cliff white <cliffw@clusterfs.com>
    Date: Wed, 11 Jan 2006 18:10:45 -0800

    I take the kernel-source and lustre-source rpms from the web site.

Presumably you mean the pre-patched kernel sources? Or are you pulling
one of the kernels from ftp://ftp.clusterfs.com/pub/kernels ? If you
tell me exactly what kernel rpm(s) you've made this work with, I'll try
to duplicate that here.

    Usually, I first build, install and boot on the kernel.

Hmmm. I've mostly been trying to build the kernel and lustre off in
sidecar directories. That could have something to do with it. From the
doc it looked like that was supposed to work, but who knows.

Assuming you're starting with pre-patched kernel sources, that probably
means you didn't have to reboot a new kernel after building lustre, you
could just install the modules and go, correct?

    If you want to be fancy, the kernel tree must be configured and
    $ make include/asm
    $ make include/linux/version.h
    $ make SUBDIRS=scripts

    Which is the bare minimum. Usually I have time, so I avoid that
    and run a normal kernel compile.

Yeah, I've tried it both ways.

    Then, (really) i do just what you are doing, except i usually
    create RPMs:
    ./configure --with-linux=<path to compiled kernel tree>
    make rpms

Well, good to know; that's pretty close to what I was doing yesterday.

    Other things to consider:
    The only other head-pain-maker I know about is caused by a few
    versions of GCC. We do just fine with GCC v3.3, 3.4, I am aware
    of people claiming success with v3.2, but > 3.4 or < 2.96, I know
    of no goodness. If you have a newer gcc, you can use gcc-compat,
    usually that's invoked by exporting CC=gcc33
    ( i don't think that matches to your errors, but worth mentioning )

I'm using a gcc 3.4 toolchain. It's running on an amd64 machine, but
it's hard to believe that has anything to do with what I'm seeing.

    Hope this helps.

Absolutely. Many thanks.
John R. Dunning wrote:
> I'm using a gcc 3.4 toolchain. It's running on an amd64 machine, but
> it's hard to believe that has anything to do with what I'm seeing.

I recently installed lustre on a cluster of Opterons with RHEL4. The
easiest way to get the job done is to download the kernel source RPMs
as well as the lustre source rpm from the clusterfs.com download
section:

kernel-source-2.6.9-5.0.5.EL_lustre.1.4.5.x86_64.rpm
lustre-source-1.4.5-2.6.9_5.0.5.EL_lustre.1.4.5smp.x86_64.rpm

Install the rpms, for simplicity's sake make a link /usr/src/linux ->
/usr/src/linux-2.6.9-5.0.5.EL_lustre.1.4.5-obj.

cd /usr/src/linux && make config ; make all && make modules_install

Copy over the System.map, .config, and bzImage to /boot, update grub,
and reboot into the newly compiled lustre kernel. Then cd
/usr/src/lustre-1.4.5 and './configure --with-linux=/usr/src/linux &&
make rpms'

That should spit the RPMs out to /usr/src/redhat/RPMS/x86_64, from
which you can then install the lustre-modules and lustre-1.4.5 rpms.
After that it is a matter of using lmc and lconf to configure/start
lustre. I am writing this from memory, so I may have missed a step,
please excuse any mistakes I have made.

Alternatively, if you do not need to recompile your kernel, all you
should have to do is install the following rpms:

kernel-smp-2.6.9-5.0.5.EL_lustre.1.4.5.x86_64.rpm
lustre-modules-1.4.5-2.6.9_5.0.5.EL_lustre.1.4.5smp.x86_64.rpm
lustre-1.4.5-2.6.9_5.0.5.EL_lustre.1.4.5smp.x86_64.rpm

Hope this helps.

--
| David Vasil <dmvasil@ornl.gov>
| Oak Ridge National Laboratory NCCS Division
| High Performance Computing Systems Administrator
| Bldg: 5600-A115 Phone: (865)241-5562
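The order of operations in this recipe can be written down as a small script. This is a sketch only, under several assumptions: the function and variable names (run, build_kernel, build_lustre, KSRC, LSRC, DRY_RUN) are made up for illustration, the default paths are the ones mentioned in the message, the steps are chained with && throughout, and by default the script only prints the commands rather than running them. The manual steps between the two halves (copying files to /boot, updating grub, rebooting) are deliberately left out.

```shell
#!/bin/sh
# Illustrative sketch of the build order described above.
# Prints the commands unless DRY_RUN is set to 0.
KSRC=${KSRC:-/usr/src/linux}
LSRC=${LSRC:-/usr/src/lustre-1.4.5}
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"       # dry run: show the command only
    else
        "$@"
    fi
}

# Step 1: configure, build and install the patched kernel.
build_kernel() {
    run cd "$KSRC" &&
    run make config &&
    run make all &&
    run make modules_install
}

# Step 2 (after booting the new kernel): build the Lustre RPMs.
build_lustre() {
    run cd "$LSRC" &&
    run ./configure --with-linux="$KSRC" &&
    run make rpms
}
```

Calling build_kernel, doing the /boot and grub steps by hand, rebooting, and then calling build_lustre follows the sequence in the message; nothing here has been verified against a real 1.4.5 build.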
    From: David Vasil <dmvasil@ornl.gov>
    Date: Thu, 12 Jan 2006 10:57:31 -0500

    (It seems the policy for downloading has recently changed, you now
    must supply an email address to get a download link)
    Go here, fill out your email address:
    http://www.clusterfs.com/download_form.php

Yah, I've done that, and am starting with a fresh set of distros. I've
pulled the lustre 1.4.5.1 source and the rh
kernel-source-2.6.9-22.EL_lustre-1.4.5.1 source. Currently cleaning the
machine to start from scratch with those. I should be able to replicate
your recipe, with the exception that I'm starting from slackware (I have
no redhat boxes accessible) but I'm hoping that rpm and the companion
tools will work well enough that I can survive the patch and compile
phases, which is where I've always hung up on prior attempts.

Thanks for the hints, will post back later once I've finished compiling
all this stuff.
On Jan 12, 2006 16:46 -0500, John R. Dunning wrote:
> What with distractions, it took me all day to work through trying
> this. It turns out that I can't build the kernel from source
> (kernel-source-2.6.9-22.EL_lustre.1.4.5) because my machine is
> reiserfs, and attempting to enable that in the config causes the
> compile to die in reiserfs. A bit of investigation reveals that there
> seems to be at least one definition (the return type of get_rkey)
> that's inconsistent between the headers and the reiserfs code,
> resulting in compile errors. That definition is not inconsistent in
> the base 2.6.9 sources, nor does it seem to be in any of the patch
> sets that come with lustre, which leads me to believe that it's part
> of the diffs from 2.6.9 to RHEL2.6.9.

We had other problems compiling 2.6.9-22 with some non-RH-standard
config options that we fixed. Our patches only touched the i386 arch
code and didn't do anything for reiserfs.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
Hi all. I'm just getting started exploring lustre, so please be kind if
this is a dumb question.

I'm trying to install lustre on a family of slackware systems (mostly to
play with at this point, not production yet) and am trying to figure out
whether that's a Really Bad Idea (tm). I normally run kernel.org
kernels, and I understand that that's not really supported at this
point, but I've also been trying to use the patched kernel-source
tarball with the lustre-1.4.5 sources, and I haven't been able to get
that to build either. I also tried going through the process of applying
patches to a RH5.0.5EL kernel; the patches worked, but I was still
unable to get lustre to build.

FWIW, the place it always seems to fail is building lustre/ldiskfs, at
the stage where it's attempting to copy assorted sources (from the
kernel tree, I believe) into linux-stage/fs/ext3; it comes up with an
empty list of things to copy, generates a malformed cp command, and
dies.

If anybody can point me at a foolproof recipe for compiling a kernel
(any kernel) plus lustre from sources, I'd greatly appreciate it.

TIA...