Hello,

is there any chance to see a patchless server, similar to the client? I like Lustre but I dislike the need for a patched kernel on the server.

-- 
Lukáš Hejtmánek

On Wed, 2008-10-08 at 14:03 +0200, Lukas Hejtmanek wrote:
> Hello,

Hi,

> is there any chance to see a patchless server, similar to the client?

Maybe some day. We seem to be eliminating patches from the kernel little by little, but by no means at a break-neck pace.

> I like Lustre but I dislike the need for a patched kernel on the server.

You are not the first person to express this desire, by a long shot, but I wonder, given that the servers are supposed to be dedicated Lustre servers, why are the already-patched binary kernel packages that we supply not sufficient for you?

b.

Hi,

Excellent question. In our particular situation, we have all of our servers running OFED 1.2 and the Red Hat kernel 2.6.9-55. In order to get more stability, we would like to upgrade to the latest Lustre version, but that is not possible without upgrading the kernel, InfiniBand stack and all IB-dependent libraries. In essence, we have to rebuild our entire cluster to do that, because the latest Lustre does not support our kernel, and the kernel it does support is not supported by our InfiniBand stack. It's very annoying, actually. It would be awesome if Sun supported a few kernels behind the latest and greatest.

On Wed, Oct 8, 2008 at 7:40 AM, Brian J. Murrell <Brian.Murrell at sun.com> wrote:
> You are not the first person to express this desire, by a long shot, but I wonder, given that the servers are supposed to be dedicated Lustre servers, why are the already-patched binary kernel packages that we supply not sufficient for you?
>
> b.
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss

-- 
/*
###################################################
#Josh Abadie
#HPC Systems Administrator
#High Performance Computing @ LSU
#work:225-578-8425
#cell:225-202-5633
#email: jabadi2 at gmail.com
#calendar: http://www.google.com/calendar/embed?src=jabadi2%40gmail.com&ctz=America/Chicago
###################################################
*/
"You must never give in to despair. Allow yourself to slip down that road, and you surrender to your lowest instincts. In the darkest times, hope is something you give yourself. That is the meaning of inner strength."
- 2.5, "Avatar Day"

On Wed, 2008-10-08 at 07:51 -0500, Josh Abadie wrote:
> Hi,
>
> Excellent question. In our particular situation, we have all of our servers running OFED 1.2 and the Red Hat kernel 2.6.9-55. In order to get more stability, we would like to upgrade to the latest Lustre version, but that is not possible without upgrading the kernel, InfiniBand stack and all IB-dependent libraries.

Right. But the latest version, 1.6.5.1, comes with OFED 1.3 already built -- as an installable RPM. Is that not suitable for your environment?

> It would be awesome if Sun supported a few kernels behind the latest and greatest.

The problem with that is that it increases our QA efforts exponentially. Many people think our QA cycle is already too long. Throw another few kernels at it and it would take months to QA a release.

b.

On Wed, Oct 08, 2008 at 08:40:07AM -0400, Brian J. Murrell wrote:
> You are not the first person to express this desire, by a long shot, but I wonder, given that the servers are supposed to be dedicated Lustre servers, why are the already-patched binary kernel packages that we supply not sufficient for you?

Well, one use case scenario with dedicated Lustre servers could adopt almost any kernel.

However, at the university supercomputing center, we are thinking about a scenario of a cluster file system where each computing node is also a Lustre data server, so that we can unify all the local disks into one big scratch area. Is this insane?

-- 
Lukáš Hejtmánek

On Wed, 2008-10-08 at 15:31 +0200, Lukas Hejtmanek wrote:
>
> Well, one use case scenario with dedicated Lustre servers could adopt almost any kernel.

I'm afraid I didn't grok that.

> However, at the university supercomputing center, we are thinking about a scenario of a cluster file system where each computing node is also a Lustre data server, so that we can unify all the local disks into one big scratch area. Is this insane?

Yes. If you run a Lustre client and an OST on the same machine, you can get deadlocks under memory pressure. This has been discussed many times on this list. I'm sure the archives can provide good details.

b.

How do you get involved in the QA process? I would like to get involved to speed up the efforts if possible.

TIA

On Wed, Oct 8, 2008 at 9:04 AM, Brian J. Murrell <Brian.Murrell at sun.com> wrote:
> The problem with that is that it increases our QA efforts exponentially. Many people think our QA cycle is already too long. Throw another few kernels at it and it would take months to QA a release.
>
> b.

On Thu, 2008-10-09 at 21:05 -0400, Mag Gam wrote:
> How do you get involved in the QA process?

You get a job at Sun in the Lustre Group's QA department. :-)

> I would like to get involved to speed up the efforts if possible.

It's not really something that can be sped up by external efforts. Of course, any amount of testing you can do to help find and report bugs is appreciated. We always welcome bug reports and efforts from the community.

b.

On Wed, Oct 08, 2008 at 08:40:07AM -0400, Brian J. Murrell wrote:
> On Wed, 2008-10-08 at 14:03 +0200, Lukas Hejtmanek wrote:
> > I like Lustre but I dislike the need for a patched kernel on the server.
>
> You are not the first person to express this desire, by a long shot, but I wonder, given that the servers are supposed to be dedicated Lustre servers, why are the already-patched binary kernel packages that we supply not sufficient for you?

While I think I understand why you say this, it very easily can sound like a monopolistic tactic to sell more Sun hardware.

You start with "Well, aren't our binary kernels good enough" .. which next turns to "Well we don't support vendor X's raid controller with our binaries".. next on the list is "Well, we QA everything on Sun hardware"

I think you will have a defensible case for "Use our binaries" when you can support patched Debian and Ubuntu kernel packages, and there's been discussion on the debian-kernel list about the merits of the patches.

On Fri, 2008-10-10 at 12:51 -0500, Troy Benjegerdes wrote:
>
> While I think I understand why you say this, it very easily can sound like a monopolistic tactic to sell more Sun hardware.

Heh. I'm not sure I'm going to be able to say anything that will convince you otherwise. But to your points I will say...

> You start with "Well, aren't our binary kernels good enough" .. which next turns to "Well we don't support vendor X's raid controller with our binaries"..

Our available binary kernels are those from the two biggest commercial (as in the distros that our experience tells us our customers use and want) distributions in our customer base, which are Red Hat's EL and Suse's ES kernels. This is driven by customer demand. We don't remove hardware support from those kernels. We have in the past added (i.e. newer) support for hardware such as the Qlogic QLA drivers because that's what our customers were using and demanding. That was long before we were even a Sun interest, so it's immaterial to your argument.

Currently we do replace the RHEL-supplied OFED stack, but we do that to cleanly provide the OFED 1.3 stack in return. This again was driven by customer demand, not "sell more Sun hardware" (which is a preposterous argument considering the O in OFED is for "open" and as such OFED supports everyone's hardware).

> next on the list is "Well, we QA everything on Sun hardware"

We may or may not test on Sun hardware. That's pretty irrelevant though. We don't refuse, and to the best of my knowledge never have refused, to support other people's hardware. On the contrary, I don't think anyone ever even asks what hardware a particular bug report is related to unless it's somehow material to the bug. Indeed there are peculiarities to some given pieces of hardware, but we strive (within reason) to accommodate those, not shun them. Linux generally takes care of much of that for us though.

> I think you will have a defensible case for "Use our binaries" when you can support patched Debian and Ubuntu kernel packages,

Certainly it would be nice to support everyone's kernels, but we have limited resources and our customers have told us what kernels they want support for, and that's what we support. You have to appreciate that Lustre development costs money to keep going, and that money has to come from somewhere, and currently it's coming from customers who want RHEL and SLES kernels. If there was a business case in supporting Debian/Ubuntu kernels, I think we'd be doing it.

That said, we are proud that Lustre has been able to continue as an Open Source development project and as such are happy to see the community take up the packaging of Debian/Ubuntu packages in some of the Debian distributions for the community user base.

In addition, IIRC there was an offer made on this list to include some amount of Debian/Ubuntu packaging foo in our official source repository should somebody want to contribute something. I don't think anyone has stepped up (yet). I am still hopeful.

b.

On Fri, Oct 10, 2008 at 02:23:27PM -0400, Brian J. Murrell wrote:
> On Fri, 2008-10-10 at 12:51 -0500, Troy Benjegerdes wrote:
> >
> > While I think I understand why you say this, it very easily can sound like a monopolistic tactic to sell more Sun hardware.
>
> Heh. I'm not sure I'm going to be able to say anything that will convince you otherwise. But to your points I will say...

I appreciate the effort ;)

> Certainly it would be nice to support everyone's kernels, but we have limited resources and our customers have told us what kernels they want support for and that's what we support. You have to appreciate that Lustre development costs money to keep going and that money has to come from somewhere and currently it's coming from customers who want RHEL and SLES kernels. If there was a business case in supporting Debian/Ubuntu kernels, I think we'd be doing it.
>
> That said, we are proud that Lustre has been able to continue as an Open Source development project and as such are happy to see the community take up the packaging of Debian/Ubuntu packages in some of the Debian distributions for the community user base.
>
> In addition, IIRC there was an offer made on this list to include some amount of Debian/Ubuntu packaging foo in our official source repository should somebody want to contribute something. I don't think anyone has stepped up (yet). I am still hopeful.

I went through the process of installing on Debian a month or two ago. It seems to work relatively well.

http://wiki.lustre.org/index.php?title=Debian_Install

All this effort in packaging and QA problems seems to kinda be something that would just go away with a patchless server though. Which I think leads back to having good documentation on what each patch in the set is for, and what issues it has in getting merged into upstream kernel.org.

On Fri, 2008-10-10 at 22:06 -0500, Troy Benjegerdes wrote:
>
> I appreciate the effort ;)

:-)

> I went through the process of installing on Debian a month or two ago. It seems to work relatively well.

Good to hear.

> All this effort in packaging and QA problems

I'm wondering what QA problems you are referring to.

> seems to kinda be something that would just go away with a patchless server though.

True enough, some amount of packaging effort would go away with patchless server support. Patchless server support doesn't really do anything to make QA any easier though.

But ultimately, at least currently, a patchless server would have a significant performance impact. Most of our customers, as much as they would like a patchless server, appreciate the performance gains that can be made for the patched kernel (on what should be a dedicated server anyway).

> Which I think leads back to having good documentation on what each patch in the set is for,

Sure. In an ideal world where there were no resource limitations.

> and what issues it has in getting merged into upstream kernel.org.

You can probably dig into lkml for that. Rest assured, we have tried, more than once in the past from what I understand, and were met with and tried to work through various objections each time. I won't attempt to even give opinions on why those attempts were blocked, as I was not at all involved in the effort. But we've been there and done that.

b.

so, when is this available?

On Fri, Oct 10, 2008 at 11:28 PM, Brian J. Murrell <Brian.Murrell at sun.com> wrote:
> On Fri, 2008-10-10 at 22:06 -0500, Troy Benjegerdes wrote:
>> All this effort in packaging and QA problems
>
> I'm wondering what QA problems you are referring to.
>
>> seems to kinda be something that would just go away with a patchless server though.
>
> True enough, some amount of packaging effort would go away with patchless server support. Patchless server support doesn't really do anything to make QA any easier though.
>
> But ultimately, at least currently, a patchless server would have a significant performance impact. Most of our customers, as much as they would like a patchless server, appreciate the performance gains that can be made for the patched kernel (on what should be a dedicated server anyway).
>
>> Which I think leads back to having good documentation on what each patch in the set is for,
>
> Sure. In an ideal world where there were no resource limitations.
>
>> and what issues it has in getting merged into upstream kernel.org.
>
> You can probably dig into lkml for that. Rest assured, we have tried, more than once in the past from what I understand, and were met with and tried to work through various objections each time. I won't attempt to even give opinions on why those attempts were blocked, as I was not at all involved in the effort. But we've been there and done that.
>
> b.

On Sat, 2008-10-11 at 01:01 -0400, Mag Gam wrote:
> so, when is this available?

When is *what* available?

b.

On Oct 08, 2008 07:51 -0500, Josh Abadie wrote:
> Excellent question. In our particular situation, we have all of our servers running OFED 1.2 and the Red Hat kernel 2.6.9-55. In order to get more stability, we would like to upgrade to the latest Lustre version, but that is not possible without upgrading the kernel, InfiniBand stack and all IB-dependent libraries. In essence, we have to rebuild our entire cluster to do that, because the latest Lustre does not support our kernel, and the kernel it does support is not supported by our InfiniBand stack. It's very annoying, actually. It would be awesome if Sun supported a few kernels behind the latest and greatest.

Why not just rebuild Lustre against your current kernel?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

I suspect the big problem is that when you modify the kernel, you lose support from vendors. That's at least our case.

On Sat, Oct 11, 2008 at 12:03 PM, Andreas Dilger <adilger at sun.com> wrote:
> Why not just rebuild Lustre against your current kernel?
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.

On Oct 10, 2008 22:06 -0500, Troy Benjegerdes wrote:
> All this effort in packaging and QA problems seems to kinda be something that would just go away with a patchless server though. Which I think leads back to having good documentation on what each patch in the set is for, and what issues it has in getting merged into upstream kernel.org.

Whether the kernel is patched or not is almost irrelevant to the testing of Lustre. Yes, it's true that there are sometimes bugs in our kernel patches, but I don't think that makes up a significant portion of our testing efforts.

As for a patchless kernel, I recently did an analysis of our server patches, and it seems possible that we could remove kernel patches with Lustre 2.0 if development effort is put in that direction. As yet, this hasn't been a priority for any of our customers, so it by necessity takes a back seat to implementing other features they are interested in. That doesn't mean it won't happen, but rather that it will happen on an "as possible" basis.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

On Fri, Oct 10, 2008 at 11:28:24PM -0400, Brian J. Murrell wrote:
> But ultimately, at least currently, a patchless server would have a significant performance impact. Most of our customers, as much as they would like a patchless server, appreciate the performance gains that can be made for the patched kernel (on what should be a dedicated server anyway).

Well, patches that give a general, significant performance boost should go to mainstream, am I wrong? Does it cost more resources to try to merge them into the mainstream than to maintain them separately?

Would there be a significant performance loss if the server were moved completely to user space? Then there would be minimal problems with kernels.

-- 
Lukáš Hejtmánek

2008/10/14 Lukas Hejtmanek <xhejtman at ics.muni.cz>:
> On Fri, Oct 10, 2008 at 11:28:24PM -0400, Brian J. Murrell wrote:
>> But ultimately, at least currently, a patchless server would have a significant performance impact. Most of our customers, as much as they would like a patchless server, appreciate the performance gains that can be made for the patched kernel (on what should be a dedicated server anyway).
>
> Well, patches that give a general, significant performance boost should go to mainstream, am I wrong? Does it cost more resources to try to merge them into the mainstream than to maintain them separately?

I think you misunderstood Brian. The server patches aren't really "general performance" patches; they are specific to Lustre, and the mainline kernel people won't accept patches like that without an in-kernel user, among other reasons... The ldiskfs work is being integrated and forms a large chunk of ext4.

> Would there be a significant performance loss if the server were moved completely to user space? Then there would be minimal problems with kernels.

That is/was the plan last time I heard... not sure what the current status is... search bugzilla for uoss/umds