Hi again,
some good news and some bad news.
Lustre seems to work without complaints so far, and bonnie++
happily runs.
On the other hand, we notice that on the MDS system the used memory is
constantly growing, and dmesg says:
[ 8521.310497] VFS: file-max limit 759122 reached
MemTotal: 8207172 kB
MemFree: 7266360 kB sinking
Buffers: 240348 kB
Cached: 10848 kB
SwapCached: 0 kB
Active: 73860 kB
Inactive: 188220 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 8207172 kB
LowFree: 7266360 kB sinking
SwapTotal: 4208888 kB
SwapFree: 4208888 kB
Dirty: 2048 kB
Writeback: 0 kB
Mapped: 16524 kB
Slab: 337072 kB growing
CommitLimit: 8312472 kB
Committed_AS: 83052 kB growing slowly
PageTables: 832 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 5924 kB
VmallocChunk: 34359732423 kB
I fear the kernel allocates struct file somewhere and doesn't free it
again. I had to do some guessing on the new it_file member, so that is
the likeliest spot.
MfG
Goswin
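
A minimal sketch (not part of the original thread) of how the symptoms above could be watched from user space. It assumes a Linux system where /proc/sys/fs/file-nr holds three numbers (allocated file handles, unused handles, file-max) and where /proc/meminfo has a Slab line. The function names parse_file_nr, parse_meminfo, snapshot and watch are made up for illustration.

```python
import time


def parse_file_nr(text):
    """Parse /proc/sys/fs/file-nr: allocated handles, unused handles, file-max."""
    allocated, unused, maximum = (int(field) for field in text.split()[:3])
    return allocated, unused, maximum


def parse_meminfo(text):
    """Parse /proc/meminfo-style lines into {name: kB}.

    Anything after the number (the unit, or a note such as "growing")
    is ignored, so the annotated listing above parses as well.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def snapshot():
    """Read the live counters once (Linux only)."""
    with open("/proc/sys/fs/file-nr") as f:
        allocated, _unused, maximum = parse_file_nr(f.read())
    with open("/proc/meminfo") as f:
        mem = parse_meminfo(f.read())
    return {"files": allocated, "file_max": maximum, "slab_kb": mem.get("Slab")}


def watch(interval=10, count=6):
    """Print `count` snapshots, `interval` seconds apart."""
    for _ in range(count):
        print(snapshot())
        time.sleep(interval)
```

Calling watch() on the MDS while bonnie++ runs would show whether the allocated file handle count and the slab size climb together, which is what a struct file leak should look like.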
Hi,
Roland Fehrenbacher tasked me with updating the Lustre kernel patches and
Lustre to work with 2.6.15 kernels, and so far it looks like I
succeeded. We started formatting with 2 nodes but haven't tested with
an actual client yet (object storage is done, metadata is still
running).
If anyone is interested, you can download the changes from
http://mrvn.homeip.net/lustre/
I would very much welcome it if someone with more Lustre experience could
look over the changes to see if I guessed wrong anywhere.
MfG
Goswin
Erich Focht
2006-May-19 07:36 UTC
[Lustre-discuss] Update for LustreFS with 2.6.15-vanilla
Hi Goswin,

is the file leak which you fixed specific to 2.6.15? Or is it a problem on
other kernels, too? I'm playing with the RHEL4 2.6.9-22.0.2.EL kernel and am
wondering whether I should expect trouble.

Thanks,
best regards,
Erich

On Monday 06 March 2006 15:37, Goswin von Brederlow wrote:
> Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:
>
> > If anyone is interested you can download the changes from
> > http://mrvn.homeip.net/lustre/
>
> I tracked down the file leak in the kernel and corrected the
> vfs_intent-2.6.15.patch. With this problem solved real testing can
> commence.
>
> MfG
> Goswin

--
Dr. Erich Focht
Solution Architecture Group, Linux R&D
NEC High Performance Computing Europe
Stuttgart, Germany
Peter Kjellström
2006-May-19 07:36 UTC
On Monday 06 March 2006 16:16, Goswin von Brederlow wrote:
> Peter Kjellström <cap@nsc.liu.se> writes:
> > On Monday 06 March 2006 15:37, Goswin von Brederlow wrote:
> >> Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:
> >> > If anyone is interested you can download the changes from
> >> > http://mrvn.homeip.net/lustre/
> >>
> >> I tracked down the file leak in the kernel and corrected the
> >> vfs_intent-2.6.15.patch. With this problem solved real testing can
> >> commence.
> >
> > Good job there, maybe I should try that and see if my performance
> > problems go away :-)
> >
> > /Peter
>
> I doubt it has any impact on performance. Much more important is the
> hardware support for newer x86_64 cpus and boards.

I'll agree with that, but this is more like an ext3 bug, I think, causing
extremely bad performance. There may or may not be a relevant patch added
between 2.6.9 rhel4 and 2.6.15.

Anyways, I'm very glad to see someone working on mainline support.

/Peter

> MfG
> Goswin

--
------------------------------------------------------------
Peter Kjellström               | E-mail: cap@nsc.liu.se
National Supercomputer Centre  |
Sweden                         | http://www.nsc.liu.se
Goswin von Brederlow
2006-May-19 07:36 UTC
Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:

> If anyone is interested you can download the changes from
> http://mrvn.homeip.net/lustre/

I tracked down the file leak in the kernel and corrected the
vfs_intent-2.6.15.patch. With this problem solved real testing can
commence.

MfG
Goswin
Peter Kjellström
2006-May-19 07:36 UTC
On Monday 06 March 2006 15:37, Goswin von Brederlow wrote:
> Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:
> > If anyone is interested you can download the changes from
> > http://mrvn.homeip.net/lustre/
>
> I tracked down the file leak in the kernel and corrected the
> vfs_intent-2.6.15.patch. With this problem solved real testing can
> commence.

Good job there, maybe I should try that and see if my performance
problems go away :-)

/Peter

> MfG
> Goswin

--
------------------------------------------------------------
Peter Kjellström               |
National Supercomputer Centre  |
Sweden                         | http://www.nsc.liu.se
Goswin von Brederlow
2006-May-19 07:36 UTC
Erich Focht <efocht@hpce.nec.com> writes:

> Hi Goswin,
>
> is the file leak which you fixed specific to 2.6.15? Or is it a problem on
> other kernels, too? I'm playing with the RHEL4 2.6.9-22.0.2.EL kernel and am
> wondering whether I should expect trouble.
>
> Thanks,
> best regards,
> Erich

Specific to 2.6.15. The problem was in filp_open(), where the code
changed between 2.6.12 and 2.6.15, and I at first failed to adapt the
patch properly to that change.

The leakage was extreme, a few thousand files a second on a running
system. Everyone would have noticed this.

MfG
Goswin
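
To put "a few thousand files a second" in perspective, here is a rough sketch (not from the thread) that turns two samples of the allocated file handle count into a leak rate and an estimate of how long the system has before it hits file-max. The helper names leak_rate and seconds_until_limit are hypothetical, and the numbers in the usage note are illustrative apart from the file-max value of 759122 reported by dmesg earlier in the thread.

```python
def leak_rate(samples):
    """Average growth in allocated file handles per second.

    samples: list of (timestamp_seconds, allocated_handles), oldest first.
    Only the first and last sample are used.
    """
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples with increasing timestamps")
    return (n1 - n0) / (t1 - t0)


def seconds_until_limit(allocated, file_max, rate):
    """Seconds left before file-max is reached at `rate` handles per second.

    Returns None when the count is not growing.
    """
    if rate <= 0:
        return None
    return max(file_max - allocated, 0) / rate
```

For example, an assumed growth from 1000 to 31000 handles over 10 seconds gives leak_rate([(0, 1000), (10, 31000)]) == 3000.0, and at that rate a limit of 759122 would be reached roughly four minutes later. A leak of that size is hard to miss, which fits the remark that everyone would have noticed it.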
Goswin von Brederlow
2006-May-19 07:36 UTC
Peter Kjellström <cap@nsc.liu.se> writes:

> On Monday 06 March 2006 15:37, Goswin von Brederlow wrote:
>> Goswin von Brederlow <brederlo@informatik.uni-tuebingen.de> writes:
>> > If anyone is interested you can download the changes from
>> > http://mrvn.homeip.net/lustre/
>>
>> I tracked down the file leak in the kernel and corrected the
>> vfs_intent-2.6.15.patch. With this problem solved real testing can
>> commence.
>
> Good job there, maybe I should try that and see if my performance problems
> go away :-)
>
> /Peter

I doubt it has any impact on performance. Much more important is the
hardware support for newer x86_64 cpus and boards.

MfG
Goswin