Hi,

This seems like a good place to ask my somewhat arcane questions about Lustre development. These are intended not only as "is this officially planned?", but also "is anyone else coding this?" and "if I were to work on this, am I more likely to see interest from others, or just gentlemen with a white jacket for me to try on?"

First up, underlying network protocols. At present, if I understand it correctly, Lustre uses TCP as the underlying transport. These days, Linux sports other reliable, TCP-like protocols (such as DCCP), but it is unclear whether Lustre would gain anything from using them. How hard would it be to add support for such protocols? Is there any current work on determining whether there would be an obvious benefit in providing that support?

Associated with this is something I noted in the docs: there's a lot of RDMA and other fairly hefty network activity. This is "obvious" enough. However, it must get very complicated if many machines attempt to access relatively nearby regions of disk. Other than the group I'm working with, I know of nobody doing RDMA over reliable multicast - it's all unicast. The upshot is that it would be hard to escape performance degradation that grows with the number of parallel readers. The reads would need to be in parallel, as any staggering in the access would mean only one node would listen to the multicast anyway. As such, both the problem and the solution are special cases, and thus are simply not going to coincide that often. The questions would then be: "how often do these cases coincide in real-world situations?", and, assuming there are enough incidents to make the solution interesting, "would the added weight outweigh any benefits it might have for Lustre?"

Now onto the even dodgier ground of third-party extensions to Linux. There are two that look like they might be of interest in a Lustre-enabled world.
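To make the parallel-read concern concrete, here is a toy back-of-envelope model (my own sketch, not anything from the Lustre code) comparing the bytes a server must transmit when N clients read the same region over unicast versus reliable multicast. The `overlap` parameter is a made-up knob standing in for how many of the reads are simultaneous enough to share one transmission:

```python
# Toy model (hypothetical, not Lustre code): server transmit load when
# n_clients read the same region_size bytes concurrently.

def unicast_bytes(n_clients: int, region_size: int) -> int:
    # Unicast: each client receives its own copy of the data.
    return n_clients * region_size

def multicast_bytes(n_clients: int, region_size: int, overlap: float = 1.0) -> int:
    # 'overlap' is the fraction of clients whose reads coincide closely
    # enough to share one multicast transmission; the stragglers fall
    # back to individual unicast copies.
    sharing = int(n_clients * overlap)
    stragglers = n_clients - sharing
    shared = region_size if sharing > 0 else 0
    return shared + stragglers * region_size

if __name__ == "__main__":
    size = 1 << 20  # 1 MiB region
    for n in (1, 16, 256):
        print(n, unicast_bytes(n, size), multicast_bytes(n, size))
```

With perfect overlap the multicast cost stays flat at one region's worth of bytes while the unicast cost grows linearly with client count; with zero overlap the two are identical, which is exactly the staggered-access case where only one node is listening.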
The first is ABISS - the Active Block I/O Scheduling System - which purports to provide soft real-time and prioritized best-effort block-device access, with a better scheduler than CFQ (which I notice Lustre patches). It might be interesting to see whether ABISS' scheduler works better for Lustre than CFQ.

The second extension is SGI's "Scheduled Transfer Protocol". At the time it was developed, nobody could think of a use for it that couldn't be done better by other means, so they dropped it. However, it occurs to me that a clustered filesystem would be a case in which pre-negotiation would have some value. There may, therefore, be code that could be salvaged from STP and placed into Lustre, or at least ideas that could be recycled from one of their papers on the protocol.

Finally, there's optimization and bug-hunting. (Bugs? No, no bugs here! :) Sandia has an interesting project called DAKOTA, for analyzing and profiling massively parallel projects. Web100 does TCP instrumentation. kTAU provides yet another set of kernel profiling tools. It would seem that providing hooks to one or more of these projects would allow both internal and external developers to compile a greater range of statistics on what happens inside the Lustre framework. (The Linux Trace Toolkit would have been interesting, but it would require serious voodoo to get its corpse re-animated.)
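As a sketch of what such instrumentation hooks might look like (purely illustrative; none of these names exist in Lustre or in the tools mentioned), here is a minimal counter-and-callback registry that an external profiler, Web100- or kTAU-style, could subscribe to:

```python
# Illustrative sketch only: a minimal statistics-hook registry of the kind
# a profiling tool might subscribe to. The event names and class are
# hypothetical and do not come from the Lustre code.
from collections import defaultdict
from typing import Callable

class StatsHooks:
    def __init__(self) -> None:
        self._counters: dict[str, int] = defaultdict(int)
        self._subscribers: dict[str, list[Callable[[str, int], None]]] = defaultdict(list)

    def subscribe(self, event: str, callback: Callable[[str, int], None]) -> None:
        # An external profiler registers interest in one event class.
        self._subscribers[event].append(callback)

    def fire(self, event: str, amount: int = 1) -> None:
        # Called at an instrumentation point, e.g. on each bulk transfer.
        self._counters[event] += amount
        for cb in self._subscribers[event]:
            cb(event, self._counters[event])

    def snapshot(self) -> dict[str, int]:
        # A profiler can also poll all counters in one go.
        return dict(self._counters)

if __name__ == "__main__":
    hooks = StatsHooks()
    hooks.subscribe("rdma.read.bytes", lambda ev, total: print(ev, total))
    hooks.fire("rdma.read.bytes", 4096)
    hooks.fire("rdma.read.bytes", 4096)
    print(hooks.snapshot())
```

The point of the pattern is that the filesystem only ever calls `fire()` at its instrumentation points; whether anyone is counting, and which tool consumes the numbers, is decided entirely by what has been subscribed, so hooks for DAKOTA, Web100, or kTAU could coexist without touching the core code paths.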