On 2/3/2005 18:07, Christopher Alexander Stein wrote:
>
> From some lustre documentation: "Lustre uses the Portals software
> originally developed at Sandia Labs to provide a network
> abstraction layer that simplifies using Lustre across multiple
> types of networks."
>
> Isn't this network abstraction exactly why IP, TCP, and UDP
> exist? Why portals?

Yes and no. TCP is a useful abstraction for a lot of purposes, but it's far
from perfect for a high-performance cluster file system.

For organizations that invest in high-end cluster networking gear --
Quadrics Elan, InfiniBand, SCI, Myrinet, etc. -- we can do a _lot_ better
than TCP. We can provide 300% or 400% better performance over native
InfiniBand than we can over TCP over IB. The same holds for Quadrics.

And because these interconnects support RDMA for zero-copy transmit and
receive, we do it with a fraction of the CPU overhead that TCP imposes on
us.

The next question is often "but what about TCP offload cards?" We
experimented with a few cards and found them severely lacking. They didn't
go far enough -- or change enough of the kernel APIs -- to eliminate the
really costly parts of the protocol, and the memory copies.

In short: to reach our performance targets, we could not always accept the
costs and limitations of TCP. We needed a different abstraction layer that
could be efficient on a variety of specialized hardware. Because we also
want to support commodity interconnects like Ethernet, we of course
implement TCP/IP as one such Portals backend.

-Phil