Eric Barton
2006-Nov-24 11:37 UTC
[Lustre-discuss] RE: portals/lnet as an abstraction over rdma and/or verbs
Moiz, Dave,

My previous comments may be more of an insight into me than LNET :) The concept of getting LNET more widely adopted certainly has support within CFS.

> We could possibly make a case to have a enterprise fork that
> is somewhere in the middle of Portals and LNET.

I'd push LNET itself rather than any derivative if we're going to head in this direction.

> If we can
> make a case that this truly provides higher throughput/lower
> latency than IPoIB or SDP. Eric/Peter, do you guys have any
> measurements that may substantiate this performance assumption?

Performance comparisons are fraught with danger viz. hardware/firmware/software revisions. You really have to run dedicated tests on the same hardware before you can compare with confidence.

One thing I haven't mentioned is that LNET has both kernel and userspace implementations. These share the bulk of the network-independent code, but the LND implementations are not shared. Currently we only support TCP/IP and the native Cray XT3 network in userspace. It would be quite easy to add a system call interface to export the kernel LNET API to userspace, but dedicated userspace LND versions would be required to deliver the lowest latency you'd expect from OS bypass.

Cheers,
Eric

---------------------------------------------------
|Eric Barton        Barton Software               |
|9 York Gardens     Tel:    +44 (117) 330 1575    |
|Clifton            Mobile: +44 (7909) 680 356    |
|Bristol BS8 4LL    Fax:    call first            |
|United Kingdom     E-Mail: eeb@bartonsoftware.com|
---------------------------------------------------
Scott Atchley
2006-Nov-24 12:35 UTC
[Lustre-discuss] RE: portals/lnet as an abstraction over rdma and/or verbs
>> If we can
>> make a case that this truly provides higher throughput/lower
>> latency than IPoIB or SDP. Eric/Peter, do you guys have any
>> measurements that may substantiate this performance assumption?
>
> Performance comparisons are fraught with danger viz.
> hardware/firmware/software revisions. You really have to run
> dedicated tests on the same hardware before you can compare with
> confidence.
>
> One thing I haven't mentioned is that LNET has both kernel and
> userspace implementations. These share the bulk of the
> network-independent code, but the LND implementations are not
> shared. Currently we only support TCP/IP and the native Cray XT3
> network in userspace. It would be quite easy to add a system call
> interface to export the kernel LNET API to userspace, but dedicated
> userspace LND versions would be required to deliver the lowest
> latency you'd expect from OS bypass.
>
> Cheers,
> Eric

In the MX (Myrinet Express) LND, the MX API itself is identical in kernel-space and user-space. The differences are in the LND only, and concern thread creation and synchronization methods (kernel threads vs pthreads, spinlocks vs pthread_mutex_lock(), etc.). Latency and bandwidth should be equivalent in user and kernel space, with the exception below.

The only performance difference between MX in the kernel and in user-space is the handling of multiple segments. In the kernel, we made optimizations for Lustre to handle 256 kernel pages, for example. In user-space, MX is not similarly optimized and would try to copy the segments into a single buffer before sending. This cuts bandwidth by half on 10 Gb/s fabrics. If LNET were pushed into user-space, we would look at providing this optimization.

Scott
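
To make the multi-segment point concrete, here is a minimal sketch of the two send strategies. It does not use the MX API at all: POSIX write/writev over a pipe (through Python's os module) stands in for the network send, and the function names send_coalesced, send_gathered and roundtrip are hypothetical, chosen only for this illustration. The segment count and size are also kept small so the data fits in a pipe buffer; they are not the 256 kernel pages mentioned above.

```python
import os


def send_coalesced(fd, segments):
    """Copy every segment into one staging buffer, then send that buffer.

    The join is an extra pass over all payload bytes before anything is sent.
    """
    staging = b"".join(segments)
    return os.write(fd, staging)


def send_gathered(fd, segments):
    """Hand the segment list to the OS as-is (gather I/O), with no staging copy
    made by the caller."""
    return os.writev(fd, segments)


def roundtrip(send, segments):
    """Send the segments through a pipe and return (bytes_sent, bytes_received).

    The pipe is only a stand-in for a network endpoint (an assumption of this
    sketch), so keep the total payload well under the pipe buffer size.
    """
    read_fd, write_fd = os.pipe()
    try:
        try:
            sent = send(write_fd, segments)
        finally:
            os.close(write_fd)  # closing the writer lets the reader see EOF
        received = bytearray()
        while True:
            chunk = os.read(read_fd, 65536)
            if not chunk:
                break
            received += chunk
    finally:
        os.close(read_fd)
    return sent, bytes(received)


if __name__ == "__main__":
    # 8 segments of 256 bytes each, every segment filled with a distinct byte.
    segments = [bytes([i]) * 256 for i in range(8)]
    for send in (send_coalesced, send_gathered):
        sent, data = roundtrip(send, segments)
        print(send.__name__, sent, data == b"".join(segments))
```

Both paths deliver the same bytes; the difference is only that the coalescing path touches every payload byte once more before the send starts, which is the kind of extra work the user-space case above would incur.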