So does Lustre work without having IP over IB enabled? I seem to be having problems with this.

LustreError: 8847:0:(o2iblnd.c:1569:kiblnd_startup()) Can't query IPoIB interface ib0: it's down
LustreError: 105-4: Error -100 starting up LNI o2ib
LustreError: 8847:0:(events.c:707:ptlrpc_init_portals()) network initialisation failed

Our security plan for the box prevents us from having IP over the IB. Is there any way to run Lustre over the IB without IP, like Quadrics can?

Thanks,
- David Brown
On Jun 23, 2008 11:44 -0700, David Brown wrote:
> So does lustre work without having IP over IB enabled? I seem to be
> having problems with this.
>
> LustreError: 8847:0:(o2iblnd.c:1569:kiblnd_startup()) Can't query
> IPoIB interface ib0: it's down
> LustreError: 105-4: Error -100 starting up LNI o2ib
> LustreError: 8847:0:(events.c:707:ptlrpc_init_portals()) network
> initialisation failed
>
> Our security plan for the box prevents us from having IP over the IB
> is there any way to run lustre over the IB without IP like Quadrics
> can?

The o2iblnd (as with all other IB LNDs) uses IPoIB addresses to identify nodes. It doesn't use IPoIB to do any communication, however.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
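To illustrate the point about identification: LNet names each node with a NID of the form address@network, and for o2ib the address part is the IPv4 address of the IPoIB interface. The sketch below only splits and validates such a string. The `Nid` type, the `parse_nid` helper, and the example address are hypothetical and are not part of Lustre.

```python
import ipaddress
from typing import NamedTuple


class Nid(NamedTuple):
    """Hypothetical container for the two halves of an LNet NID string."""
    address: str   # IPv4 address taken from the IPoIB interface
    network: str   # LNet network name, e.g. "o2ib" or "o2ib0"


def parse_nid(nid: str) -> Nid:
    """Split 'address@network' and check that the address is valid IPv4."""
    address, sep, network = nid.partition("@")
    if not sep or not address or not network:
        raise ValueError(f"not in address@network form: {nid!r}")
    # Raises a ValueError subclass if the address is not a valid IPv4 address.
    ipaddress.IPv4Address(address)
    return Nid(address, network)


if __name__ == "__main__":
    # Example address is made up for illustration.
    print(parse_nid("192.168.0.10@o2ib"))
```

This is why the interface has to be up with an address assigned even though no data travels over IPoIB: the address is only used as the node's name on the o2ib network.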
Guy Coates
2008-Jun-25 13:30 UTC
[Lustre-discuss] seek heavy workloads and client page cache
I have a seek heavy workload on a couple of files which are small enough to fit into the client page cache. If I run the workload on local disk, the first run is comparatively slow but subsequent runs are much faster as the files are all cached.

If I run the same workload over lustre the initial and subsequent run-times are always the same (~10-15% slower than the local disk with no caching).

It looks as if there is some attempt at caching; during the initial lustre run the client page cache grows and in subsequent runs there is no disk IO on the OSTs. However, the caching does not improve performance.

I am running lustre 1.6.4.3 / kernel 2.6.18 x86 on debian etch.

strace snippet of the workload:

read(5, "Q1CJ86.1/2-397 2 397 Q1CJ86.1\n", 2097152) = 30
_llseek(4, 92274688, [92274688], SEEK_SET) = 0
read(4, "z\0\0\0\0\0\0\1\217Q1AWM4.1\0\0\0\0\0\0\0S6\254\20\0\0"..., 2097152) = 2097152
_llseek(4, 46137344, [46137344], SEEK_SET) = 0
read(4, "\0\1IA5L198.1\0\0\0\0\0\0\0B\333\245\210\0\0\0\0B\333\245"..., 2097152) = 2097152
_llseek(4, 69206016, [69206016], SEEK_SET) = 0
read(4, "\0\0\0\0\3\6A8D589.1\0\0\0\0\0\0\0p\3253\206\0\0\0\0p\325"..., 2097152) = 2097152
_llseek(4, 79691776, [79691776], SEEK_SET) = 0
read(4, "dQ06L05.1\0\0\0\0\0\0\0q\37A\376\0\0\0\0q\37B&\0\0\0\0"..., 2097152) = 2097152
_llseek(4, 85983232, [85983232], SEEK_SET) = 0
read(4, "\0\0\0\1\254Q0RI72.1\0\0\0\0\0\0\0Q\22C\251\0\0\0\0Q\22"..., 2097152) = 2097152
_llseek(4, 90177536, [90177536], SEEK_SET) = 0
read(4, "Y2.1\0\0\0\0\0\0\0q\222+\10\0\0\0\0q\222+>\0\0\0\0\0\0"..., 2097152) = 2097152
_llseek(4, 92274688, [92274688], SEEK_SET) = 0
_llseek(4, 92274688, [92274688], SEEK_SET) = 0
read(4, "z\0\0\0\0\0\0\1\217Q1AWM4.1\0\0\0\0\0\0\0S6\254\20\0\0"..., 2097152) = 2097152
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(4, 94371840, [94371840], SEEK_SET) = 0
_llseek(3, 1402994688, [1402994688], SEEK_SET) = 0
read(3, "NGLNDFIVINEKYNTTIASRKN\nSVGMTVIDD"..., 2097152) = 2097152
_llseek(3, 1405091840, [1405091840], SEEK_SET) = 0

Cheers,
Guy
--
Dr. Guy Coates, Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1HH, UK
Tel: +44 (0)1223 834244 x 6925
Fax: +44 (0)1223 496802

--
The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
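The cold-versus-warm comparison described above can be reproduced with a small timing harness. This is a minimal sketch, assuming a seek-then-read pattern like the one in the strace output; the function names, the random offsets, and the default of 100 seeks are hypothetical and do not come from the original application.

```python
import os
import random
import time


def timed_pass(path, offsets, read_size):
    """Seek to each offset, read up to read_size bytes, return (seconds, bytes)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 avoids Python-level buffering so each read is a real syscall.
    with open(path, "rb", buffering=0) as f:
        for off in offsets:
            f.seek(off)
            total += len(f.read(read_size))
    return time.perf_counter() - start, total


def compare_runs(path, read_size=2 * 1024 * 1024, n_seeks=100, seed=0):
    """Run the same seek/read pattern twice; the second pass should hit the cache."""
    size = os.path.getsize(path)
    rng = random.Random(seed)
    offsets = [rng.randrange(0, max(1, size)) for _ in range(n_seeks)]
    first = timed_pass(path, offsets, read_size)
    second = timed_pass(path, offsets, read_size)
    return first, second
```

Running `compare_runs` once with `read_size` set to 2 MB and once with a few kilobytes, on both local disk and Lustre, separates the cost of caching from the cost of copying large buffers on every read.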
Andreas Dilger
2008-Jun-25 23:13 UTC
[Lustre-discuss] seek heavy workloads and client page cache
On Jun 25, 2008 14:30 +0100, Guy Coates wrote:
> I have a seek heavy workload on a couple of files which are small
> enough to fit into the client page cache. If I run the workload on
> local disk, the first run is comparatively slow but subsequent runs
> are much faster as the files are all cached.

> strace snippet of the workload:
> read(4, "z\0\0\0\0\0\0\1\217Q1AWM4.1\0\0\0\0\0\0\0S6\254\20\0\0"..., 2097152) = 2097152

The problem is that your application is using the st_blocksize reported from Lustre (2MB) as its IO size, and this will hurt if you are continuously reading that much data, even from the kernel cache. The application should really only read as much data as it needs.

It is possible to tune this by specifying a small stripe size (min 64kB), which will translate into a small st_blocksize for the file, but it still isn't as good as fixing the application.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
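As a rough illustration of "only read as much data as it needs", the sketch below reads a record of known offset and length instead of a full block-size buffer. It assumes the application can learn each record's offset and length (for example from an index); `read_record`, `preferred_io_size`, and the 64 kB cap are hypothetical names and values chosen for illustration. Note that the field is spelled `st_blksize` in POSIX `struct stat` and in Python's `os.stat` result.

```python
import os


def read_record(fd, offset, length):
    """Read exactly `length` bytes at `offset` (fewer only at end of file)."""
    chunks = []
    remaining = length
    while remaining > 0:
        # os.pread reads at an absolute offset without moving the file position.
        chunk = os.pread(fd, remaining, offset)
        if not chunk:  # end of file reached
            break
        chunks.append(chunk)
        offset += len(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def preferred_io_size(fd, cap=64 * 1024):
    """Use st_blksize as a hint, but never exceed `cap` bytes per read."""
    return min(os.fstat(fd).st_blksize, cap)
```

If the record length is not known in advance, capping the buffer with something like `preferred_io_size` is a fallback: the file system's hint is still honoured when it is small, but a 2 MB hint no longer forces a 2 MB copy on every seek.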
Is there any way to set the address explicitly, say through a modprobe.conf entry, instead of having it probe the interface for it? This cluster cannot have IPoIB on it.

Thanks,
Kevin

On Tue, 2008-06-24 at 20:34 -0700, Andreas Dilger wrote:
> On Jun 23, 2008 11:44 -0700, David Brown wrote:
> > So does lustre work without having IP over IB enabled? I seem to be
> > having problems with this.
> >
> > LustreError: 8847:0:(o2iblnd.c:1569:kiblnd_startup()) Can't query
> > IPoIB interface ib0: it's down
> > LustreError: 105-4: Error -100 starting up LNI o2ib
> > LustreError: 8847:0:(events.c:707:ptlrpc_init_portals()) network
> > initialisation failed
> >
> > Our security plan for the box prevents us from having IP over the IB
> > is there any way to run lustre over the IB without IP like Quadrics
> > can?
>
> The o2iblnd (as with all other IB LNDs) uses IPoIB addresses to
> identify nodes. It doesn't use IPoIB to do any communication, however.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.