search for: o2ib

Displaying 18 results from an estimated 18 matches for "o2ib".

2007 Nov 12
8
More failover issues
In 1.6.0, when creating an MDT, you could specify multiple --mgsnode options and it would fail over between them. 1.6.3 only seems to take the last one, and --mgsnode=192.168.1.252@o2ib:192.168.1.253@o2ib doesn't seem to fail over to the other node. Any ideas how to get around this? Robert Robert LeBlanc College of Life Sciences Computer Support Brigham Young University leblanc at byu.edu (801)422-1882
2008 Mar 07
2
Multihomed question: want Lustre over IB and Ethernet
...2:03:17 2008 Subject: Re: [Lustre-discuss] Multihomed question: want Lustre over IB and Ethernet On Fri, Mar 7, 2008 at 9:39 AM, Craig Prescott <prescott at hpc.ufl.edu> wrote: > > I think your client modprobe.conf lnet option > should be this: > > > options lnet networks=o2ib(ib0) > > (not 'o2ib0'). It still seems to want the TCP connection: Lustre: Added LNI 36.122.255.1@o2ib [8/64] Lustre: Lustre Client File System; info at clusterfs.com LustreError: 11043:0:(events.c:401:ptlrpc_uuid_to_peer()) No NID found for 36.121.255.201@tcp Lustr...
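The log lines in this snippet show a client that owns an o2ib NID but is asked to reach a peer on tcp. A NID has the form `address@network`, and as far as I understand LNet naming, a network without a trailing number means network 0 (so `o2ib` and `o2ib0` name the same network). The following Python sketch only illustrates that comparison; `Nid`, `parse_nid` and `directly_reachable` are hypothetical helper names, not part of Lustre, and real LNet can also use routes, which this ignores.

```python
import re
from typing import NamedTuple


class Nid(NamedTuple):
    address: str  # e.g. "36.122.255.1"
    lnd: str      # network type, e.g. "o2ib" or "tcp"
    netnum: int   # network number; assumed to default to 0 when omitted


def parse_nid(text: str) -> Nid:
    """Split 'address@network' into address, network type and number."""
    address, sep, net = text.partition("@")
    # The type must end in a letter so that the '2' in 'o2ib' stays in the name.
    match = re.fullmatch(r"([a-z0-9]*[a-z])(\d*)", net)
    if not sep or not address or match is None:
        raise ValueError(f"not a NID: {text!r}")
    return Nid(address, match.group(1), int(match.group(2) or 0))


def directly_reachable(local_nids, target: str) -> bool:
    """True if any local NID is on the same network as the target NID."""
    wanted = parse_nid(target)
    return any(
        (nid.lnd, nid.netnum) == (wanted.lnd, wanted.netnum)
        for nid in map(parse_nid, local_nids)
    )


if __name__ == "__main__":
    local = ["36.122.255.1@o2ib"]
    print(directly_reachable(local, "36.121.255.201@tcp"))
    print(directly_reachable(local, "36.121.255.201@o2ib0"))
```

With only an o2ib NID configured locally, the tcp target has no matching local network in this model, which is consistent with the "No NID found" message quoted above.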
2013 Mar 26
1
Lustre 2.2 with centos 6.3 gives problem while loading o2ib module for infiniband
Dear All, we are facing problem while connecting o2ib module. Lustre 2.2 with centos 6.3 gives problem while loading o2ib module for infiniband. Thanks in advance Regards, Faheem Patel -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.lustre.org/pipermail/lustre-discuss/attachments/20130326/d0eb1e39/atta...
2010 Jun 22
7
lnet infiniband config
Hi all, I'm getting my feet wet in the infiniband lake and of course I run into some problems. It would seem I got the compilation part of sles11 kernel 2.6.27 + Lustre 1.8.3 + ofed 1.4.2 right, because it allows me to see and use the infiniband fabric, and because ko2iblnd loads without any complaints. In /etc/modprobe.d/lustre (this is a Debian system, hence this subdir of modprobe-configs), I have > options ip2nets="o2ib0 192.168.0.[1-5]" I load lnet and do 'lctl network up', but then 'lctl list_nids' will invari...
2006 Sep 25
4
Re: [openib-general] problems with lustre o2ib module & ofed
...kernel is 2.6.16.21-0.8-default and not 2.6.16.21-0.8-smp Thierry. On Mon, 25 Sep 2006, Thierry Delaitre wrote: > > On Mon, 25 Sep 2006, Michael S. Tsirkin wrote: > > > Quoting r. Thierry Delaitre <delaitt@cpc.wmin.ac.uk>: > > > > > > I've set the o2ib path to /usr/local/ofed/src/openib-1.1 as shown in the > > > lustre's configure line below. Lustre's configure script looks for a > > > driver/infiniband directory which only seems to exist under > > > /usr/local/ofed/src/openib-1.1 > > > > &...
2007 Dec 11
2
lustre + nfs + alphas
...he nfs export on another nfs server and run the same benchmark (bonnie) everything is fine. The lustre mount on the export server can take a real pounding (ive seen it push 300MB/sec) so I don't know why nfs is crashing it. On the nfs export server i see these messages-- Lustre: 4224:0:(o2iblnd_cb.c:412:kiblnd_handle_rx()) PUT_NACK from 192.168.64.70@o2ib LustreError: 4400:0:(client.c:969:ptlrpc_expire_one_request()) @@@ timeout (sent at 1197415542, 100s ago) req@ffff810827bfbc00 x38827/t0 o36->data-MDT0000_UUID@192.168.64.70@o2ib:12 lens 14256/672 ref 1 fl Rpc:/0/0 rc 0/-...
2008 Mar 11
2
Problems mountine lustre thru an ib2ip gateway
Hello, I am trying to mount a lustre filesystem thru an ib2ip gateway. The MDS's have infiniband connections. The client nodes are tcp/ip connections. I am able to route between the client nodes and the MDS's. I have the following in /etc/fstab: abe-mds1@o2ib0,abe-mds2@o2ib0:/home/client /abehome lustre _netdev,flock 0 0 I get the following when trying to mount: [root@t3honest5 lustre]# mount -v /abehome verbose: 1 arg[0] = /sbin/mount.lustre arg[1] = abe-mds1@o2ib0,abe-mds2@o2ib0:/home/client arg[2] = /abehome arg[3] = -v arg[...
2008 Feb 12
0
Lustre-discuss Digest, Vol 25, Issue 17
...the hang is to reboot the server. My users are getting extremely impatient :-/ I see this on the clients- LustreError: 2814:0:(client.c:975:ptlrpc_expire_one_request()) @@@ timeout (sent at 1202756629, 301s ago) req@ffff8100af233600 x1796079/t0 o6->data-OST0000_UUID@192.168.64.71@o2ib:28 lens 336/336 ref 1 fl Rpc:/0/0 rc 0/-22 Lustre: data-OST0000-osc-ffff810139ce4800: Connection to service data-OST0000 via nid 192.168.64.71@o2ib was lost; in progress operations using this service will wait for recovery to complete. LustreError: 11-0: an error occurred while communicati...
2008 Apr 15
5
o2ib module prevents shutdown
Hello, Not sure if this is the right forum: I'm encountering difficulties with o2ib which prevents an LNET shutdown from proceeding: Unloading OpenIB kernel modules:NET: Unregistered protocol family 27 Failed to unload rdma_cm Failed to unload rdma_cm Failed to unload ib_cm Failed to unload ib_sa LustreError: 131-3: Received notification of device removal Please shutdown L...
2009 Apr 21
1
Lustre patchless client with ofed-1.4
I am trying to compile lustre-1.6.6 against ofed-1.4 using following configure options : ./configure --disable-server --with-o2ib=/usr/src/ofa_kernel-1.4 --with-linux=/usr/src/kernels/2.6.18-92.el5-x86_64 I am getting following errors: configure: error: can't compile with OpenIB gen2 headers under /usr/src/ofa_kernel-1.4 Is it that lustre-1.6.6 will only work with ofed-1.3. or I am doing something wrong. System e...
2008 Feb 04
32
Luster clients getting evicted
on our cluster that has been running lustre for about 1 month. I have 1 MDT/MGS and 1 OSS with 2 OSTs. Our cluster uses all GigE and has about 608 nodes 1854 cores. We have a lot of jobs that die, and/or go into high IO wait, strace shows processes stuck in fstat(). The big problem is (i think) I would like some feedback on it that of these 608 nodes 209 of them have in dmesg
2013 Apr 29
1
OSTs inactive on one client (only)
...lrpc_invalidate_import()) Skipped 18 previous similar messages Apr 29 16:21:18 abacus kernel: LustreError: 28707:0:(import.c:350:ptlrpc_invalidate_import()) @@@ still on sending list req@ffff8803b45c6c00 x1430098383471272/t0(0) o101->lustre-OST0003-osc-ffff880331f33400@192.168.100.103@o2ib:28/4 lens 328/352 e 0 to 0 dl 1367194410 ref 1 fl Interpret:RE/0/0 rc -5/0 Apr 29 16:21:18 abacus kernel: LustreError: 28707:0:(import.c:350:ptlrpc_invalidate_import()) Skipped 61 previous similar messages Apr 29 16:21:18 abacus kernel: LustreError: 28707:0:(import.c:366:ptlrpc_invalidate_impor...
2012 Dec 28
6
problem with installing lustre and ofed
...this lustre kernel 4. install the remaining rpms 5. download ofed from mellanox "MLNX_OFED_LINUX-1.5.3-3.1.0-rhel6.3-x86_64.iso" * build mellanox ofed bits using the lustre kernel and kernel-devel info * install mellanox ofed 6. reboot 7. upon reboot, if I do NOT have o2ib3 in my lnet networks parameters, I can modprobe lnet and lustre. 8. if I DO have o2ib3 present in the lnet parameters, running modprobe lustre gets me: ib/modules/2.6.32-279.14.1.el6_lustre.x86_64/updates/kernel/fs/lustre/fld.ko): Input/output error WARNING: Error inserting fid (/lib/modules/2.6...
2008 Feb 26
1
Network problem using 1.6.4.1 and OFED-1.3
...aving problem to bring up the network using lustre 1.6.4.1 (2.6.18-8) with OFED-1.3 (InfiniBand). When I run lctl network up, I'm getting the following: LNET configure error 100: Network is down dmesg shows: LustreError: 21080:0:(api-ni.c:1025:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256 Note that the InfiniBand IPoIB network is working properly (I tried ping <ip addr> between the nodes) Also previous OFED-1.2.5 is working fine with the current lustre. Any clue what could be the problem? TIA, Alberto. -------------- next part -------------- An...
2007 Dec 21
0
FW: faking IB multi-rail with multihomed clients
...8-253].* ...here are some different configurations you could create... A. I've got many more clients than servers in my cluster. I don't care if an individual client can't get 2 rails of bandwidth because the servers are the actual bottleneck... ip2nets="o2ib0(ib0),o2ib1(ib1) 192.168.[0-1].* #all servers;\ o2ib0(ib0) 192.168.[2-253].[0-252/2]#even clients;\ o2ib1(ib1) 192.168.[2-253].[1-253/2]#odd clients" This configuration gives every server 2 NIDs, one on each network - and statically...
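To make the even/odd split in the quoted ip2nets string concrete, here is a minimal Python sketch that matches an IPv4 address against the three patterns from the snippet. It assumes that `[low-high/stride]` means every `stride`-th value starting at `low`, and that the first matching rule wins; the function names and the `RULES` table are hypothetical, and this is not how LNet itself is implemented.

```python
import re

# Rules copied from the quoted ip2nets string: (networks, address pattern).
RULES = [
    ("o2ib0(ib0),o2ib1(ib1)", "192.168.[0-1].*"),        # all servers
    ("o2ib0(ib0)", "192.168.[2-253].[0-252/2]"),         # even clients
    ("o2ib1(ib1)", "192.168.[2-253].[1-253/2]"),         # odd clients
]


def octet_matches(pattern: str, value: int) -> bool:
    """Match one octet against '*', a plain number, '[a-b]' or '[a-b/s]'."""
    if pattern == "*":
        return True
    rng = re.fullmatch(r"\[(\d+)-(\d+)(?:/(\d+))?\]", pattern)
    if rng:
        low, high = int(rng.group(1)), int(rng.group(2))
        stride = int(rng.group(3)) if rng.group(3) else 1
        return low <= value <= high and (value - low) % stride == 0
    return pattern.isdigit() and int(pattern) == value


def ip_matches(pattern: str, ip: str) -> bool:
    """Match a dotted-quad address against a four-part pattern."""
    parts, octets = pattern.split("."), ip.split(".")
    if len(parts) != 4 or len(octets) != 4:
        return False
    return all(octet_matches(p, int(o)) for p, o in zip(parts, octets))


def networks_for(ip: str):
    """Return the networks of the first rule whose pattern matches, else None."""
    for networks, pattern in RULES:
        if ip_matches(pattern, ip):
            return networks
    return None


if __name__ == "__main__":
    for ip in ("192.168.0.7", "192.168.5.10", "192.168.5.11"):
        print(ip, "->", networks_for(ip))
```

Under these assumptions a server address such as 192.168.0.7 gets both networks, while client addresses alternate between o2ib0 and o2ib1 depending on whether the last octet is even or odd.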
2007 Nov 29
2
Balancing I/O Load
We are seeing some disturbing (probably due to our ignorance) behavior from lustre 1.6.3 right now. We have 8 OSSs with 3 OSTs per OSS (24 physical LUNs). We just created a brand new lustre file system across this configuration using the default mkfs.lustre formatting options. We have this file system mounted across 400 clients. At the moment, we have 63 IOzone threads running
2008 Feb 14
9
how do you mount mountconf (i.e. 1.6) lustre on your servers?
As any of you using version 1.6 of Lustre knows, Lustre servers can now be started simply by mounting the devices it is using. Even an /etc/fstab entry can be used if you can have the mount delayed until the network is started. Given this change, you have also noticed that we have eliminated the initscript for Lustre that used to exist for releases prior to 1.6. I'd like to take a
2008 Mar 04
16
Cannot send after transport endpoint shutdown (-108)
This morning I've had both my infiniband and tcp lustre clients hiccup. They are evicted from the server presumably as a result of their high load and consequent timeouts. My question is- why don't the clients re-connect. The infiniband and tcp clients both give the following message when I type "df" - Cannot send after transport endpoint shutdown (-108). I've