Hi all,
I'm having some trouble migrating from Lustre v1.4 to v1.6.
I have the following architecture:
- EMC CLARiiON CX300 storage
- 2 machines connected through fibre to the storage
In order to take advantage of the 2 fibres, we thought that we could assign 2
OSTs, one on each machine, for the same device, and Lustre would take care of
the locking.
In version 1.4 we couldn't do that, because it claimed there was already one
NID associated with the device. I don't know whether this scenario is possible
or not, but it would be the preferred one.
As we couldn't make that configuration work, we created 2 devices, one for
each machine.
- Each machine would have its own OST
In version 1.4 we have this working, but in version 1.6 there are some problems.
We managed to set up the filesystem with the 2 OSTs (500G each) and do a client
mount on both machines.
X.X.X.X@tcp:/homefs
1.2T 946M 1.1T 1% /local/home
But when trying to mount from a client that is on another network, it simply
doesn't work.
We managed to find out that it now needs one additional port (998). After
allowing it in the firewall rules we can execute the mount command, but it gets
stuck with the following errors when doing a "df -h":
Jul 6 09:42:05 kernel: LustreError:
14525:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout (sent at
1183711320, 5s ago) req@c31a8600 x7941/t0
o8->homefs-OST0001_UUID@193.137.51.23@tcp:6 lens 240/272 ref 1 fl Rpc:/0/0 rc
0/-22
Jul 6 09:42:05 kernel: LustreError:
14525:0:(client.c:950:ptlrpc_expire_one_request()) Skipped 2 previous similar
messages
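To make it easier to see which target the client is timing out on, here is a
minimal sketch that pulls the target name and NID out of the error above. The
regular expression is my own guess at the layout of the "o8->...@...:6" field,
based only on this one log entry, and the function name is made up for
illustration; it is not an official Lustre log format definition.

```python
import re

# Log text copied from the error above; it was wrapped across several lines.
LOG = """\
Jul 6 09:42:05 kernel: LustreError:
14525:0:(client.c:950:ptlrpc_expire_one_request()) @@@ timeout (sent at
1183711320, 5s ago) req@c31a8600 x7941/t0
o8->homefs-OST0001_UUID@193.137.51.23@tcp:6 lens 240/272 ref 1 fl Rpc:/0/0 rc
0/-22
"""

# Assumed layout: "o<number>-><target>@<nid>:<number>".
TARGET_RE = re.compile(r"o\d+->(?P<target>[^@\s]+)@(?P<nid>\S+?):\d+")


def timed_out_targets(log_text):
    """Return (target, nid) pairs found in the (possibly wrapped) log text."""
    # Undo the line wrapping so each request description is on one line.
    joined = " ".join(log_text.split())
    return [(m.group("target"), m.group("nid"))
            for m in TARGET_RE.finditer(joined)]


print(timed_out_targets(LOG))
```

For the entry above this reports homefs-OST0001_UUID at 193.137.51.23@tcp, so
the timeouts concern the second OST only.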
If we unmount the volume, it can show half the size, which makes me believe
that it can reach one of the OSTs.
Do I need to open a set of ports for each OST in order for this to work? Are
there any other ports besides 998?
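For anyone who wants to reproduce the check, below is a rough sketch of how the
port could be tested against each OST machine from the remote client. The
function name check_ports is made up for illustration, and the host addresses
and port list are whatever you pass in. It only tells whether a TCP connection
is accepted, not whether Lustre itself is happy.

```python
import socket


def check_ports(hosts, ports, timeout=3.0):
    """Try a TCP connect to every (host, port) pair.

    Returns a dict mapping (host, port) to True if the connection was
    accepted, or False if it was refused, filtered, or timed out.
    """
    results = {}
    for host in hosts:
        for port in ports:
            try:
                # create_connection raises OSError on refusal or timeout.
                with socket.create_connection((host, port), timeout=timeout):
                    results[(host, port)] = True
            except OSError:
                results[(host, port)] = False
    return results
```

It would be called with the addresses of the two OST machines and the port in
question, e.g. check_ports([first_oss, second_oss], [998]). If one machine
answers and the other does not, the firewall rule probably covers only one of
them.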
Any comments?
Best Regards
--
Rui Ramos
=============================================
Universidade do Porto - IRICUP
Praça Gomes Teixeira, 4099-002 Porto, Portugal
email: rramos[at]reit.up.pt
phone: +351 22 040 8164
==============================================