Michael Barnes
2011-Feb-24 15:38 UTC
[Lustre-discuss] unable to mount ost after mdt migration and mds/mgs failover configuration
Hello list,

I've migrated our mdt data to a new machine via tar and getfattr, and then did the following:

mds/mgs: rm OBJECTS/* CATALOGS

mds/mgs: tunefs.lustre --writeconf --mgs --mdt --fsname=lustre --erase-param --param mdt.quota_type=ug2 --param mdt.group_upcall=/usr/sbin/l_getgroups --param failover.node=172.17.4.124@o2ib /dev/sdb

mds/mgs: mount -t lustre /dev/sdb /mdt

Everything seems OK up to this point.

I then modified each ost with:

oss: tunefs.lustre --writeconf --erase-param --param mgsnode=172.17.4.123@o2ib:172.17.4.124@o2ib --param ost.quota_type=ug2 /dev/sdb

Then, when I try to mount the ost I get in dmesg:

Lustre: 3269:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request x1361639653769217 sent from MGC172.17.4.123@o2ib to NID 172.17.4.124@o2ib 0s ago has failed due to network error (5s prior to deadline).
  req@ffff810133f24800 x1361639653769217/t0 o250->MGS@MGC172.17.4.123@o2ib_0:26/25 lens 368/584 e 0 to 1 dl 1298560772 ref 1 fl Rpc:N/0/0 rc 0/0
LustreError: 3164:0:(client.c:858:ptlrpc_import_delay_req()) @@@ IMP_INVALID req@ffff810106b39c00 x1361639653769218/t0 o253->MGS@MGC172.17.4.123@o2ib_0:26/25 lens 4736/4928 e 0 to 1 dl 0 ref 1 fl Rpc:/0/0 rc 0/0
LustreError: 3164:0:(obd_mount.c:1097:server_start_targets()) Required registration failed for lustre-OST0037: -108
LustreError: 3164:0:(obd_mount.c:1655:server_fill_super()) Unable to start targets: -108
LustreError: 3164:0:(obd_mount.c:1438:server_put_super()) no obd lustre-OST0037
LustreError: 3164:0:(obd_mount.c:147:server_deregister_mount()) lustre-OST0037 not registered

The first mds/mgsnode has the mdt mounted. lctl ping and regular TCP/ip works over the ib interface. Did I miss something that the ost needs to join the new mds/mgs configuration?

Thanks,

-mb

--
+-----------------------------------------------
| Michael Barnes
|
| Thomas Jefferson National Accelerator Facility
| Scientific Computing Group
| 12000 Jefferson Ave.
| Newport News, VA 23606
| (757) 269-7634
+-----------------------------------------------
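The -108 in the log lines above is a negated Linux errno value. As a minimal sketch for decoding such return codes, assuming standard Linux errno numbering: the function name `lustre_rc_name` is hypothetical (it is not a Lustre tool), and only a handful of common values are listed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): map a negative return code seen in
# a Lustre log line to its Linux errno name. Only a few values are covered.
lustre_rc_name() {
    case "${1#-}" in            # strip the leading minus sign, if any
        2)   echo ENOENT ;;
        5)   echo EIO ;;
        19)  echo ENODEV ;;
        107) echo ENOTCONN ;;
        108) echo ESHUTDOWN ;;
        110) echo ETIMEDOUT ;;
        *)   echo "unknown ($1)" ;;
    esac
}

# Decode the code reported by server_start_targets() in the log.
lustre_rc_name -108
```

Here -108 decodes to ESHUTDOWN ("Cannot send after transport endpoint shutdown"), which is consistent with the IMP_INVALID import to the MGS reported just before the registration failure.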
Michael Barnes
2011-Feb-24 18:36 UTC
[Lustre-discuss] unable to mount ost after mdt migration and mds/mgs failover configuration
Update:

If I do this:

tunefs.lustre --writeconf --erase-param --param mgsnode=172.17.4.123@o2ib --param ost.quota_type=ug2 --fsname=lustre /dev/sdc

i.e., with no failover specification on the mgsnode, the ost mounts.

Do I have the wrong syntax in the line below?

tunefs.lustre --writeconf --erase-param --param mgsnode=172.17.4.123@o2ib:172.17.4.124@o2ib --param ost.quota_type=ug2 --fsname=lustre /dev/sdc

Thanks in advance,

-mb
D. Marc Stearman
2011-Feb-24 20:44 UTC
[Lustre-discuss] unable to mount ost after mdt migration and mds/mgs failover configuration
According to the Lustre Manual, you specify multiple mgsnode parameters on the OSTs, for example:

tunefs.lustre --writeconf --erase-param --param mgsnode=172.17.4.123@o2ib --param mgsnode=172.17.4.124@o2ib --param ost.quota_type=ug2 --fsname=lustre /dev/sdc

should accomplish what you are trying to do. When building the file system using mkfs.lustre, the syntax is slightly different, with multiple --mgsnode= options specified. The clients would then mount using:

mount -t lustre 172.17.4.123@o2ib:172.17.4.124@o2ib:/lustre <mountpoint>

This is in section 4.4.1 of the manual (at least the version I'm looking at - from July 2010).

-Marc

----
D. Marc Stearman
Lustre Operations Lead
marc at llnl.gov
925.423.9670
Pager: 1.888.203.0641
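The colon-separated NID list belongs in the client mount string, while the OST takes one mgsnode parameter per MGS NID. A minimal sketch of that conversion follows, assuming a POSIX shell; the helper name `mgsnode_params` is hypothetical and not part of Lustre, and the script only prints the resulting tunefs.lustre command for review rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre): turn a colon-separated NID list,
# as used in the client mount string, into one "--param mgsnode=" per NID.
# The body runs in a subshell so the IFS change does not leak to the caller.
mgsnode_params() (
    set -f                      # no filename globbing while splitting
    IFS=:
    out=""
    for nid in $1; do           # unquoted on purpose: split on ':'
        out="${out:+$out }--param mgsnode=$nid"
    done
    printf '%s\n' "$out"
)

# Print (do not execute) the command using the NIDs from the thread.
echo "tunefs.lustre --writeconf --erase-param $(mgsnode_params '172.17.4.123@o2ib:172.17.4.124@o2ib') --param ost.quota_type=ug2 --fsname=lustre /dev/sdc"
```

Printing the command first leaves a chance to compare it against the manual for the installed Lustre version before touching the device.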
Nirmal Seenu
2011-Feb-24 21:24 UTC
[Lustre-discuss] unable to mount ost after mdt migration and mds/mgs failover configuration
I use multiple --mgsnode= options in the tunefs command just like the mkfs.lustre commands and they work as well:

tunefs.lustre --erase-params --writeconf --ost --mgsnode=iblustre1@o2ib --mgsnode=iblustre2@o2ib --param ost.quota_type=ug /dev/sdc1

Nirmal
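The same list of MGS NIDs ends up in two places: as repeated --mgsnode= options on each OST and as a colon-joined prefix in the client mount source. A minimal sketch that derives both from one list, assuming a POSIX shell: the helper names `mgsnode_opts` and `client_mount_spec` and the mount point /mnt/lustre are hypothetical, and the commands are only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Lustre). NIDs are passed as separate args.

# One --mgsnode= option per NID, for the tunefs.lustre/mkfs.lustre form.
mgsnode_opts() (
    out=""
    for nid in "$@"; do
        out="${out:+$out }--mgsnode=$nid"
    done
    printf '%s\n' "$out"
)

# Colon-joined NID list plus file system name, for the client mount source.
# Usage: client_mount_spec <fsname> <nid>...
client_mount_spec() (
    fsname=$1
    shift
    out=""
    for nid in "$@"; do
        out="${out:+$out:}$nid"
    done
    printf '%s:/%s\n' "$out" "$fsname"
)

# Print both commands for review; /mnt/lustre is an assumed mount point.
echo "tunefs.lustre --erase-params --writeconf --ost $(mgsnode_opts iblustre1@o2ib iblustre2@o2ib) --param ost.quota_type=ug /dev/sdc1"
echo "mount -t lustre $(client_mount_spec lustre iblustre1@o2ib iblustre2@o2ib) /mnt/lustre"
```

Keeping the NID list in one place makes it harder for the OST configuration and the client mount string to drift apart when a failover MGS is added.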