I tried to move my MDS from one filesystem on the same machine to another, using the procedure outlined in the Lustre manuals (I didn't use dd, since the underlying disks weren't the same size and I did not think it was required).

Specifically, I used rsync to copy the files, and also used getfattr/setfattr to copy over the extended attributes. Some brief poking around seemed to show that the EA information made it into the new filesystem.

However, when I went to mount the "new" MDS partition, it failed with the following error:

May 30 23:36:50 mds-foo kernel: [ 186.604083] LustreError: 3082:0:(md_local_object.c:433:llo_local_objects_setup()) creating obj [fld] fid = [0x200000001:0x3:0x0] rc = -116
May 30 23:36:50 mds-foo kernel: [ 186.698205] LustreError: 3082:0:(mdt_handler.c:4576:mdt_init0()) Can't init device stack, rc -116
May 30 23:36:50 mds-foo kernel: [ 186.797206] LustreError: 3082:0:(obd_config.c:522:class_setup()) setup foo-MDT0000 failed (-116)
May 30 23:36:50 mds-foo kernel: [ 186.806140] LustreError: 3082:0:(obd_config.c:1363:class_config_llog_handler()) Err -116 on cfg command:
May 30 23:36:50 mds-foo kernel: [ 186.815615] Lustre: cmd=cf003 0:foo-MDT0000 1:foo-MDT0000_UUID 2:0 3:foo-MDT0000-mdtlov 4:f

There were more errors, but they all pretty much cascaded from these. I switched back to the original filesystem and everything worked.

I am willing to believe I did something wrong, but I'm not sure what; I did everything the directions said to do. -116 is ESTALE, and I found the place in the code where I believe that error was returned, but it was a little unclear to me what the root cause was. Can anyone offer any advice?

--Ken
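For readers who want to decode log lines like the ones above, here is a minimal sketch in Python. It is not part of Lustre; the regular expressions are assumptions based only on the excerpt in this message, and the helper names `parse_lustre_error` and `describe_rc` are hypothetical. It pulls out the source file, line, and function from a LustreError line and maps the negative return code to its errno name (on Linux, 116 is ESTALE).

```python
import errno
import os
import re

# Assumed layout, taken from the log excerpt in this thread:
#   LustreError: <pid>:<n>:(<file>:<line>:<function>()) <message>
LUSTRE_ERROR = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) (?P<msg>.*)"
)
# Return codes appear as "rc = -116", "rc -116", "failed (-116)" or "Err -116".
RETURN_CODE = re.compile(r"(?:\brc\s*=?\s*|failed \(|Err )(-\d+)")


def describe_rc(rc):
    """Map a negative kernel return code to (errno name, description)."""
    code = abs(rc)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def parse_lustre_error(line):
    """Return a dict with pid, file, line, func, msg and rc, or None."""
    match = LUSTRE_ERROR.search(line)
    if match is None:
        return None
    info = match.groupdict()
    rc = RETURN_CODE.search(info["msg"])
    info["rc"] = int(rc.group(1)) if rc else None
    return info


if __name__ == "__main__":
    sample = (
        "May 30 23:36:50 mds-foo kernel: [ 186.698205] LustreError: "
        "3082:0:(mdt_handler.c:4576:mdt_init0()) "
        "Can't init device stack, rc -116"
    )
    info = parse_lustre_error(sample)
    name, text = describe_rc(info["rc"])
    print(info["file"], info["line"], info["func"], info["rc"], name, text)
```

Errno numbers are platform-specific, so the lookup should be run on the same kind of system that produced the log.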
Colin Faber
2013-Jun-04 17:56 UTC
Re: [HPDD-discuss] Unable to move MDS using procedure in the manual
Hi Ken,

Which version?

-cf
Kalpak Shah
2013-Jun-04 17:57 UTC
[HPDD-discuss] Unable to move MDS using procedure in the manual
Which version of Lustre is this? File based backup / restore does not work in 2.x. OI scrub which rebuilds the object index is available from Lustre 2.3 onwards. So file based backup / restore will work from 2.3 onwards.

Regards,
Kalpak

--
CEO | Clogeny Technologies | http://www.clogeny.com
Dilger, Andreas
2013-Jun-04 18:02 UTC
Re: [HPDD-discuss] Unable to move MDS using procedure in the manual
On 2013/04/06 11:44 AM, "Ken Hornstein" wrote:
>I tried to move my MDS from one filesystem on the same machine to another,
>using the procedure outlined in the Lustre manuals (I didn't use dd, since
>the underlying disks weren't the same size and also I did not think
>it was required).

What version of Lustre is this?

>Specifically, I used rsync to copy the files, and also used getfattr/setfattr
>to copy over the extended attributes. Some brief poking around seemed to
>show that the EA information made it into the new filesystem.

File-level backups are not supported with Lustre 2.1 and 2.2. You either need to use "dd" for backup/restore, or use Lustre 2.3.0 or later with the "LFSCK OI Scrub" functionality. Lustre 2.4.0 is preferred (especially if this is for testing purposes) since it can do proper rebuilding of the FID-in-dirent and LinkEA attributes on the MDT.

Cheers, Andreas
--
Andreas Dilger
Lustre Software Architect
Intel High Performance Data Division
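The version rules stated in the replies so far can be summarized as a small lookup. The following is only a sketch of what the posters said in this thread, not an official support matrix; the function names `parse_version` and `mdt_backup_advice` are hypothetical, and releases before 2.0 are treated as not covered because nobody in the thread addresses them.

```python
def parse_version(text):
    """Turn a dotted version such as '2.1.2' into a tuple of ints.

    Assumption: purely numeric components; suffixes are not handled.
    """
    return tuple(int(part) for part in text.strip().split("."))


def mdt_backup_advice(version):
    """Summarize the thread's advice on MDT backup for a Lustre version."""
    v = parse_version(version)
    if v < (2, 0):
        # Not discussed in this thread.
        return "not covered by this thread; check the manual for that release"
    if v < (2, 3):
        # Per the replies: file-level backup/restore does not work here.
        return "device-level (dd) backup/restore only"
    if v < (2, 4):
        # OI Scrub, available from 2.3, rebuilds the object index.
        return "file-level backup possible; OI Scrub rebuilds the object index"
    # 2.4.0 also rebuilds FID-in-dirent and LinkEA attributes on the MDT.
    return "file-level backup possible; preferred release per the thread"


if __name__ == "__main__":
    for release in ("2.1.2", "2.3.0", "2.4.0"):
        print(release, "->", mdt_backup_advice(release))
```

Tuples compare element by element, so `(2, 1, 2) < (2, 3)` holds while `(2, 3, 0) < (2, 3)` does not, which is what places 2.3.0 in the OI Scrub branch.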
Ken Hornstein
2013-Jun-04 18:03 UTC
Re: [HPDD-discuss] Unable to move MDS using procedure in the manual
>Which version?

Whoops, can you believe I forgot that? It's 2.1.2.

--Ken
Ken Hornstein
2013-Jun-04 18:10 UTC
Re: [HPDD-discuss] Unable to move MDS using procedure in the manual
>Which version of Lustre is this? File based backup / restore does not work
>in 2.x. OI scrub which rebuilds the object index is available from Lustre
>2.3 onwards. So file based backup / restore will work from 2.3 onwards.

Well, crud. I guess that's what Colin was going to tell me, and I see Andreas said the same thing.

So, this leads to a follow-up question: _where_ is the latest and greatest Lustre manual? I used the one labelled 2.0 here:

http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html

Which doesn't actually mention that you can't do a file-level backup on the MDT. Some poking around led me to the Whamcloud one, which actually does say that.

Perhaps an upgrade to 2.4 is in order (which we were interested in doing anyway).

--Ken
Christopher J. Morrone
2013-Jun-04 18:17 UTC
Re: [HPDD-discuss] Unable to move MDS using procedure in the manual
On 06/04/2013 11:10 AM, Ken Hornstein wrote:
> So, this leads to a follow-up question: _where_ is latest and greatest
> Lustre manual? I used the one labelled 2.0 here:
>
> http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html

The portal for the most up-to-date information is now lustre.opensfs.org. The "Documentation" tab at the top has links to the latest manual.

I do see a note in that manual about OI Scrub being added in 2.3. I only browsed, so I'm not sure if the information is _enough_ for your needs. But if it isn't, you can open an LUDOC bug to let folks know that it needs to be expanded.

Chris
Colin Faber
2013-Jun-04 18:21 UTC
Re: [HPDD-discuss] Unable to move MDS using procedure in the manual
You could try this:

http://xyratex.prod.acquia-sites.com/sites/default/files/Migration_WC2.1_Patches_1-0.tar.gz

Shadow wrote this tool. It allows you to correct the OI database, though it hasn't been updated in a while, so I'm not sure how cleanly it will patch.

-cf
Dilger, Andreas
2013-Jun-04 20:17 UTC
Re: [HPDD-discuss] Unable to move MDS using procedure in the manual
On 2013/04/06 12:21 PM, "Colin Faber" wrote:
>You could try this
>http://xyratex.prod.acquia-sites.com/sites/default/files/Migration_WC2.1_Patches_1-0.tar.gz
>
>Shadow wrote this tool, it allows you to correct OI database, though it's
>not been updated in a while so I'm not sure how cleanly it will patch.

If you are going to patch 2.1.x, then there are patches in Gerrit which backport the online LFSCK OI Scrub:

http://review.whamcloud.com/#q,message:LU-957,n,v

Upgrading to 2.4 is probably preferred, but depends on your system.

Cheers, Andreas
--
Andreas Dilger
Lustre Software Architect
Intel High Performance Data Division
Jones, Peter A
2013-Jun-05 03:10 UTC
Re: [HPDD-discuss] Unable to move MDS using procedure in the manual
I would caution against using those backports, as we never got around to testing them. So, if you do decide to go ahead anyway, then please let us know how you fare...

On 6/4/13 1:17 PM, "Dilger, Andreas" wrote:
>If you are going to patch 2.1.x, then there are patches in Gerrit which
>backport the online LFSCK OI Scrub:
>http://review.whamcloud.com/#q,message:LU-957,n,v
>
>Upgrading to 2.4 is probably preferred, but depends on your system