Nick Jennings
2009-Dec-30 02:33 UTC
[Lustre-discuss] MD1000 woes and OSS migration suggestions
Hi Everyone,

We've been using an MD1000 as our storage array for close to a year now,
just hooked up to one OSS (LVM+ldiskfs). I recently ordered 2 more servers,
one to be hooked up to the MD1000 to help distribute the load, the other to
act as a Lustre client (web node).

The hosting company informs me that the MD1000 was never set up to operate
in split mode (which I asked for in the beginning), so basically only one
server can be connected to it.

I am now faced with a tough call: we can't bring the filesystem down for any
extended period of time (a few minutes is OK, though 0 downtime would be
perfect!) and I'm not sure how to proceed in a way that would cause the
least amount of headache.

The only thing I can think of is to set up a second MD1000 (configured for
split mode), connect it to OSS2 (the new one which is not yet being used),
add it to the Lustre filesystem, and then somehow migrate the data from OSS1
(old MD1000) to OSS2 (new MD1000)... then bring OSS1 offline, connect it to
the second partition of the new MD1000 and bring that end online once more.

I've never done anything like this and am not entirely sure if this is the
best method. Any suggestions, alternatives, docs or things to look out for
would be greatly appreciated.

Thanks,
Nick

--
Nick Jennings
Director of Technology
Creative Motion Design
www.creativemotiondesign.com
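For context on the "add it to the Lustre filesystem" step: a new OST is
added by formatting the device on the new OSS and mounting it, at which
point the MGS registers it. A minimal sketch, assuming a filesystem named
lustre1, an MGS at mds1@tcp0 and /dev/sdb as the new device (all
placeholder values, not taken from Nick's setup):

  # on OSS2: format the new device as an additional OST of the existing filesystem
  mkfs.lustre --ost --fsname=lustre1 --mgsnode=mds1@tcp0 /dev/sdb

  # mount it as a Lustre target; clients will start striping new files onto it
  mount -t lustre /dev/sdb /mnt/lustre/ost_new

Note that existing file data stays on the old OSTs; only new objects land on
the new OST, which is why a separate migration step is still needed.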
Andreas Dilger
2009-Dec-30 22:44 UTC
[Lustre-discuss] MD1000 woes and OSS migration suggestions
On 2009-12-29, at 19:33, Nick Jennings wrote:
> Hi Everyone,

Hi Nick,

> The only thing I can think of is to set up a second MD1000 (configured
> for split mode), connect it to OSS2 (the new one which is not yet being
> used), add it to the Lustre filesystem, and then somehow migrate the data
> from OSS1 (old MD1000) to OSS2 (new MD1000)... then bring OSS1 offline,
> connect it to the second partition of the new MD1000 and bring that end
> online once more.

There is a section in the manual about "manual data migration", which
should let you move the data. Note that this is not 100% transparent to
applications, but it is safe if you know either that files are unused, or
only open for reading.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
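As a rough illustration of what that manual procedure looks like in
practice (a sketch from memory, not quoted from the manual; the device
number, OST UUID, fsname and mount point below are placeholders): the old
OST is deactivated on the MDS so no new objects are allocated on it, then
the files that have objects there are rewritten from a client so their data
moves to the remaining OSTs.

  # on the MDS: find the OSC device number for the OST being drained, then deactivate it
  lctl dl | grep osc
  lctl --device <devno> deactivate

  # on a client: list files with objects on that OST, then copy-and-rename each one
  # so its objects are re-created on the still-active OSTs
  lfs find --obd lustre1-OST0000_UUID /mnt/lustre1 > /tmp/ost0000.files
  while read f; do
      cp -a "$f" "$f.tmp" && mv "$f.tmp" "$f"
  done < /tmp/ost0000.files

The copy-and-rename is where the "not 100% transparent" caveat comes in: a
file being written to while it is rewritten would race with the copy.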
Aaron Knister
2009-Dec-31 00:34 UTC
[Lustre-discuss] MD1000 woes and OSS migration suggestions
Andreas,

Out of sheer curiosity, I was wondering when the "lfs migrate" feature
mentioned on the lustre arch wiki might show up :) I can't wait to add it
to my lustre utility belt.

-Aaron

On Dec 30, 2009, at 5:44 PM, Andreas Dilger wrote:
> There is a section in the manual about "manual data migration", which
> should let you move the data. Note that this is not 100% transparent
> to applications, but it is safe if you know either that files are
> unused, or only open for reading.
Wojciech Turek
2009-Dec-31 01:55 UTC
[Lustre-discuss] MD1000 woes and OSS migration suggestions
Hi Nick,

I don't think you should invest in a new MD1000 brick just to make it work
in split mode. Split mode doesn't give you much, except that your 15 disks
will be split between two servers; each server won't be able to see the
other half of the MD1000 storage. This is not that great because you don't
get extra redundancy or failover functionality in Lustre.

I think the best approach here would be to buy an MD3000 RAID array
enclosure (which is basically an MD1000 plus two built-in RAID controller
modules). It costs around £1.5k more than an MD1000 but it is definitely
worth it. The MD3000 allows you to connect up to two servers with fully
redundant data paths from each server to any virtual disk configured on the
MD3000 controller; if you follow the link below you will find the cabling
diagram, Figure 2-9, "Cabling Two Hosts (with Dual HBAs) Using Redundant
Data Paths". You can also connect a maximum of four servers with
non-redundant data paths, Figure 2-6, "Cabling Up to Four Hosts with
Nonredundant Data Paths":
http://support.dell.com/support/edocs/systems/md3000/en/2ndGen/HOM/HTML/operate.htm
In addition, you can hook up two extra MD1000 enclosures to a single MD3000
array and they will be managed by the MD3000 RAID controllers, which will
make your life much easier.

In order to migrate your data from the Lustre file system 'lustre1' located
on OSS1, I suggest setting up a brand new Lustre file system 'lustre2' on
OSS2 connected to the MD3000 enclosure, and then, using your third server
acting as a Lustre client, mounting both file systems and copying the data
from lustre1 to lustre2. At some point you will need to make lustre1
quiescent so no new writes are done to it; you can do that by deactivating
all lustre1 OSTs on the MDS, and then you can make a final rsync between
lustre1 and lustre2. Once this is done you can umount lustre1 and lustre2
and then mount lustre2 back under the lustre1 mount point (see the sketch
below).

Once you have your production Lustre filesystem working on lustre2 you can
disconnect the MD1000 from OSS1 and connect it to the MD3000 expansion
ports. You can also connect OSS1 to the MD3000 controller ports. This way
you get extra space from the added MD1000, which you can use to configure
new OSTs and add them to the lustre2 file system. Since both OSS1 and OSS2
can see each other's OSTs (thanks to the MD3000) you can configure Lustre
failover on these servers. If you need more capacity in the future you can
just connect a second MD1000 to your MD3000 controller.

In my cluster I have six (MD3000 + MD1000 + MD1000) triplets configured as
a single large Lustre file system, which provides around 180TB of RAID6
usable space and works pretty well, providing very good aggregated
bandwidth.

If you have more questions don't hesitate to drop me an email. I have a bit
of experience (bad and good) with this Dell hardware and I am happy to
help.

Best regards,

Wojciech
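A minimal sketch of the cutover Wojciech describes, with assumed mount
points, filesystem names and MGS NIDs (mds1@tcp0 for lustre1, mds2@tcp0 for
lustre2) used purely for illustration:

  # on the third server (Lustre client): mount both filesystems
  mount -t lustre mds1@tcp0:/lustre1 /mnt/lustre1
  mount -t lustre mds2@tcp0:/lustre2 /mnt/lustre2

  # bulk copy while lustre1 is still in production
  rsync -av /mnt/lustre1/ /mnt/lustre2/

  # on the lustre1 MDS: deactivate each lustre1 OST so no new writes land there
  lctl dl | grep osc
  lctl --device <devno> deactivate    # repeat for every lustre1 OST

  # back on the client: final catch-up pass, then swap the mount point
  rsync -av --delete /mnt/lustre1/ /mnt/lustre2/
  umount /mnt/lustre1
  umount /mnt/lustre2
  mount -t lustre mds2@tcp0:/lustre2 /mnt/lustre1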
2009/12/30 Nick Jennings <nick at creativemotiondesign.com>:
> [original message quoted in full; trimmed]

--
Wojciech Turek

Assistant System Manager

High Performance Computing Service
University of Cambridge
Email: wjt27 at cam.ac.uk
Tel: (+)44 1223 763517
Andreas Dilger
2009-Dec-31 21:43 UTC
[Lustre-discuss] MD1000 woes and OSS migration suggestions
On 2009-12-30, at 17:34, Aaron Knister wrote:
> Out of sheer curiosity, I was wondering when the "lfs migrate"
> feature mentioned on the lustre arch wiki might show up :)
> I can't wait to add it to my lustre utility belt.

The plumbing for this feature is being implemented as part of the HSM
project. The HSM functionality will be available in the 2.1 release (end
2010/early 2011), and online migration can be completed after that time.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Does HP offer something similar to what you are saying, Wojciech? It sounds
very impressive.

On Wed, Dec 30, 2009 at 8:55 PM, Wojciech Turek <wjt27 at cam.ac.uk> wrote:
> [quoted message trimmed]