Michael Kluge
2012-Jan-22 19:04 UTC
[Lustre-discuss] MDS failover: SSD+DRDB or shared 15K-SAS-Storage RAID with approx. 10 disks
Hi,

I have been asked which of the two I would choose for two MDS servers (active/passive): SSDs, maybe two (mirrored) in each server with DRBD for syncing, or a RAID controller with 15K disks. I have not done benchmarks on this topic myself and would like to ask whether anyone has ideas or numbers. The cluster will be pretty small, about 50 clients.

Regards, Michael

--
Dr.-Ing. Michael Kluge
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
D-01062 Dresden, Germany
Contact: Willersbau, Room WIL A 208
Phone: (+49) 351 463-34217
Fax: (+49) 351 463-37773
e-mail: michael.kluge at tu-dresden.de
WWW: http://www.tu-dresden.de/zih
Carlos Thomaz
2012-Jan-22 19:15 UTC
[Lustre-discuss] MDS failover: SSD+DRDB or shared 15K-SAS-Storage RAID with approx. 10 disks
Hi Michael,

In my experience SSDs didn't help much, since the MDS bottleneck is not only the disks but the entire Lustre metadata mechanism.

We've been configuring highly scalable systems, many of them with high IOPS demands, using RAID controllers and fast disks as you described (usually 15K RPM SAS, usually RAID10). It's difficult to size a system without knowing your exact I/O pattern and workload, but if you would like more information about current systems and metadata benchmarks on RAID controllers, sizing and configuration, let us know.

One remark about DRBD: I've seen customers using it, but IMHO an active/standby HA-type configuration would be more reliable and would give you better resilience. Again, I don't know your uptime and reliability needs, but the customers I've worked with who require minimal downtime in production always go for RAID controllers rather than DRBD replication.

Regards,
Carlos.

--
Carlos Thomaz | Systems Architect
Mobile: +1 (303) 519-0578
cthomaz at ddn.com | Skype ID: carlosthomaz
DataDirect Networks, Inc.
9960 Federal Dr., Ste 100 Colorado Springs, CO 80921
ddn.com <http://www.ddn.com/> | Twitter: @ddn_limitless <http://twitter.com/ddn_limitless> | 1.800.TERABYTE
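Since both posts above come back to the lack of metadata benchmark numbers for the actual workload, here is a minimal sketch of how rough create/stat/unlink rates could be measured from a single client. It is an illustration only: the function name `metadata_rates` and the file count are made up for this example, it is single-threaded, and it is no substitute for a proper parallel metadata benchmark run from many clients.

```python
import os
import tempfile
import time


def metadata_rates(directory, n_files=1000):
    """Return create/stat/unlink rates (operations per second) for empty files.

    Hypothetical helper for illustration: one thread, one directory, empty files.
    """
    paths = [os.path.join(directory, f"f{i:06d}") for i in range(n_files)]

    def timed(operation):
        # Apply the operation to every path and convert elapsed time to a rate.
        start = time.perf_counter()
        for path in paths:
            operation(path)
        elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
        return n_files / elapsed

    def create(path):
        # Opening for write and closing immediately creates an empty file.
        with open(path, "w"):
            pass

    return {
        "create": timed(create),
        "stat": timed(os.stat),
        "unlink": timed(os.unlink),
    }


if __name__ == "__main__":
    # A local temporary directory is used here; to test a Lustre file system,
    # pass a directory on the Lustre mount instead.
    with tempfile.TemporaryDirectory() as scratch:
        for operation, rate in metadata_rates(scratch).items():
            print(f"{operation:>6}: {rate:10.0f} ops/s")
```

Run against a directory on the mounted file system, this gives a first impression of single-client metadata rates. Comparing two MDT back ends would require the same test on both, ideally with many clients in parallel.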
Michael Kluge
2012-Jan-22 19:55 UTC
[Lustre-discuss] MDS failover: SSD+DRDB or shared 15K-SAS-Storage RAID with approx. 10 disks
Hi Carlos,

> In my experience SSDs didn't help much, since the MDS bottleneck is not
> only the disks but the entire Lustre metadata mechanism.

Yes, but one does not need much space on the MDS, and four SSDs (as MDT) are way cheaper than a RAID controller with ten 15K disks. So the question is basically how the DRBD latency will influence MDT performance. I know sync/async makes a big difference here, but I have no idea about the performance impact of either mode or how reliability is affected.

> One remark about DRBD: I've seen customers using it, but IMHO an
> active/standby HA-type configuration would be more reliable and would
> give you better resilience. Again, I don't know your uptime and
> reliability needs, but the customers I've worked with who require
> minimal downtime in production always go for RAID controllers rather
> than DRBD replication.

OK, thanks. That is good information. So SSD+DRBD is considered to be the "cheap" solution. Even for small clusters?

Regards, Michael
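The sync/async question above can be framed with a back-of-envelope model. DRBD offers three replication modes: protocol A (asynchronous), protocol B (acknowledged once the peer has received the data) and protocol C (acknowledged once the peer has written it to disk). The sketch below assumes that the local write and the replication to the peer proceed in parallel, and every latency value in it is a hypothetical placeholder, not a measurement. The class and function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Latencies:
    """Hypothetical per-write latencies in microseconds (assumed, not measured)."""
    local_write_us: float   # commit of one write on the local SSD
    remote_write_us: float  # commit of the same write on the peer's SSD
    network_rtt_us: float   # round trip between the two MDS nodes


def replicated_write_latency_us(lat: Latencies, protocol: str) -> float:
    """Estimate when a write is reported complete under DRBD protocol A, B or C.

    Simplified model: local write and replication are assumed to run in parallel.
    """
    if protocol == "A":
        # Asynchronous: complete as soon as the local write is done.
        return lat.local_write_us
    if protocol == "B":
        # Complete once the peer has acknowledged receiving the data.
        return max(lat.local_write_us, lat.network_rtt_us)
    if protocol == "C":
        # Fully synchronous: complete once the peer has also written to disk.
        return max(lat.local_write_us, lat.network_rtt_us + lat.remote_write_us)
    raise ValueError(f"unknown DRBD protocol: {protocol!r}")


def serialized_writes_per_second(lat: Latencies, protocol: str) -> float:
    """Upper bound on strictly serialized writes per second in this model."""
    return 1_000_000 / replicated_write_latency_us(lat, protocol)


if __name__ == "__main__":
    # Placeholder numbers; substitute values measured on the real hardware.
    example = Latencies(local_write_us=100.0, remote_write_us=100.0, network_rtt_us=200.0)
    for protocol in ("A", "B", "C"):
        latency = replicated_write_latency_us(example, protocol)
        rate = serialized_writes_per_second(example, protocol)
        print(f"protocol {protocol}: {latency:6.0f} us per write, {rate:8.0f} writes/s")
```

The model only covers the latency side. On the reliability side, the asynchronous mode can lose the most recent writes if the primary fails before they reach the peer, whereas the fully synchronous mode does not acknowledge a write until both copies are on disk.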
Wojciech Turek
2012-Jan-22 23:44 UTC
[Lustre-discuss] MDS failover: SSD+DRDB or shared 15K-SAS-Storage RAID with approx. 10 disks
Hi Michael,

As Carlos said, SSDs do not improve metadata rates much, due to bottlenecks in the Lustre code itself (at least this is the case in Lustre 1.8). If you have a limited budget, the DRBD solution seems like a good choice. I have been running a production Lustre filesystem that uses a DRBD-mirrored MDT for 4 years now and have never had problems with it. However, if the budget is not tight and you are thinking about adding more filesystems in the future, then you may consider an external RAID disk array, for example with 24 small 2.5" 15K SAS disks. My bigger production Lustre filesystems use that type of storage and it does the job very well.

For information on both configurations and some metadata performance figures, please read through my white papers:

http://i.dell.com/sites/content/shared-content/solutions/en/Documents/lustre-storage-brick-white-paper.pdf
http://i.dell.com/sites/content/business/solutions/hpcc/en/Documents/Lustre-HPC-Whitepaper-10082011.pdf

Best regards,
Wojciech

--
Wojciech Turek
Senior System Architect
High Performance Computing Service
University of Cambridge
Email: wjt27 at cam.ac.uk
Tel: (+)44 1223 763517