Dam Thanh Tung
2009-Oct-07 03:24 UTC
[Lustre-discuss] Lustre-discuss Digest, Vol 45, Issue 6
> Date: Mon, 5 Oct 2009 15:42:25 +0100
> From: pg_lus at lus.for.sabi.co.UK (Peter Grandi)
> Subject: Re: [Lustre-discuss] drbd slow I/O with lustre filesystem
> To: Lustre discussion <lustre-discuss at lists.Lustre.org>
> Message-ID: <19146.1489.532204.6707 at tree.ty.sabi.co.uk>
> Content-Type: text/plain; charset=us-ascii
>
> RAID5 over RAID1? Nahh. Consider http://WWW.BAARF.com/ and that
> the storage system of a Lustre pool over DRBD is ideally suited to
> RAID10 (with each pair a DRBD resource). RAID5 may be contributing
> to your speed problem below, either by itself or because it is being
> rebuilt/syncing.

Poor me, I didn't know that before, so now we can't change anything on
my RAID partition :(

> > After formatting them with lustre format (using mkfs.lustre),
> > I start to copy data to my drbd devices, but:
> >
> > - Its I/O wait, when I monitor it with top or iostat, is too high,
> >   about 25%
>
> This is not much related to anything... After all you are doing a
> lot of IO, and jumping around on the disk, doing a restore.

Could you please tell me in more detail what you mean? I don't really
understand it.

> > - The copy speed from my web client to our OST using drbd
> >   devices is too low, only about 13MB/s, although client and OST
> >   are on the same 1Gb Ethernet LAN.
>
> Too few details about this. Things to check:
>
> * Raw network speed: I like 'nuttcp' to check it. Using the
>   usual tricks (larger send/receive buffers, jumbo frames, ...) may
>   help if there are issues. But then you were getting 70MB/s above.
>   http://lists.centos.org/pipermail/centos/2009-July/079505.html
>
> * If you are using LVM2, bad news.
>   http://archives.free.net.ph/message/20070815.091608.fff62ba9.en.html
>
> * Using RAID5, as argued above, may be detrimental.
>
> * The DRBD must be configured to allow higher sync speeds:
>   http://www.ossramblings.com/drbd_defaults_too_slow
>   http://www.linux-ha.org/DRBD/FAQ#head-e09d2c15ba7ff691ecd5d5d7b848a50d25a3c3eb
>   Your initial sync however seemed to run at 70MB/s, so I wonder.
>   Maybe tune the "unplug" watermark in DRBD or, if you have battery
>   backup, enable no-flush mode.
>   http://archives.free.net.ph/message/20081219.085301.997727d2.en.html
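For the raw network check, a minimal nuttcp run between the two nodes
might look like the sketch below (the host name oss1, the interface
eth0 and the buffer sizes are only placeholders for illustration):

    # on one node, start nuttcp in server mode
    nuttcp -S

    # on the other node, measure throughput in both directions for 10s
    nuttcp -T 10 oss1          # transmit towards oss1
    nuttcp -r -T 10 oss1       # receive from oss1

    # the "usual tricks": larger socket buffers and jumbo frames
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    ifconfig eth0 mtu 9000     # only if every switch port on the path supports it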
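And for the sync-speed and flush settings mentioned above, a rough
drbd.conf fragment for DRBD 8.x might look like this (the resource
name r0 and all the values are placeholders, not recommendations;
no-disk-flushes/no-md-flushes should only be used with a
battery-backed write cache):

    resource r0 {
      syncer {
        rate 100M;               # allow a faster resync than the low default
      }
      net {
        unplug-watermark 16;     # push queued requests to the backing device sooner
        max-buffers     8000;    # more in-flight buffers for streaming writes
        max-epoch-size  8000;
      }
      disk {
        no-disk-flushes;         # only safe with battery-backed write cache
        no-md-flushes;
      }
    }

After editing, 'drbdadm adjust r0' applies the new settings to the
running resource.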
> > When I tried using one OST without drbd, it worked quite well.
>
> It might mean that it is mainly a DRBD issue. You might want to
> get the latest DRBD versions, as some earlier versions had
> performance problems. If you have RHEL, ELRepo has got fairly
> recent ones.

> > So, could anyone please tell me where the problem is? In our
> > drbd devices, or because of lustre? Is there anyone who has the
> > same problem as me? :(
>
> All of the above probably -- max performance here means ensuring
> that write requests are issued as fast as possible and back-to-back
> packets/blocks are then possible both on the network and on the
> storage system...
>
> http://www.gossamer-threads.com/lists/drbd/users/17991
> http://lists.linbit.com/pipermail/drbd-user/2007-August/007256.html
> http://lists.linbit.com/pipermail/drbd-user/2009-January/011165.html
> http://lists.linbit.com/pipermail/drbd-user/2009-January/011198.html

That is really great information. I have checked it and will consider
using some of it (e.g. DRBD options like no-disk-flushes and
no-md-flushes; they may be useful for speed tuning, but I am not sure
they won't affect my system's stability). Anyway, many thanks for all
of it :)

> It may conceivably be quicker for you to load all your data first
> on the primary storage half of the pair, and then reactivate the
> secondary and let it resync.

I tried that approach, but the speed increase is not remarkable, about
5-7MB/s.

> My impression is that a problem is unlikely to originate in the
> Lustre side, but more on the underlying layers mentioned above.
> There is a fair bit of material on DRBD optimization, both on its
> site and, more specifically, around the MySQL community, where it
> is very commonly used and they care a lot about performance.

That is also what I guessed, so I posted my questions to both the
lustre and drbd mailing lists and, luckily, I received some useful
information and tips. Again, many thanks for your detailed answer. I
really appreciate it :)
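P.S. For completeness, the "load on the primary first, then let the
secondary resync" approach mentioned above would, with DRBD 8.x, look
roughly like this (the resource name r0 is a placeholder):

    # on the secondary node: drop the replication link while loading data
    drbdadm disconnect r0

    # ... copy/restore all the data on the primary node ...

    # reconnect; DRBD resynchronises the secondary in the background
    drbdadm connect r0

    # watch the resync progress
    cat /proc/drbd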
Andreas Dilger
2009-Oct-07 18:19 UTC
[Lustre-discuss] Lustre-discuss Digest, Vol 45, Issue 6
On Oct 06, 2009  20:24 -0700, Dam Thanh Tung wrote:
> > RAID5 over RAID1? Nahh. Consider http://WWW.BAARF.com/ and that
> > the storage system of a Lustre pool over DRBD is ideally suited to
> > RAID10 (with each pair a DRBD resource). RAID5 may be contributing
> > to your speed problem below, either by itself or because it is
> > being rebuilt/syncing.
>
> Poor me, I didn't know that before, so now we can't change anything
> on my RAID partition :(

It is documented in the Lustre manual that the MDS should be running
on RAID-1 or RAID-1+0. I would suggest shutting down your MDS, making
sure your remote DRBD copy is up-to-date, then reformatting the local
storage as RAID-1+0, copying the remote DRBD mirror back to the local
system, and then reformatting the remote DRBD storage as RAID-1+0 as
well and copying the data back to it.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
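A very rough shell outline of that reformat-and-resync procedure,
assuming Linux md RAID and a hypothetical DRBD resource named "mdt"
(device names, partition layout and RAID geometry are placeholders and
must be adapted), might look like:

    umount /mnt/mdt                  # stop the MDS on this node
    cat /proc/drbd                   # confirm the remote copy is UpToDate first
    drbdadm down mdt                 # take the local DRBD resource down
    mdadm --stop /dev/md0            # stop the old RAID-5 array
    mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sd[abcd]1
    drbdadm create-md mdt            # recreate DRBD metadata on the new array
    drbdadm up mdt
    drbdadm invalidate mdt           # full resync from the intact remote copy
    cat /proc/drbd                   # wait until the resync finishes
    # then repeat the same steps on the remote node, which will in turn
    # resync from this (now RAID-1+0) copy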