Hi all,

we are planning a new infrastructure based on Gluster, to be used by some mail
servers and some web servers.

We plan 4 servers, each with 6x 2TB SATA disks in hardware RAID-5. In a
replicate-distribute volume we will have 20TB of available space.

What do you suggest: a single XFS volume, splitting web storage and mail
storage by directory, or two different mount points backed by different
replicate-distribute volumes?

Is there any performance degradation in creating 2 or more volumes instead of
a single one?
On Wed, Apr 18, 2012 at 09:54:58PM +0200, Gandalf Corvotempesta wrote:
> we are planning a new infrastructure based on Gluster, to be used by
> some mail servers and some web servers.
>
> We plan 4 servers, each with 6x 2TB SATA disks in hardware RAID-5.
>
> In a replicate-distribute volume we will have 20TB of available space.
>
> What do you suggest: a single XFS volume, splitting web storage and mail
> storage by directory, or two different mount points backed by different
> replicate-distribute volumes?
>
> Is there any performance degradation in creating 2 or more volumes instead
> of a single one?

Much less than the performance degradation you'll get from using RAID-5!

If you use 6 x 3TB disks in RAID10 you'll have almost the same amount of
storage space, but far better performance, at almost the same price. I'd say
this is very important for a mail server where you are constantly writing
and deleting small files.

RAID10 with a large stripe size should localise file I/O from one user to
one disk (read) or two disks (write), so that there is spare seek capacity
on the other 4 disks for concurrent accesses from other users.

As for partitioning: note that you can create separate bricks on the same
disk. E.g. if the RAID arrays are mounted on server1:/data, server2:/data
etc., then you can create your web volume out of

  server1:/data/web server2:/data/web ...etc

and your mail volume out of

  server1:/data/mail server2:/data/mail ...etc

This doesn't gain you any performance, but gives you a little more
flexibility and manageability (e.g. you can more easily look at I/O usage
patterns separately for the two volumes, or migrate your web volume onto
other bricks without moving your mail volume).

Regards,

Brian.
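A minimal sketch of the two-volume layout Brian describes, assuming a 3.1+
release with the gluster CLI and that the RAID arrays are already mounted at
/data on each of the four servers; the volume names and paths are only
illustrative:

  # create the brick directories on every server
  mkdir -p /data/web /data/mail

  # web volume: replica pairs (server1,server2) and (server3,server4),
  # with files distributed across the two pairs
  gluster volume create webvol replica 2 \
      server1:/data/web server2:/data/web \
      server3:/data/web server4:/data/web
  gluster volume start webvol

  # mail volume on the same disks, in its own subdirectory
  gluster volume create mailvol replica 2 \
      server1:/data/mail server2:/data/mail \
      server3:/data/mail server4:/data/mail
  gluster volume start mailvol

Clients would then mount each volume separately, e.g.
"mount -t glusterfs server1:/webvol /var/www", which is what makes
per-volume monitoring and brick migration easier than a directory split
inside a single volume.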
That is correct. I have 2 servers set up that way and it works well.

Larry Bates
vitalEsafe, Inc.

Sent from my iPhone, please forgive my typos ;-)

On Sat, 28 Apr 2012 at 23:25 +0200, Gandalf Corvotempesta wrote:

> 2012/4/27 Gandalf Corvotempesta <gandalf.corvotempesta at gmail.com>
>>
>> So, what do you suggest? A simple RAID10?
>>
>> I have servers with 8 SATA disks; what do you suggest to 'merge' these
>> disks into a bigger volume?
>>
>> I think that having "du" output wrong is not a good solution.
>>
>> I'm also considering no RAID at all.
>
> For example, with 2 servers and 8 SATA disks each, I can create a single
> XFS filesystem on every disk and then create a replicated pair of bricks
> for each.
>
> For example:
>
> server1:brick1 => server2:brick1
> server1:brick2 => server2:brick2
>
> and so on.
> After that, I can use these bricks to create a distributed volume.
> In case of a disk failure, I have to heal only one disk at a time and not
> the whole volume, right?
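A minimal sketch of the per-disk layout described above, assuming a 3.1+
release with the gluster CLI, two servers, and that each of the 8 disks is
formatted XFS and mounted under /bricks/diskN (device names, paths and the
volume name are illustrative):

  # on server1 and on server2: one XFS filesystem per disk, e.g.
  mkfs.xfs /dev/sdb
  mkdir -p /bricks/disk1
  mount /dev/sdb /bricks/disk1
  # ...repeat for disk2 .. disk8

  # one distributed-replicated volume: adjacent bricks form replica pairs
  # across the two servers, and files are distributed over the 8 pairs
  gluster volume create bigvol replica 2 \
      server1:/bricks/disk1 server2:/bricks/disk1 \
      server1:/bricks/disk2 server2:/bricks/disk2 \
      server1:/bricks/disk3 server2:/bricks/disk3 \
      server1:/bricks/disk4 server2:/bricks/disk4 \
      server1:/bricks/disk5 server2:/bricks/disk5 \
      server1:/bricks/disk6 server2:/bricks/disk6 \
      server1:/bricks/disk7 server2:/bricks/disk7 \
      server1:/bricks/disk8 server2:/bricks/disk8
  gluster volume start bigvol

With this layout a failed disk only affects one replica pair, so after
replacing it only that brick's data has to be healed, not the whole volume.
In the pre-3.3 releases of that era, self-heal was typically triggered by
walking the client mount, e.g.
"find /mnt/bigvol -noleaf -print0 | xargs --null stat >/dev/null".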
On Mon, 30 Apr 2012 at 10:53 +0200, Gandalf Corvotempesta wrote:

>> 2012/4/30 Brian Candler <B.Candler at pobox.com>
>>
>>> KO or OK? With a RAID controller (or software RAID), the RAID subsystem
>>> should quietly mark the failed drive as unusable and redirect all
>>> operations to the working drive. And you will have a way to detect this
>>> situation, e.g. /proc/mdstat for Linux software RAID.
>>
>> KO.
>> As you wrote, in a RAID environment the controller will detect a failed
>> disk and redirect I/O to the working drive.
>>
>> With no RAID, is gluster smart enough to detect a disk failure and
>> redirect all I/O to the other server?
>>
>> A disk can have a damaged sector, so only a portion of it becomes
>> unusable. A RAID controller is able to detect this; will gluster do the
>> same, or will it still try to reply with broken data?
>>
>> So, do you suggest using RAID10 on each server?
>> - disk1+disk2 raid1
>> - disk3+disk4 raid1
>>
>> raid0 over these raid1 sets, and then replicate it with gluster?
>
> I have been running the following configuration for over 16 months
> with no issues:
>
> Gluster v3.0.0 on two SuperMicro servers, each with 8x2TB hard drives
> configured as JBOD. I use Gluster to replicate each drive between
> servers and then distribute across the drives, giving me approx. 16TB
> as a single volume. I can pull a single drive, replace it, and then
> use self heal to rebuild. I can shut down or reboot a server and
> traffic continues to the other server (good for kernel updates). I use
> logdog to alert me via email/text if a drive fails.
>
> I chose this config because it 1) was simplest, 2) maximized my disk
> storage, 3) effectively resulted in a shared-nothing RAID10 SAN-like
> storage system, 4) minimized the amount of data movement during a
> rebuild, and 5) didn't require any hardware RAID controllers, which
> would have increased my cost. This config has worked for me exactly as
> planned.
>
> I'm currently building a new server with 8x4TB drives and will be
> replacing one of the existing servers in a couple of weeks. I will
> force a self heal to populate it with files from the primary server.
> When done, I'll repeat the process for the other server.
>
> Larry Bates
> vitalEsafe, Inc.

I should have added the other reasons for choosing this configuration:
6) hardware RAID requires the use of hard drives that support TLER, which
forces you to use Enterprise drives that are much more expensive than
desktop drives; and lastly 7) I'm an old-timer with over 30 years of
experience doing this, and I've seen almost every RAID5 array that was ever
set up fail due to some "glitch" where the controller just decides that
multiple drives have failed simultaneously. Sometimes it takes a couple of
years, but I've seen a LOT of arrays fail this way, so I don't trust
RAID5/RAID6 with vital data. Hardware RAID10 is OK, but that would have
more than doubled my storage cost.
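For reference, a sketch of the software-RAID10 variant Gandalf asks about in
the quoted message, assuming Linux mdadm and illustrative device names
(/dev/sda .. /dev/sdd):

  # two mirrored pairs, then a stripe over them
  mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda /dev/sdb
  mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
  mdadm --create /dev/md10 --level=0 --raid-devices=2 /dev/md1 /dev/md2

  # or, equivalently, a single native RAID10 array instead of the above
  mdadm --create /dev/md10 --level=10 --raid-devices=4 /dev/sd[a-d]

  # one filesystem per server, used as the gluster brick and replicated
  # to the peer server with a "replica 2" volume
  mkfs.xfs /dev/md10
  mount /dev/md10 /data
  cat /proc/mdstat    # where a failed member drive would show up

Whether to do this or Larry's JBOD-plus-per-disk-bricks approach is exactly
the trade-off discussed above: mdadm handles failed-disk detection and keeps
serving locally, while the JBOD layout leaves all redundancy to Gluster's
replication between servers.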
My goal was highly available, mid-performance (30MB/sec) storage that is
immune to single device failures and that can be rebuilt quickly after a
failure.

Larry Bates
vitalEsafe, Inc.