s.varadha rajan
2011-Mar-28 12:16 UTC
[Gluster-users] Any update pls : Req details for "config as like striping with fail over" for many servers
Hi,

Can anybody help me with the requirement below?

Regards,
Varadharajan.S

Hi,

I would like to implement GlusterFS 3.1 in my organization. We have around 10 servers, all hosting different applications such as web servers (Apache, Tomcat), VMware, and a DNS server. The servers have different disk capacities, such as 1 TB and 2 TB. All the systems run Ubuntu 10.04.

My requirements:

1. I need to connect all the servers through GlusterFS 3.1.x.
2. If I configure replication, I do not get enough disk space, so I need something like striping. But GlusterFS doesn't provide failover for striping.
3. For example, take server1:/data (120 GB), server2:/data (120 GB), server3:/home/g1 (2 TB), server4:/home/g2 (2 TB), and so on. I want to connect all the servers so that I get one big storage space for all of them. If I go for striping or distribute, then when one server fails I can't access the volume and get the error "Transport end point not connected".

Please let me know a solution or a configuration idea for this. I have been searching on Google for the past 10 days without finding a proper answer.

Regards,
Varadharajan.S
Amar Tumballi
2011-Mar-28 12:37 UTC
[Gluster-users] Any update pls : Req details for "config as like striping with fail over" for many servers
> Hi,
>
> I would like to implement GlusterFS 3.1 in my organization. We have around
> 10 servers, all hosting different applications such as web servers
> (Apache, Tomcat), VMware, and a DNS server. The servers have different
> disk capacities, such as 1 TB and 2 TB. All the systems run Ubuntu 10.04.
>
> My requirements:
>
> 1. I need to connect all the servers through GlusterFS 3.1.x.

Connecting different machines is not an issue, and from 3.1.x onwards Gluster provides backward/forward compatibility, so that should not be a problem either.

> 2. If I configure replication, I do not get enough disk space, so I need
> something like striping. But GlusterFS doesn't provide failover for
> striping.

Yes, that's true. We don't recommend the stripe module in general, and I see that your backend sizes are different. It's preferable to keep them uniform.

> 3. For example, take server1:/data (120 GB), server2:/data (120 GB),
> server3:/home/g1 (2 TB), server4:/home/g2 (2 TB), and so on. I want to
> connect all the servers so that I get one big storage space for all of
> them. If I go for striping or distribute, then when one server fails I
> can't access the volume and get the error "Transport end point not
> connected".

Yes, stripe doesn't give high availability, and any single node failure results in this error.

> Please let me know a solution or a configuration idea for this. I have
> been searching on Google for the past 10 days without finding a proper
> answer.

The best solution is to define your storage needs first (i.e., at least how much storage, in TB, you need). Once that's finalized, decide whether your applications really need high availability (I guess 'yes', as this is the answer in most cases). Get new drives (SATA disks are cheap nowadays), and try to keep the backend uniform to avoid any issues which may arise later.
Ideally, GlusterFS can be configured to work for any scenario if you hand-craft your volume file, but as those configurations do not get regular QA and can contain corner-case issues, we no longer recommend such setups.

Regards,
Amar
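As a rough illustration of the sizing step, the sketch below works through the capacity arithmetic for the four bricks named in the original post. The function names are made up for this example, 2 TB is rounded to 2000 GB, and the replica figure assumes each pair can hold only as much as its smaller brick; filesystem overhead is ignored.

```python
# Rough capacity arithmetic for the four bricks named in the original post.
# Sizes are in GB; 2 TB is taken as 2000 GB for simplicity (an assumption).
brick_sizes_gb = {
    "server1:/data": 120,
    "server2:/data": 120,
    "server3:/home/g1": 2000,
    "server4:/home/g2": 2000,
}


def distribute_capacity(sizes):
    """Plain distribute: every brick adds its full size, with no redundancy."""
    return sum(sizes)


def replica2_capacity(pairs):
    """Distribute over replica pairs: a pair holds what its smaller brick holds."""
    return sum(min(a, b) for a, b in pairs)


sizes = list(brick_sizes_gb.values())
print("distribute only:", distribute_capacity(sizes), "GB")

# Pairing like-sized bricks costs only the unavoidable 2x of replication.
matched = [(120, 120), (2000, 2000)]
print("replica 2, matched pairs:", replica2_capacity(matched), "GB")

# Pairing a small brick with a large one strands most of the large brick.
mismatched = [(120, 2000), (120, 2000)]
print("replica 2, mismatched pairs:", replica2_capacity(mismatched), "GB")
```

The gap between the matched and mismatched figures is one way to see why a uniform backend is preferred when replication is in play.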
Burnash, James
2011-Mar-28 13:00 UTC
[Gluster-users] Any update pls : Req details for "config as like striping with fail over" for many servers
Hi Varadharajan.

This is just my opinion, but I would say that the use case you've presented is not a good match for GlusterFS.

Striping in this file system is not used for expanding the amount of file space available. It is used for performance, and only when very, very large files are being stored and accessed, files which might not otherwise fit on a single storage server.

Additionally, small-file (<1GB) access tends to be slow and relatively inefficient on GlusterFS, and unless you are serving very large files on your webserver, the rest of the applications you mentioned (with the exception of VMware) tend to create and use small files. Please see this link:

http://gluster.qotd.co/q/why-does-gluster-seem-so-slow-accessing-many-small-files/

You can't get high availability / failover without mirroring, and mirroring with non-uniform amounts of storage on the backend servers is not optimal for performance, even though it is possible.

Hopefully this helps you understand the issues you face in trying to use the GlusterFS platform.

James Burnash, Unix Engineering
Jeff Darcy
2011-Mar-28 15:49 UTC
[Gluster-users] Any update pls : Req details for "config as like striping with fail over" for many servers
On 03/28/2011 08:16 AM, s.varadha rajan wrote:
> I would like to implement GlusterFS 3.1 in my organization. We have around
> 10 servers, all hosting different applications such as web servers
> (Apache, Tomcat), VMware, and a DNS server. The servers have different
> disk capacities, such as 1 TB and 2 TB. All the systems run Ubuntu 10.04.
>
> My requirements:
>
> 1. I need to connect all the servers through GlusterFS 3.1.x.
> 2. If I configure replication, I do not get enough disk space, so I need
> something like striping. But GlusterFS doesn't provide failover for
> striping.
> 3. For example, take server1:/data (120 GB), server2:/data (120 GB),
> server3:/home/g1 (2 TB), server4:/home/g2 (2 TB), and so on. I want to
> connect all the servers so that I get one big storage space for all of
> them. If I go for striping or distribute, then when one server fails I
> can't access the volume and get the error "Transport end point not
> connected".
>
> Please let me know a solution or a configuration idea for this. I have
> been searching on Google for the past 10 days without finding a proper
> answer.

My recommendation would be to take advantage of the fact that you can serve multiple bricks from one physical machine. First, divide the space you'll be using on each machine into multiple bricks so that all bricks in the cluster will be of approximately equal size. Then, make replica pairs containing bricks *on different nodes* and then distribute across those. Mostly that's a matter of carefully controlling the order in which you specify those bricks in your "gluster volume create" command. The result should give you pretty good space utilization while still protecting against a single machine failure.

Dividing up the space on a machine into several bricks can be a bit tricky. If there are several physical disks, that's great. If you use partitions, you will probably get poor performance, as an even distribution of work among bricks (something DHT tries to achieve) will result in the disk heads thrashing between partitions. If you use plain old subdirectories you won't have that problem so much, but the free-space reporting will be a bit inaccurate and could cause problems when disks which are shared among several bricks become nearly full. It's probably the best option overall, though, since it's easy to do and will perform/behave pretty well the rest of the time.
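The ordering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host names, the brick directories, the brick counts per node, and the volume name "bigvol" are all hypothetical, and the bricks are assumed to be roughly equal-sized subdirectories already. It relies on the fact that with "replica 2", consecutive bricks on the "gluster volume create" command line form a replica pair.

```python
# Sketch: order equal-sized bricks so that each consecutive pair (one replica
# set) lives on two different nodes, then print a "gluster volume create" line.
# Host names, brick paths, brick counts, and the volume name are hypothetical.

def order_bricks(bricks_per_node):
    """Return bricks ordered so bricks 2k and 2k+1 are on different nodes.

    All bricks are listed node by node, then position i is paired with
    position i + half. Because each node's bricks are contiguous, a pair can
    share a node only if that node holds more than half of all bricks.
    """
    flat = [(node, path)
            for node, paths in sorted(bricks_per_node.items())
            for path in paths]
    if len(flat) % 2:
        raise ValueError("replica 2 needs an even number of bricks")
    half = len(flat) // 2
    ordered = []
    for first, second in zip(flat[:half], flat[half:]):
        if first[0] == second[0]:
            raise ValueError(f"node {first[0]} holds more than half the bricks")
        ordered += [first, second]
    return ordered


# Hypothetical layout: the larger machines export two brick directories each.
bricks_per_node = {
    "server1": ["/export/brick1"],
    "server2": ["/export/brick1"],
    "server3": ["/export/brick1", "/export/brick2"],
    "server4": ["/export/brick1", "/export/brick2"],
}

ordered = order_bricks(bricks_per_node)
brick_args = " ".join(f"{node}:{path}" for node, path in ordered)
command = f"gluster volume create bigvol replica 2 transport tcp {brick_args}"
print(command)
```

Printing the command rather than running it leaves room to check the pairing by eye before creating the volume.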