Hi Sham,

If your main concern is data redundancy, I would suggest an erasure coded volume, which Gluster provides. An erasure coded (EC) volume, also called a disperse volume, gives you redundancy without wasting too much storage. You can find more on this at
https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Setting%20Up%20Volumes/
under the heading "Creating Dispersed Volumes".

The overall setup depends on your infrastructure, hardware, number of nodes and network.

Example: if you have 6 hard disks of 1TB each on 6 different nodes, you can set up a 4+2 (k+m) EC volume, where
k = number of data bricks
m = number of redundancy bricks

Here the redundancy is 2. That means that even if 2 hard drives fail, you can still do I/O on this volume. As soon as the drives come back, the data will be healed automatically. You will have 4TB of storage space, while 2TB will be used for redundancy.

You can choose other configurations with different redundancy, such as 4+1, 4+2, 8+4, etc.

Ashish

----- Original Message -----
From: "Sham Arsiwala" <shamarsiwala at gmail.com>
To: gluster-users at gluster.org
Sent: Tuesday, August 9, 2016 10:38:59 AM
Subject: [Gluster-users] Need help to design a data storage

Hi,

I want to store 135TB of data, and the main requirement is data redundancy. I can sacrifice performance, but the data must not be lost. Can you help me with a design?

--
Regards,
Sham P. Arsiwala
RHC{E-A-I}
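The k+m arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this note and are not part of any Gluster tool, and the figures ignore filesystem and metadata overhead.

```python
# Sketch of the k+m capacity arithmetic for a single disperse set.
# All names here are illustrative assumptions, not Gluster APIs.

def disperse_capacity(data_bricks, redundancy_bricks, brick_size_tb):
    """Return (usable_tb, redundancy_tb) for one k+m disperse set."""
    usable_tb = data_bricks * brick_size_tb
    redundancy_tb = redundancy_bricks * brick_size_tb
    return usable_tb, redundancy_tb


def storage_efficiency(data_bricks, redundancy_bricks):
    """Fraction of raw space that holds user data: k / (k + m)."""
    return data_bricks / (data_bricks + redundancy_bricks)


def raw_capacity_needed(usable_tb, data_bricks, redundancy_bricks):
    """Raw space required to store usable_tb in a k+m layout."""
    return usable_tb * (data_bricks + redundancy_bricks) / data_bricks


if __name__ == "__main__":
    # The configurations mentioned in the reply, with 1TB bricks.
    for k, m in [(4, 1), (4, 2), (8, 4)]:
        usable, redundant = disperse_capacity(k, m, brick_size_tb=1)
        print(f"{k}+{m}: {usable} TB usable, {redundant} TB redundancy, "
              f"efficiency {storage_efficiency(k, m):.0%}, "
              f"tolerates {m} failed bricks")

    # Raw space for the 135TB from the original question, in a 4+2 layout.
    print(raw_capacity_needed(135, 4, 2), "TB raw for 135 TB usable")
```

For the 6 x 1TB example this gives 4TB usable and 2TB of redundancy, matching the numbers above. Under the same assumptions, the 135TB from the original question would need roughly 202.5TB of raw space in a 4+2 layout.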
Gandalf Corvotempesta
2016-Aug-09 14:43 UTC
[Gluster-users] Need help to design a data storage
On 09 Aug 2016 10:06 AM, "Ashish Pandey" <aspandey at redhat.com> wrote:
> If your main concern is data redundancy, I would suggest you to go for erasure coded volume provided by gluster.
> Erasure coded (EC) volume or disperse volume can provide you redundancy without wasting too much storage.

Is EC considered stable? Could it also be used with sharding for VM hosting?

I would like to design a similar cluster where data must be protected as much as possible, while still allowing me to add one node at a time. That is not possible without EC, where nodes must be added in multiples of the replica count.
Gandalf Corvotempesta
2016-Aug-09 15:03 UTC
[Gluster-users] Need help to design a data storage
On 09 Aug 2016 10:06 AM, "Ashish Pandey" <aspandey at redhat.com> wrote:
> If your main concern is data redundancy, I would suggest you to go for erasure coded volume provided by gluster.

Anyway, EC volumes have a lower redundancy level than standard replicated volumes. Let's assume a 9-node cluster with 12 disks on each node and redundancy set to 2. You have 9*12 = 108 disks/bricks. With redundancy 2 you can lose up to 2 bricks/disks at the same time before losing data. Using cheap SATA disks (Gluster is made to run on commodity hardware), losing 3 disks out of 108 in a very short time could happen frequently, and this frequency grows as the cluster grows.

With a standard replicated volume, with replica 3, you can lose up to 3 servers (not bricks), because each brick in a replica set must be on a different server.

I think EC is something like RAID 6 (with more "parity") and standard replication is like RAID 10 but with 3 disks for each mirror. RAID 10 is safer, as you can lose as many disks as you want if they are in different replica sets, while RAID 6 can lose up to 2 disks in the whole cluster. The higher the number of disks, the higher the probability of data loss with RAID 6/EC.

Am I missing something?
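One detail that bears on the comparison above: in a distributed-dispersed volume the redundancy count applies to each disperse set, not to the volume as a whole, in the same way that replication protects each replica set separately. The rough sketch below compares the two for 108 bricks, assuming 4+2 disperse sets and replica-3 sets that each survive 2 failed bricks. The grouping, the 1TB brick size and all names are assumptions for illustration only, and the sketch counts failures rather than estimating probabilities.

```python
# Rough comparison of per-set failure tolerance for 108 bricks.
# The layouts and names below are illustrative assumptions.

def layout(total_bricks, set_size, data_per_set, tolerated_per_set,
           brick_size_tb=1):
    """Summarise a volume built from identical brick sets."""
    if total_bricks % set_size != 0:
        raise ValueError("total_bricks must be a multiple of set_size")
    sets = total_bricks // set_size
    return {
        "sets": sets,
        "usable_tb": sets * data_per_set * brick_size_tb,
        # Failures survived if they are spread evenly across the sets.
        "best_case_failures_survived": sets * tolerated_per_set,
        # Fewest failures that lose data: all of them in one set.
        "min_failures_for_data_loss": tolerated_per_set + 1,
    }


# 4+2 disperse sets: 6 bricks per set, 4 hold data, 2 may fail.
ec = layout(108, set_size=6, data_per_set=4, tolerated_per_set=2)

# Replica 3: 3 bricks per set, 1 brick's worth of data, 2 may fail.
replica3 = layout(108, set_size=3, data_per_set=1, tolerated_per_set=2)

if __name__ == "__main__":
    print("EC 4+2   :", ec)
    print("Replica 3:", replica3)
```

Under these assumptions both layouts lose data only when 3 bricks fail within the same set. The EC layout offers 72TB usable against 36TB for replica 3, but each of its sets spans 6 bricks rather than 3, so a given set is exposed to more disks.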