Emmanuel Noobadmin
2010-Sep-28 17:26 UTC
[Gluster-users] Gluster scalability [Was: Adding new storage nodes to existing GlusterFS network]
After following Roland's thread (http://gluster.org/pipermail/gluster-users/2010-September/005311.html), I'm wondering if this means there's a limit to how scalable Gluster is if we start small.

It seems that every time a new brick is added, the scale and defrag script must be run. Since we're going over the network, for those of us starting on a low-budget interconnect, i.e. Gigabit Ethernet, it would take a long while.

Let's say I'm using 4x1.5TB drives for a 4.5TB RAID 5 storage brick, starting with four bricks in replicate/distribute, so effectively 9TB of space for the Gluster network. Now suppose we hit 90% capacity and add four new 4.5TB bricks. Am I correct in understanding that the scale and defrag script would cause, say, around 6TB of data to be spread around, twice since it's replicated, assuming the remaining 2TB gets to stay where it was?

If the network was able to sustain 30MB/s, that would take around 48 hours of continuous operation to complete. Since the cluster is unlikely to be idle and there is bound to be some overhead, would that be closer to 72 hrs in reality?

Now it seems to me that since the scale and defrag would redistribute the chunks all over the new nodes, the next set of four would take twice as long (97-145 hrs) since there are more data/files by then. Then the next group of four would take three times as long (146-220 hrs), or about a week.

At some point, it seems that adding a new set of nodes may cause a scale/defrag time so long that the organisation would have to add the next set before the previous rebalance finishes?

It doesn't seem to make sense, so what am I actually getting wrong?
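To put rough numbers on that reasoning, here is the arithmetic as a small Python sketch. The 30MB/s sustained rate, the 6/12/18TB volumes and the 50% overhead padding are my own assumptions from the scenario above, not measured Gluster behaviour:

    # Serial-rebalance estimate: all moved data funnels through a single
    # 30MB/s-class pipe, so time grows linearly with the data moved.
    def rebalance_hours(moved_tb, link_mb_s=30.0):
        """Hours to push moved_tb (decimal terabytes) through one link."""
        return moved_tb * 1_000_000 / link_mb_s / 3600

    for expansion, moved_tb in [(1, 6), (2, 12), (3, 18)]:
        ideal = rebalance_hours(moved_tb)
        print(f"expansion {expansion}: ~{moved_tb} TB moved -> "
              f"{ideal:5.1f} h ideal, {ideal * 1.5:5.1f} h with 50% overhead")

That prints roughly 56/83, 111/167 and 167/250 hours for the three expansions, which is the same ballpark as the ranges I quoted and grows linearly with the data that has to move.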
Jeff Anderson-Lee
2010-Sep-28 17:34 UTC
[Gluster-users] Gluster scalability [Was: Adding new storage nodes to existing GlusterFS network]
On 9/28/2010 10:26 AM, Emmanuel Noobadmin wrote:
> At some point, it seems that adding a new set of nodes may cause a
> scale/defrag time so long that the organisation would have to add the
> next set before the previous rebalance finishes?
>
> It doesn't seem to make sense, so what am I actually getting wrong?

In part it depends on your network infrastructure, in particular your switches/routers. The 30MB/s you mentioned is (or should be) per interface. Yes, with more nodes there is more data to move around, but there are also more interfaces involved in moving the data. As long as you don't come close to saturating your switches/routers, it should (at least in theory) take roughly the same time regardless of how many nodes are involved (assuming that the nodes themselves remain the same).
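As a toy model (assuming, purely for illustration, that each brick contributes a fixed ~0.75TB to the rebalance and moves it over its own interface at 30MB/s, and that the switch fabric never saturates):

    # Parallel-rebalance toy model: the data to move grows with the number
    # of bricks, but so does the number of interfaces moving it, so the
    # per-brick terms cancel. All figures are illustrative assumptions.
    def rebalance_hours(n_bricks, moved_tb_per_brick=0.75, iface_mb_s=30.0):
        total_mb = n_bricks * moved_tb_per_brick * 1_000_000
        aggregate_mb_s = n_bricks * iface_mb_s  # interfaces work in parallel
        return total_mb / aggregate_mb_s / 3600

    for n in (8, 12, 16, 24):
        print(f"{n:2d} bricks: {rebalance_hours(n):4.1f} h")

Every row comes out the same (~6.9 h in this example): the cluster size cancels out, so the rebalance time stays flat until the switch backplane becomes the bottleneck.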