Claudio Baeza Retamal
2011-Feb-06 18:35 UTC
[Gluster-users] Gluster 3.1.1 issues over RDMA and HPC environment
Dear friends,

I have several stability and reliability problems in a small-to-medium sized cluster. My configuration is the following:

66 compute nodes (IBM iDataPlex, X5550, 24 GB RAM)
1 access node (front end)
1 master node (queue manager and monitoring)
2 I/O servers with GlusterFS configured in distributed mode (4 TB in total)

All machines have a dual-port Mellanox ConnectX QDR (40 Gbps) HCA.
1 QLogic 12800-180 switch, with 7 leaf modules of 24 ports each and two double QSFP spines.

CentOS 5.5 with xCAT as cluster manager
OFED 1.5.1
Gluster 3.1.1 over InfiniBand

When the cluster is fully loaded with applications that use MPI heavily, in combination with other applications that do a lot of I/O to the file system, GlusterFS stops working. Also, when gendb runs the InterProScan bioinformatics application with 128 or more jobs, GlusterFS dies or disconnects clients at random, so some applications shut down because they can no longer see the file system.

This does not happen with Gluster over TCP (1 Gbps Ethernet), and it does not happen with Lustre 1.8.5 over InfiniBand either; under the same conditions Lustre works fine.

My question is: does any documentation exist with more specific information on tuning GlusterFS? I have only found basic information on configuring Gluster, nothing deeper (i.e. for experts). I think there must be some options for handling this situation in GlusterFS; moreover, other people should be hitting the same problems, since we replicated the configuration at another site with the same results.

Perhaps the real question is about Gluster scalability: how many clients are recommended per Gluster server when using RDMA on an InfiniBand fabric at 40 Gbps?

I would appreciate any help. I want to use Gluster, but stability and reliability are very important for us.

claudio
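For context, a two-server distributed volume over RDMA is typically created with the Gluster 3.1 CLI along the following lines. Claudio's actual volume layout is not shown in the post, so this is only a sketch: the server names "io1"/"io2", the volume name "scratch", and the brick path /export/brick are hypothetical.

    # On the first I/O server: join the second server to the pool,
    # then create and start a distributed volume on the rdma transport.
    gluster peer probe io2
    gluster volume create scratch transport rdma \
        io1:/export/brick io2:/export/brick
    gluster volume start scratch
    gluster volume info scratch    # should report Transport-type: rdma

    # On each client: native FUSE mount, which runs over InfiniBand
    # because the volume was created with the rdma transport.
    mount -t glusterfs io1:/scratch /scratch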
Fabricio Cannini
2011-Feb-07 19:26 UTC
[Gluster-users] Gluster 3.1.1 issues over RDMA and HPC environment
On Sunday, 06 February 2011, at 16:35:45, Claudio Baeza Retamal wrote:

Hi.

> I have several stability and reliability problems in a small-to-medium
> sized cluster, my configuration is the following:
> [...]
> CentOS 5.5 with xCAT as cluster manager
> OFED 1.5.1
> Gluster 3.1.1 over InfiniBand

I have a smaller but relatively similar setup, and I am facing the same issues as Claudio:

- 1 frontend node (2 Intel Xeon 5420, 16 GB DDR2 ECC RAM, 4 TB of raw disk space) with 2 "Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 5GT/s - IB DDR]"
- 1 storage node (2 Intel Xeon 5420, 24 GB DDR2 ECC RAM, 8 TB of raw disk space) with 2 "Mellanox Technologies MT26418 [ConnectX VPI PCIe 2.0 5GT/s - IB DDR]"
- 22 compute nodes (2 Intel Xeon 5420, 16 GB DDR2 ECC RAM, 750 GB of raw disk space) with 1 "Mellanox Technologies MT25204 [InfiniHost III Lx HCA]"

Each compute node has a /glstfs partition of 615 GB, serving a gluster volume of ~3.1 TB mounted at /scratch on all nodes and the frontend, using the stock 3.0.5 packages from Debian Squeeze 6.0.

> When the cluster is fully loaded with applications that use MPI heavily,
> in combination with other applications that do a lot of I/O to the file
> system, GlusterFS stops working.
> [...]
> I would appreciate any help. I want to use Gluster, but stability and
> reliability are very important for us.

I have "solved" it by taking the first node listed in the client file '/etc/glusterfs/glusterfs.vol' out of the execution queue. And this is what I *think* is the reason it worked: I can't find it now, but I saw in the 3.0 docs that "... the first hostname found in the client config file acts as a lock server for the whole volume ...". In other words, the first hostname found in the client config coordinates the locking/unlocking of files in the whole volume. This way, that node does not accept any jobs and can dedicate its processing power solely to being a 'lock server'.

It may well be the case that gluster is not yet as optimized for InfiniBand as it is for Ethernet, too; I just can't say. I am also unable to find out how to specify something like this in the gluster config: "node n is a lock server for nodes a, b, c, d".
Does anybody know if that is possible? Hope this helps you somehow, and helps to improve gluster performance over IB/RDMA.
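For reference, the "first hostname" in a legacy 3.0-style client volfile such as /etc/glusterfs/glusterfs.vol is simply the first protocol/client volume declared in the file. Below is a minimal sketch under assumed names: node01/node02 and the brick and volume names are made up, only two of the 22 bricks are shown, and the exact transport-type keyword varied between releases.

    volume node01-brick
      type protocol/client
      option transport-type ib-verbs   # "tcp" in the Ethernet case
      option remote-host node01
      option remote-subvolume brick
    end-volume

    volume node02-brick
      type protocol/client
      option transport-type ib-verbs
      option remote-host node02
      option remote-subvolume brick
    end-volume

    # node01, declared first, is the "first hostname found in the
    # client config file" that the quoted 3.0 documentation refers to.
    volume scratch
      type cluster/distribute
      subvolumes node01-brick node02-brick
    end-volume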
Anand Avati
2011-Feb-07 19:43 UTC
[Gluster-users] Gluster 3.1.1 issues over RDMA and HPC environment
Some commits to improve RDMA stability have gone in post-3.1.1. Can you check whether 3.1.2 has those issues as well?

Avati

On Sun, Feb 6, 2011 at 10:35 AM, Claudio Baeza Retamal <claudio at dim.uchile.cl> wrote:

> Dear friends,
>
> I have several stability and reliability problems in a small-to-medium
> sized cluster [...]
>
> I would appreciate any help. I want to use Gluster, but stability and
> reliability are very important for us.
>
> claudio
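As a quick, illustrative way to follow up on this suggestion: confirm which release each machine actually runs, and watch the client log for RDMA-level disconnects while the failing workload is reproduced. The log file name is derived from the mount point, so the path below is only an example.

    # On every server and client, confirm the installed release:
    glusterfs --version

    # While reproducing the MPI + I/O load, watch the client log for
    # transport errors; for a volume mounted at /scratch the log is
    # commonly /var/log/glusterfs/scratch.log, but this varies per install.
    tail -f /var/log/glusterfs/scratch.log | grep -iE 'rdma|disconnect'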