PANICHI MASSIMILIANO
2012-May-03 15:28 UTC
[Gluster-users] Problem stressing an Apache Web server on GlusterFS volume
Hi,

we are testing an Apache web server on a GlusterFS (GFS) volume. Our goal is to build an HA reverse proxy with Pacemaker and GFS.

We installed and configured four GFS nodes in a distributed-replicated setup, each node with 10GB of storage, which gives us a distributed-replicated volume with 20GB of disk space. We configured an Apache web server that uses the GFS volume to store its logs (access_log and error_log).

From another server we stressed the Apache server with httperf, calling a mod_perl script that prints an HTML page with the Apache environment variables. We saw no performance problems under load, but when we tested the availability of GFS, the volume mount hung on the client: when we reboot one of the GFS nodes, the client hangs while writing to the GFS volume and df doesn't respond.

We use VMs on VMware, and all servers are running Oracle Enterprise Linux 6.2 64-bit with GlusterFS 3.2.6 (recompiled).

The mount on the Apache server:

    mount -t glusterfs gfs01-dev:/VOLUME01 /opt/VOLUME01/

The files on all GFS nodes:

    ---files on node 1----
    4       /VOLUME01/GFS01-DEV_1/proxy_logs/logs/error_log
    33152   /VOLUME01/GFS01-DEV_1/proxy_logs/logs/access_log
    33156   /VOLUME01/GFS01-DEV_1/proxy_logs/logs
    33156   /VOLUME01/GFS01-DEV_1/proxy_logs
    33160   /VOLUME01/GFS01-DEV_1
    33164   /VOLUME01

    ---files on node 2----
    4       /VOLUME01/GFS02-DEV_1/proxy_logs/logs/error_log
    33216   /VOLUME01/GFS02-DEV_1/proxy_logs/logs/access_log
    33220   /VOLUME01/GFS02-DEV_1/proxy_logs/logs
    33220   /VOLUME01/GFS02-DEV_1/proxy_logs
    33224   /VOLUME01/GFS02-DEV_1
    33228   /VOLUME01

    ---files on node 3----
    0       /VOLUME01/GFS03-DEV_1/proxy_logs/logs
    0       /VOLUME01/GFS03-DEV_1/proxy_logs
    0       /VOLUME01/GFS03-DEV_1/prova.4
    0       /VOLUME01/GFS03-DEV_1
    4       /VOLUME01

    ---files on node 4----
    0       /VOLUME01/GFS04-DEV_1/proxy_logs/logs
    0       /VOLUME01/GFS04-DEV_1/proxy_logs
    0       /VOLUME01/GFS04-DEV_1/prova.4
    0       /VOLUME01/GFS04-DEV_1

So, if I reboot node 01, the GFS mount hangs.
df doesn't work. If I reboot node 02, I have performance problems: df works, but slowly.

When the problem occurs, we see most of the Apache processes stuck trying to log, as the server-status output shows:

    Current Time: Thursday, 03-May-2012 11:52:34 CEST
    Restart Time: Thursday, 03-May-2012 11:44:56 CEST
    Parent Server Generation: 0
    Server uptime: 7 minutes 37 seconds
    Total accesses: 178751 - Total Traffic: 134.8 MB
    CPU Usage: u33.7 s12.25 cu0 cs0 - 10.1% CPU load
    391 requests/sec - 302.0 kB/second - 790 B/request
    512 requests currently being processed, 0 idle workers

    LLLLLLCCLLLLLLLLLLLLLLLLCLLLLLLLLCLCLLLLLLCLLCCCLLLLLLLLCLLLLLLL
    LLLLLLLCLCLLLLLLLLLLLLLLLLCLLLLLLCLLWCLLLCLCCCLLLLCLLLLLLLCLLLLL
    LLLCLLLLLLLLLLLLLLCLCLLCLLLLLLLLLLLLLLLLLLLLLLLLWLLLLLLLLLLLLLLL
    LLLLLLLLLLLLLLLCLLLLLLLLLCLCLLLLLLLLLLLLLLLLLCWLLLLLLLLLLLLLLLLL
    LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
    LLLLLLLLLLRLLLLWLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
    LLLLLLLLLLLLLLLWLLLLLLLLLLLLLLLLLWLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
    LLLLLLLLLLLLLLLLLLRLLLLLLLLLLLLLLLLLLLLLLLCLWLLLLLLLLLLLLLLLLLLL

and the reply rate goes down:

    [root@proxycoll02 src]# ./httperf -v --server=10......110 --port=80 --uri=/perl/stress_test.pl --num-conns=10000000 --rate=1000
    httperf --verbose --client=0/1 --server=10.......110 --port=80 --uri=/perl/stress_test.pl --rate=1000 --send-buffer=4096 --recv-buffer=16384 --num-conns=10000000 --num-calls=1
    httperf: warning: open file limit > FD_SETSIZE; limiting max.
    # of open files to FD_SETSIZE
    httperf: maximum number of open descriptors = 1024
    reply-rate = 1001.1
    reply-rate = 1000.3
    reply-rate = 999.7
    reply-rate = 1000.3
    reply-rate = 1000.1
    reply-rate = 1000.3
    reply-rate = 1000.1
    reply-rate = 998.1
    reply-rate = 328.0
    reply-rate = 17.6
    reply-rate = 25.6
    reply-rate = 25.6
    reply-rate = 7.6
    reply-rate = 0.0
    reply-rate = 0.0
    reply-rate = 0.0
    reply-rate = 0.0
    reply-rate = 0.0
    reply-rate = 0.0
    reply-rate = 0.0
    reply-rate = 171.0
    reply-rate = 230.6
    reply-rate = 0.2
    reply-rate = 0.0
    reply-rate = 200.6
    reply-rate = 200.6
    reply-rate = 151.8
    reply-rate = 49.0
    reply-rate = 199.8
    reply-rate = 0.2
    reply-rate = 201.2

Furthermore, during the problem the client is swapping:

    top - 16:26:07 up 22 min,  1 user,  load average: 619.87, 190.81, 76.03
    Tasks: 1069 total,   6 running, 1063 sleeping,   0 stopped,   0 zombie
    Cpu(s):  1.1%us, 10.3%sy,  0.0%ni, 83.0%id,  5.5%wa,  0.0%hi,  0.2%si,  0.0%st
    Mem:   1016524k total,  1008364k used,     8160k free,      212k buffers
    Swap:  2064376k total,  2064376k used,        0k free,     3700k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     3468 apache    20   0  229m 4248 1244 D  3.0  0.4   0:00.76 httpd
       25 root      20   0     0    0    0 D  2.0  0.0   0:09.31 kswapd0
     1806 root      20   0 4440m  36m  340 R  1.8  3.6   3:13.39 glusterfs
     3487 apache    20   0  219m 3620 1416 D  1.4  0.4   0:00.20 httpd
     3485 apache    20   0  214m 3616 1456 D  1.3  0.4   0:00.19 httpd
     3481 apache    20   0  219m 3624 1420 D  1.1  0.4   0:00.16 httpd
     3484 apache    20   0  217m 3580 1408 D  1.1  0.4   0:00.16 httpd
     2483 root      20   0 15784 1244  248 R  1.0  0.1   0:03.63 top

Is there something we can investigate to better understand the hang, or are there tuning parameters that would solve it?
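
One way to put a number on the stall is to measure the longest run of near-zero reply-rate samples in the httperf output. The sketch below is not part of the original test setup: the function names `parse_reply_rates` and `longest_stall`, the 1.0 replies/s threshold, and the 5-second sample interval are assumptions for illustration (httperf's documentation describes rate samples as taken every five seconds, but this should be checked against the installed version).

```python
import re

# Assumption: httperf -v emits one "reply-rate = N" sample every 5 seconds.
SAMPLE_INTERVAL_S = 5.0


def parse_reply_rates(text):
    """Return every 'reply-rate = N' value found in text, in order."""
    pattern = r"reply-rate\s*=\s*([0-9]+(?:\.[0-9]+)?)"
    return [float(value) for value in re.findall(pattern, text)]


def longest_stall(rates, threshold=1.0):
    """Return (start_index, length) of the longest run of samples below threshold.

    Returns (-1, 0) when no sample is below the threshold.
    """
    best_start, best_len = -1, 0
    run_start, run_len = -1, 0
    for i, rate in enumerate(rates):
        if rate < threshold:
            if run_len == 0:
                run_start = i  # a new run of stalled samples begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # a healthy sample ends the current run
    return best_start, best_len


if __name__ == "__main__":
    # Short excerpt in the same format as the httperf output in this report.
    excerpt = (
        "reply-rate = 998.1 reply-rate = 328.0 reply-rate = 7.6 "
        "reply-rate = 0.0 reply-rate = 0.0 reply-rate = 171.0"
    )
    rates = parse_reply_rates(excerpt)
    start, length = longest_stall(rates)
    print("samples:", rates)
    print("longest stall starts at sample", start, "and lasts", length, "samples")
    print("approximate duration in seconds:", length * SAMPLE_INTERVAL_S)
```

Applied to the full list of samples in this report, the longest run below 1.0 replies/s is the seven consecutive 0.0 samples, which would be roughly 35 seconds if the 5-second interval holds.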