Dear All,

We recently reinstalled our computing cluster. We were using CentOS
5.3 (32-bit). It is now CentOS 6.3 (64-bit), installed from the
CentOS 6.2 x64 CD, then upgraded to 6.3.

We have some issues with the memory needs of our running jobs. They
require much more than before. It may be due to the switch from 32 to
64 bits, but to me this cannot explain the whole difference.

Here are our investigations.

We used the following simple benchmark:

1. Run a python script and check the memory that
   it requires (field "VIRT" of the "top" command).
   This script is:
----
import time
time.sleep(30)
print("done")
----

2. Similarly, run and check the memory of a simple
   bash script:
----
#!/bin/bash
sleep 30
echo "done"
----

3. Open an R session and check the memory used.

I asked 10 of our users to run these three things on their personal
PCs. They are running different distributions (mainly Ubuntu and
Slackware); half of them use a 32-bit system, the other half a 64-bit
one. Here is a summary of the results (values in KB, as reported in
the VIRT column of top):

Bash script:
             Avg      Min      Max
32 bits     5400     4192     9024
64 bits    12900    10000    16528

Python script:
             Avg      Min      Max
32 bits     8500     5004    11132
64 bits    32800    30000    36336

R:
             Avg      Min      Max
32 bits    26900    21000    33452
64 bits   100200    93008    97496

(As a side remark, the difference between 32 and 64 bits is
surprisingly big to me...)

Then we ran the same things on our CentOS cluster, getting
surprisingly high results. I installed a machine from scratch with the
CentOS CD (6.2 x64) to be sure another component of the cluster was
not playing a role. On this freshly installed machine I get the
following results:

SH:     103 MB
PYTHON: 114 MB
R:      200 MB

So, compared to the highest values among our 64-bit users, we have
ratios of ~7, ~3, and ~2, respectively.

This is very problematic for us because many jobs can no longer run
properly: they lack memory on most of our computing nodes. So we
really cannot leave things as they are...

Do you see any reason for this? Do you have suggestions?

Sincerely,

Jérémie
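For anyone wanting to reproduce these numbers, here is a minimal
sketch of how the measurement could be scripted rather than read off
top by hand. The file name sleep.py is just a placeholder for the
python snippet above; ps reports VSZ/RSS in KB, matching top's
VIRT/RES columns.

----
#!/bin/bash
# Start the benchmark in the background and sample its memory
# from ps while it sleeps. VSZ corresponds to top's VIRT and
# RSS to RES, both in KB.
python sleep.py &     # "sleep.py" is a placeholder file name
pid=$!
sleep 2               # give the interpreter time to start up
ps -o pid,vsz,rss,comm -p "$pid"
wait "$pid"
----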
On 09/26/12 19:14, Jérémie Dubois-Lacoste wrote:
> Dear All,

Hi!

> We recently reinstalled our computing cluster. We were using CentOS
> 5.3 (32-bit). It is now CentOS 6.3 (64-bit), installed from the
> CentOS 6.2 x64 CD, then upgraded to 6.3.
>
> We have some issues with the memory needs of our running jobs. They
> require much more than before. It may be due to the switch from 32 to
> 64 bits, but to me this cannot explain the whole difference.

It would seem that this is glibc's malloc behaviour. I have seen on
another list the advice to use:

export MALLOC_ARENA_MAX=1
export MALLOC_MMAP_THRESHOLD=131072

in order to decrease the memory used.

HTH,
Adrian

<snip>
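A quick way to check whether those tunables change anything, as a
sketch; it reuses the placeholder sleep.py from the first message.
MALLOC_MMAP_THRESHOLD is given in bytes.

----
#!/bin/bash
# Baseline run with default glibc malloc settings.
python sleep.py & pid=$!
sleep 2
echo "default settings:"
ps -o vsz,rss -p "$pid"
wait "$pid"

# Same run with the suggested tunables. MALLOC_ARENA_MAX caps the
# number of malloc arenas (the glibc in EL6 can otherwise create up
# to 8 arenas per core on 64-bit); MALLOC_MMAP_THRESHOLD is in bytes.
export MALLOC_ARENA_MAX=1
export MALLOC_MMAP_THRESHOLD=131072
python sleep.py & pid=$!
sleep 2
echo "with malloc tunables:"
ps -o vsz,rss -p "$pid"
wait "$pid"
----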
Jérémie Dubois-Lacoste wrote:
> Dear All,
>
> We recently reinstalled our computing cluster. We were using CentOS
> 5.3 (32-bit). It is now CentOS 6.3 (64-bit), installed from the
> CentOS 6.2 x64 CD, then upgraded to 6.3.
>
> We have some issues with the memory needs of our running jobs. They
> require much more than before. It may be due to the switch from 32 to
> 64 bits, but to me this cannot explain the whole difference.

Why not? The numbers you post, unless I'm misreading them, are about
twice what the 32-bit ones were.
<snip>
> 3. Open an R session and check the memory used.
<snip>
> Bash script:
>              Avg      Min      Max
> 32 bits     5400     4192     9024
> 64 bits    12900    10000    16528

5400 * 2 = 10800
4192 * 2 = 8384
9024 * 2 = 18048

> Python script:
>              Avg      Min      Max
> 32 bits     8500     5004    11132
> 64 bits    32800    30000    36336

8500 * 2 = 17000
5004 * 2 = 10008
11132 * 2 = 22264

So that ranges from 2 to 2.5 times larger.

> R:
>              Avg      Min      Max
> 32 bits    26900    21000    33452
> 64 bits   100200    93008    97496

Same here, about 2 to 2.5 times larger: R just has more of the larger
variables.
<snip>
> Then we ran the same things on our CentOS cluster, getting
> surprisingly high results. I installed a machine from scratch with the
> CentOS CD (6.2 x64) to be sure another component of the cluster was
> not playing a role. On this freshly installed machine I get the
> following results:
>
> SH:     103 MB
> PYTHON: 114 MB
> R:      200 MB
>
> So, compared to the highest values among our 64-bit users, we have
> ratios of ~7, ~3, and ~2, respectively.
>
> This is very problematic for us because many jobs can no longer run
> properly: they lack memory on most of our computing nodes. So we
> really cannot leave things as they are...
>
> Do you see any reason for this? Do you have suggestions?

First, what kind of compute cluster is this? Are you using something
like Torque, or what?

Second, how much memory do you have in each of the nodes? And how many
cores?

     mark
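As a sketch, one way to collect the per-node numbers mark asks about,
assuming password-less ssh to the nodes; node01 and node02 are
hypothetical hostnames to substitute with the real node list.

----
#!/bin/bash
# Report total RAM and core count for each compute node.
for node in node01 node02; do
    echo "== $node =="
    ssh "$node" grep MemTotal /proc/meminfo
    ssh "$node" nproc
done
----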
On 09/26/2012 09:14 AM, Jérémie Dubois-Lacoste wrote:
> 1. Run a python script and check the memory that
>    it requires (field "VIRT" of the "top" command).

Don't use VIRT as a reference for memory used. RES is a better
indication, but even that won't tell you anything useful about shared
memory, and will lead you to believe that a process is using more
memory than it is.
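To make the distinction concrete, a minimal sketch that prints the
different size figures for one process; pass it the PID of, say, the
sleeping python benchmark.

----
#!/bin/bash
# Compare the memory figures for a single process given by PID.
pid=${1:?usage: $0 PID}
# VSZ corresponds to top's VIRT and RSS to RES, both in KB.
ps -o pid,vsz,rss,comm -p "$pid"
# The second and third fields of /proc/<pid>/statm are the resident
# and shared (file-backed) page counts; convert pages to bytes.
awk -v page="$(getconf PAGESIZE)" \
    '{print "resident:", $2*page, "bytes; shared:", $3*page, "bytes"}' \
    "/proc/$pid/statm"
----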