similar to: Condor cluster setup advice (pointers) needed

2020 Apr 17
4
HPC question: torques replacement
Dear Experts, I know there are many HPC (high performance computing) experts on this list. I'd like to ask your advice. Almost two decades ago I chose to go with OpenPBS (turned down Condor and other alternatives for whatever reason) for the clusters and number crunchers I support for the Department at the university. It turned out to be a not bad, long-lived choice. At some point I smoothly
2020 Apr 17
0
HPC question: torques replacement
Hey Valeri - IIRC, midway (and maybe midway2?) use slurm for job scheduling. I don't know how many of your faculty use both your nodes and midway, but maybe consolidating on to a single scheduler would be easier for them? (also, it's been a while ... hi!) Richard -----Original Message----- From: CentOS <centos-bounces at centos.org> On Behalf Of Valeri Galtsev Sent: Friday,
2009 Aug 30
1
Combining: R + Condor in 2009 ? (+foreach maybe?)
Hello dear R-help group (and David Smith from REvolution), I would like to perform parallel computing using R with Condor (hopefully using foreach or other recommended solutions, if available) for some "Embarrassingly parallel" problem. I will start by listing what I found so far, and then go on asking for help. So far I found a manual by Xianhong Xie from Rnews_2005-2 (see page
2008 Apr 24
1
R and condor
Hello, I would be extremely grateful if anyone is able to provide any (rather obscure) advice on using R with Condor. I think I'm following Xianhong Xie's instructions (R News 5(2) 13-15) correctly, but my job just stays held in the queue (for days / months). I've checked condor_status to make sure there are plenty of machines available, but can't see any way to attack the
2006 Jan 24
1
Condor and R
Hi, I was wondering if anyone has successfully linked R against the Condor libraries so that R can be run as a Condor job in the "standard" (not "vanilla") universe. The advantage of this would be that due to checkpointing, jobs can be suspended and transferred to another node. There is a good overview by Xianhong Xie here:
2011 Nov 16
3
clustering
Hey folks, I just went through the archives trying to find some info on this but did not come up with much other than it seems there are a few experts here on the list. I have no experience with clustering and have just taken over a Stem Cell Research Lab that has a Grid Engine cluster. I have not yet dug into the details of Grid Engine (only been here a week now) but am just trying to get up
2011 Aug 01
0
Condor Cloud + oVirt Node
I've discussed this a little with Ian off list, but relating to: https://fedoraproject.org/wiki/Features/Condor_Cloud Condor Cloud aims at setting up a bunch of Fedora nodes to act as a mini-cloud managed via the Condor grid infrastructure. So the natural question was... Why not use oVirt Node for this rather than a full Fedora OS install? Pulling condor-cloud and condor RPMs into the core
2008 Apr 26
1
Xen and Torque
Dear Xen users. Has anyone tried to integrate Xen with the Torque resource management system? Could you please help me with advice for a system I'm developing that relies on Torque. Let me describe the system first. The part of the system that talks with Torque should request a certain number of nodes of a cluster and launch a virtual machine instance there (one VM instance per host).
2011 Apr 25
1
return code 10 in the R documentation
Hi Everyone, I have a group of R jobs that should be submitted to Condor. When I submit the jobs, they don't run, and when I checked the Sched Log files the jobs are exiting with status code 10. Previously, the jobs ran well on Condor but now when I submit them they aren't running. Can anyone explain the meaning of this? Here is my submit file: # Submit file
2009 Jan 16
1
postfix relay and mail host for HPC cluster
I've so far been unable to make the following work: I have a small cluster with a master node (called bayes.bc.edu on the public network, and called master.cl.bc.edu on the internal 10.0.0.0 network) and a number of nodes which are purely on the private network. I want the master to receive mail and deliver locally (or use .forward and alias rules) to messages sent from the nodes
2009 Sep 30
1
Managing random number generating, while using Condor parallel computing
Hello all, Recently I started playing with running R scripts on the Condor system in my institute. (For more on this, have a look at: Running Long R Jobs with Condor DAG by Xianhong Xie link: http://cran.r-project.org/doc/Rnews/Rnews_2005-2.pdf ) Might someone advise me about the following question: How should I handle the RNG (random number generation) in the running of parallel instances of R
2008 Nov 16
1
help.start() displays index.html in emacs (PR#13293)
Full_Name: Juergen Rose Version: 2.8.0 (2008-10-20) OS: Linux 2.6.27.4 x86_64 Intel Submission from: (NULL) (87.185.220.122) If I start R as the ordinary user rose and run help.start(), the help is displayed in emacs. If I do the same as the user root, file:///tmp/Rtmpyzlc7Y/.R/doc/html/index.html is shown as expected in a Firefox window. So it seems to be connected with my private configuration. But I cannot
2015 Feb 19
0
Anyone using torque/pbs/munge?
CentOS 6.6 I've got two servers, server1 and hbs (honkin' big server). Both are running munge, and torque... *separately*. My problem is that I've got users who want to be able to submit from server1 to hbs. I see that munged can be pointed to an alternate keyfile... but is there any way to tell qsub what to use? (And yes, I got on the torque users' list, and I'm trying
2012 Mar 18
1
install R package on Unix cluster
Hi R users, Working from a PC, I am trying to install the spatstat package on a Unix cluster. I created the following PBS file to send a job array: #!/bin/bash -ue #PBS -m ae #PBS -M my email #PBS -J 1-45 #PBS -A my username #PBS -N job name #PBS -l resources #PBS -l walltime cd $PBS_O_WORKDIR module load R/2.14.1 R CMD INSTALL -l /path/to/library spatstat R CMD BATCH
2009 Jul 27
2
Simple resource manager?
I need to serialize computing job requests for two different multicore machines, and in some near future, for a cluster. I have worked with SGE but it requires NFS and other administrative steps, plus it seems a bit overkill for my needs. I guess some simpler queue managing engine may have been developed, possibly over SSH. Any pointers? TIA. -- Eduardo Grosclaude Universidad Nacional del
2012 Dec 04
2
SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)
In the 'parallel' package there is detectCores(), which tries its best to infer the number of cores on the current machine. This is useful if you wish to utilize the *maximum* number of cores on the machine. Several users rely on this to set the number of cores when parallelizing, sometimes also hardcoded within 3rd-party scripts/package code, but there are several settings where you wish to
2015 May 27
1
serious problem with torque
On Wed, May 27, 2015 10:55 am, Zachary Giles wrote: > Mark, You might really want to compile torque from source (into an RPM > if you'd like) and redistribute that. Every version is a little wonky > and those of us that use(d) it often will poke around until we find a > version / patch-set that makes us happy and stick with that for a bit. > It's not an exact science and
2012 Jun 30
1
SSL_connect?? Because of master is not running?
My master is running 12.04 Version: 2.7.11-1ubuntu2 Depends: ruby1.8, puppetmaster-common (= 2.7.11-1ubuntu2) My client is 10.04 Version: 2.6.3-0ubuntu1~lucid1 Depends: puppet-common (= 2.6.3-0ubuntu1~lucid1), ruby1.8 I followed this tutorial to install Puppet on the client: http://shapeshed.com/setting-up-puppet-on-ubuntu-10-04/ (I didn't need that tar ball because the "best
2010 Apr 07
6
Consecutive Jobs
Anyone know how to submit jobs to at or anything else that allows jobs submitted to a queue to be executed consecutively? I have a series of servers that submits a job via an ssh background job but I can only have one execute at any given time. Possibly some clever bash work? Thanks! jlc
2004 Feb 04
2
SAMBA and LDAP
Hello.. When I use the command smbpasswd -a carguillo, it adds this entry in LDAP. # carguello,Personas,NOVA dn: uid=carguello,ou=Personas,o=NOVA uid: carguello sambaSID: S-1-5-21-2532083711-3846753250-2864659012-2328 sambaPrimaryGroupSID: S-1-5-21-2532083711-3846753250-2864659012-513 sambaPwdCanChange: 1075936258 sambaPwdMustChange: 2147483647 sambaLMPassword: