Displaying 20 results from an estimated 10000 matches similar to: "Condor cluster setup advice (pointers) needed"
2020 Apr 17
4
HPC question: torques replacement
Dear Experts,
I know there are many HPC (high performance computing) experts on this
list. I'd like to ask your advice.
Almost two decades ago I chose to go with OpenPBS (turned down condor
and other alternatives for whatever reason) for clusters and number
crunchers I support for the Department at the university. It turned out
to be a decent, long-lived choice. At some point I smoothly
2020 Apr 17
0
HPC question: torques replacement
Hey Valeri -
IIRC, midway (and maybe midway2?) use slurm for job scheduling. I don't know how many of your faculty use both your nodes and midway, but maybe consolidating on to a single scheduler would be easier for them?
(also, it's been a while ... hi!)
Richard
-----Original Message-----
From: CentOS <centos-bounces at centos.org> On Behalf Of Valeri Galtsev
Sent: Friday,
2009 Aug 30
1
Combining: R + Condor in 2009 ? (+foreach maybe?)
Hello dear R-help group (and David Smith from REvolution),
I would like to perform parallel computing using R with Condor (hopefully
using foreach or other recommended solutions, if available) for some
"Embarrassingly parallel" problem.
I will start by listing what I found so far, and then go on asking for help.
So far I found a manual by Xianhong Xie from Rnews_2005-2 (see page
2008 Apr 24
1
R and condor
Hello,
I would be extremely grateful if anyone is able to provide any (rather obscure) advice on using R with Condor. I think I'm following Xianhong Xie's instructions (R News 5(2) 13-15) correctly, but my job just stays held in the queue (for days / months). I've checked condor_status to make sure there are plenty of machines available, but can't see any way to attack the
2006 Jan 24
1
Condor and R
Hi,
I was wondering if anyone has successfully linked R against the
Condor libraries so that R can be run as a Condor job in the
"standard" (not "vanilla") universe. The advantage of this would be
that due to checkpointing, jobs can be suspended and transferred to
another node. There is a good overview by Xianhong Xie here:
2011 Nov 16
3
clustering
Hey folks,
I just went through the archives trying to find some info on this but
did not come up with much other than it seems there are a few experts
here on the list.
I have no experience with clustering and have just taken over a Stem
Cell Research Lab that has a Grid Engine cluster. I have not yet dug
into the details of Grid Engine (only been here a week now) but am
just trying to get up
2011 Aug 01
0
Condor Cloud + oVirt Node
I've discussed this a little with Ian off list, but relating to:
https://fedoraproject.org/wiki/Features/Condor_Cloud
Condor Cloud aims at setting up a bunch of Fedora nodes to act as a
mini-cloud managed via the Condor grid infrastructure. So the natural
question was... Why not use oVirt Node for this rather than a full
Fedora OS install?
Pulling condor-cloud and condor RPMs into the core
2008 Apr 26
1
Xen and Torque
Dear Xen users.
Has anyone tried to integrate Xen with the Torque resource management system?
Could you please give me some advice on a system I'm developing that
relies on torque.
Let me describe the system first.
The part of the system that talks with torque should request a certain
number of nodes of a cluster and launch a virtual machine instance there
(one vm instance per host).
2011 Apr 25
1
return code 10 in the R documentation
Hi Everyone,
I have a group of R jobs that should be submitted to Condor. When I submit
the jobs, they don't run, and when I checked the Sched Log
files, the jobs are exiting with status code 10. Previously, the jobs ran
well on Condor, but now when I submit them they aren't
running. Can anyone explain the meaning of this?
Here is my submit file:
# Submit file
2009 Jan 16
1
postfix relay and mail host for HPC cluster
I've so far been unable to make the following work:
I have a small cluster with a master node (called bayes.bc.edu on
the public network, and called master.cl.bc.edu on the internal
10.0.0.0 network) and a number of nodes which are purely on the
private network.
I want the master to receive mail and deliver locally (or use .forward
and alias rules) to messages sent from the nodes
2009 Sep 30
1
Managing random number generating, while using Condor parallel computing
Hello all,
Recently I started playing with running R scripts on the Condor system in my
institute.
(For more on this, have a look at:
Running Long R Jobs with Condor DAG
by Xianhong Xie
link: http://cran.r-project.org/doc/Rnews/Rnews_2005-2.pdf
)
Might someone advise me about the following question:
How should I handle the RNG (random number generation) in the running of
parallel instances of R
2008 Nov 16
1
help.start() displays index.html in emacs (PR#13293)
Full_Name: Juergen Rose
Version: 2.8.0 (2008-10-20)
OS: Linux 2.6.27.4 x86_64 Intel
Submission from: (NULL) (87.185.220.122)
If I start R as the ordinary user rose and run help.start(), the help is displayed in
emacs. If I do so as the user root, file:///tmp/Rtmpyzlc7Y/.R/doc/html/index.html
is shown as expected in a Firefox window. So it seems to be connected with my
private configuration. But I can not
2015 Feb 19
0
Anyone using torque/pbs/munge?
CentOS 6.6
I've got two servers, server1 and hbs (honkin' big server). Both are
running munge, and torque... *separately*. My problem is that I've got
users who want to be able to submit from server1 to hbs. I see that munged
can be pointed to an alternate keyfile... but is there any way to tell
qsub what to use?
(And yes, I got on the torque users' list, and I'm trying
2012 Mar 18
1
install R package on Unix cluster
Hi R users,
Working from a PC, I am trying to install the spatstat package on a Unix cluster. I created the following PBS file to send a job array:
#!/bin/bash -ue
#PBS -m ae
#PBS -M my email
#PBS -J 1-45
#PBS -A my username
#PBS -N job name
#PBS -l resources
#PBS -l walltime
cd $PBS_O_WORKDIR
module load R/2.14.1
R CMD INSTALL -l /path/to/library spatstat
R CMD BATCH
2009 Jul 27
2
Simple resource manager?
I need to serialize computing job requests for two different multicore
machines, and in some near future, for a cluster. I have worked with
SGE but it requires NFS and other administrative steps, plus it seems
a bit overkill for my needs. I guess some simpler queue managing
engine may have been developed, possibly over SSH. Any pointers? TIA.
--
Eduardo Grosclaude
Universidad Nacional del
2012 Dec 04
2
SUGGESTION: Add get/setCores() to 'parallel' (and command line option --max-cores)
In the 'parallel' package there is detectCores(), which tries its best
to infer the number of cores on the current machine. This is useful
if you wish to utilize the *maximum* number of cores on the machine.
Several are using this to set the number of cores when parallelizing,
sometimes also hardcoded within 3rd-party scripts/package code, but
there are several settings where you wish to
2015 May 27
1
serious problem with torque
On Wed, May 27, 2015 10:55 am, Zachary Giles wrote:
> Mark, You might really want to compile torque from source (into an RPM
> if you'd like) and redistribute that. Every version is a little wonky
> and those of us that use(d) it often will poke around until we find a
> version / patch-set that makes us happy and stick with that for a bit.
> It's not an exact science and
2012 Jun 30
1
SSL_connect?? Because of master is not running?
My master is running 12.04
Version: 2.7.11-1ubuntu2
Depends: ruby1.8, puppetmaster-common (= 2.7.11-1ubuntu2)
My client is 10.04
Version: 2.6.3-0ubuntu1~lucid1
Depends: puppet-common (= 2.6.3-0ubuntu1~lucid1), ruby1.8
I followed this tutorial to install Puppet on the client: http://shapeshed.com/setting-up-puppet-on-ubuntu-10-04/
(I didn't need that tarball because the "best
2010 Apr 07
6
Consecutive Jobs
Anyone know how to submit jobs to at or anything else that allows jobs
submitted to a queue to be executed consecutively?
I have a series of servers that each submit a job via an ssh background
job but I can only have one execute at any given time.
Possibly some clever bash work?
Thanks!
jlc
2004 Feb 04
2
SAMBA and LDAP
Hello..
When I use the command
smbpasswd -a carguillo
it adds this entry in LDAP.
# carguello,Personas,NOVA
dn: uid=carguello,ou=Personas,o=NOVA
uid: carguello
sambaSID: S-1-5-21-2532083711-3846753250-2864659012-2328
sambaPrimaryGroupSID: S-1-5-21-2532083711-3846753250-2864659012-513
sambaPwdCanChange: 1075936258
sambaPwdMustChange: 2147483647
sambaLMPassword: