Displaying 20 results from an estimated 2000 matches similar to: "OpenMPI not compiled with Torque support"
2020 Apr 17
4
HPC question: torques replacement
Dear Experts,
I know there are many HPC (high performance computing) experts on this
list. I'd like to ask your advice.
Almost two decades ago I chose to go with OpenPBS (turned down condor
and other alternatives for whatever reason) for clusters and number
crunchers I support for the Department at the university. It turned out
to be not a bad, long-lived choice. At some point I smoothly
2007 Oct 04
1
Rmpi_0.5-4 and OpenMPI questions
Many thanks to Dr Yu for updating Rmpi for R 2.6.0, and for starting to make
the changes to support Open MPI.
I have just built the updated Debian package of Rmpi (i.e. r-cran-rmpi) under
R 2.6.0 but I cannot convince myself yet whether it works or not. Simple
tests work. E.g. on my Debian testing box, with Rmpi installed directly
using Open Mpi 1.2.3-2 (from Debian) and using 'r' from
2008 Apr 26
1
Xen and Torque
Dear Xen users.
Has anyone tried to integrate Xen with the Torque resource management system?
Could you please help me with advice on a system I'm developing that
relies on torque.
Let me describe the system first.
The part of the system that talks with torque should request a certain
number of nodes of a cluster and launch a virtual machine instance there
(one vm instance per host).
2015 May 27
1
serious problem with torque
On Wed, May 27, 2015 10:55 am, Zachary Giles wrote:
> Mark, You might really want to compile torque from source (into an RPM
> if you'd like) and redistribute that. Every version is a little wonky
> and those of us that use(d) it often will poke around until we find a
> version / patch-set that makes us happy and stick with that for a bit.
> It's not an exact science and
2020 Apr 17
0
HPC question: torques replacement
Hey Valeri -
IIRC, midway (and maybe midway2?) use slurm for job scheduling. I don't know how many of your faculty use both your nodes and midway, but maybe consolidating on to a single scheduler would be easier for them?
(also, it's been a while ... hi!)
Richard
-----Original Message-----
From: CentOS <centos-bounces at centos.org> On Behalf Of Valeri Galtsev
Sent: Friday,
2015 May 27
5
serious problem with torque
Hi, folks,
The other admin updated torque without testing it on one machine, and
we had Issues. The first I knew was when a user reported qstat
returning
socket_connect_unix failed: 15137
socket_connect_unix failed: 15137
socket_connect_unix failed: 15137
qstat: cannot connect to server (null) (errno=15137) could not connect to
trqauthd
Attempting to restart the pbs_server did the same.
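
The error above names trqauthd, so one first check is whether that daemon is running at all. The following is a minimal sketch, assuming a Linux host where `ps` is available; the helper name `daemon_status` is hypothetical and is not part of Torque.

```shell
#!/bin/sh
# daemon_status NAME: print "NAME: running" if a process with that exact
# command name exists, otherwise "NAME: not running".
# Hypothetical helper for illustration only; it is not a Torque tool.
daemon_status() {
    if ps -e -o comm= | grep -qx -- "$1"; then
        echo "$1: running"
    else
        echo "$1: not running"
    fi
}

# Usage: check the authorization daemon named in the qstat error.
daemon_status trqauthd
```

If the daemon is reported as not running, that would be consistent with the "could not connect to trqauthd" message, though it does not by itself explain why the daemon stopped.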
2015 May 27
2
serious problem with torque
Johnny Hughes wrote:
> On 05/27/2015 09:07 AM, m.roth at 5-cent.us wrote:
>> Hi, folks,
>>
>> The other admin updated torque without testing it on one machine, and
>> we had Issues. The first I knew was when a user reported qstat
>> returning
>> socket_connect_unix failed: 15137
>> socket_connect_unix failed: 15137
>> socket_connect_unix
2008 Oct 22
1
torque/psb & snow library
Hello all;
I'm trying to execute parallel jobs through library snow on a cluster built
through torque/PBS. I'm successfully obtaining the cluster with:
>system("cat $PBS_NODEFILE > cluster.txt")
>mycluster <- scan(file="cluster.txt",what="character")
>cl <- makeSOCKcluster(mycluster)
The only problem, at the moment, is that if I use
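
A note on the node file read in the snippet above: Torque writes one line per allocated processor slot to `$PBS_NODEFILE`, so a hostname can appear several times. The sketch below, assuming a POSIX shell with `sort`, `uniq` and `awk`, shows one way to list each host once and to count slots per host. The function names `unique_hosts` and `slots_per_host` are hypothetical.

```shell
#!/bin/sh
# unique_hosts FILE: print each distinct hostname in FILE once, sorted.
unique_hosts() {
    sort -u -- "$1"
}

# slots_per_host FILE: print "<host> <count>" for each hostname in FILE,
# where <count> is the number of lines (slots) listed for that host.
slots_per_host() {
    sort -- "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Usage inside a job, where Torque sets PBS_NODEFILE.
# Outside a job the variable is unset and nothing is done.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    unique_hosts "$PBS_NODEFILE" > cluster.txt
    slots_per_host "$PBS_NODEFILE"
fi
```

Whether the deduplicated list or the full per-slot list is the right input depends on how many workers should start on each node.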
2004 Feb 20
4
GridEngine-OpenSSH integration
Hi,
GridEngine (http://gridengine.sunsource.net, aka. SGE) is an opensource
batch system for clusters. They have an integration with SSH:
http://gridengine.sunsource.net/project/gridengine/howto/qrsh_ssh.html
The idea is that instead of using a modified rsh/rshd, they wanted to use
OpenSSH. However, in order to provide full job control, they need to add a
few hooks in OpenSSH. Question:
- Is it OK
2015 May 27
0
serious problem with torque
Mark, You might really want to compile torque from source (into an RPM
if you'd like) and redistribute that. Every version is a little wonky
and those of us that use(d) it often will poke around until we find a
version / patch-set that makes us happy and stick with that for a bit.
It's not an exact science and newer / higher versions are not always better.
As for the downgrade comment:
2013 Dec 09
3
compat-openmpi issues after upgrade to CentOS 6.5
Just wondering if anyone can shed some light into an issue we are having
with compat-openmpi after upgrading CentOS to version 6.5
Some of our cluster applications are dependent on an older version of
OpenMPI, so we are using compat-openmpi. Up to CentOS 6.4 this was
version 1.4.3:
% /usr/lib64/compat-openmpi/bin/mpirun -V
mpirun (Open MPI) 1.4.3
but after the upgrade to CentOS 6.5 it
2015 Dec 17
2
boost-openmpi problems in 7.2
After the 7.2 upgrade boost-openmpi-1.53.0-25 was installed, along
with openmpi-1.10.0-10. The old openmpi was then replaced with
compat-openmpi16-1.6.4-10. All fine.
Except boost-openmpi has a dependency on the old libmpi.so.1 and the
new openmpi has libmpi.so.12:
# ldd libboost_mpi-mt.so.1.53.0
linux-vdso.so.1 =>  (0x00007ffe8c182000)
libboost_serialization-mt.so.1.53.0 =>
2009 Mar 27
2
Installing openmpi & lam for use with R
I am trying to install the R package "Rmpi" which needs libmpi. I've
installed openmpi and lam in Centos 5.2:
[root at rab45-1 /]# rpm -qv openmpi
openmpi-1.2.5-5.el5
openmpi-1.2.5-5.el5
[root at rab45-1 /]# rpm -qv lam
lam-7.1.2-14.el5
lam-7.1.2-14.el5
But I get the following error message when trying to install Rmpi:
/usr/bin/ld: skipping incompatible /usr/lib/lam/lib/libmpi.so
2015 May 27
0
serious problem with torque
On 05/27/2015 09:07 AM, m.roth at 5-cent.us wrote:
> Hi, folks,
>
> The other admin updated torque without testing it on one machine, and
> we had Issues. The first I knew was when a user reported qstat
> returning
> socket_connect_unix failed: 15137
> socket_connect_unix failed: 15137
> socket_connect_unix failed: 15137
> qstat: cannot connect to server (null)
2015 May 27
0
serious problem with torque
On Wed, May 27, 2015 9:46 am, m.roth at 5-cent.us wrote:
> Johnny Hughes wrote:
>> On 05/27/2015 09:07 AM, m.roth at 5-cent.us wrote:
>>> Hi, folks,
>>>
>>> The other admin updated torque without testing it on one machine,
>>> and
>>> we had Issues. The first I knew was when a user reported qstat
>>> returning
>>>
2012 Aug 31
2
OpenMPI I/O not working
Hi list,
It appears there is a problem with the OpenMPI I/O library on CentOS 6.2
& 6.3 (package openmpi-1.5.4-1.el6.x86_64).
When I compile the attached program it ends up in the error path since
MPI_File_open returns 16. The corresponding (unhelpful) message is:
MPI_ERR_OTHER: known error not in list
I couldn't find any pointers on the net and the same program works with
OpenMPI
2015 Dec 21
1
boost-openmpi problems in 7.2
Sorry to take so long to reply ...
On Thu, 2015-12-17 at 11:53 -0500, Tony Schreiner wrote:
> Did you load the compat-openmpi environment module?
>
> module load mpi/compat-openmpi16-x86_64
Yes, but you can't load both mpi/openmpi-x86_64 and
mpi/compat-openmpi16-x86_64 as they are labelled as conflicting.
As I said, if you load just mpi/compat-openmpi16-x86_64 it can't find
2004 Oct 12
1
Sun Gridengine and Centos
G'day.
I'm having a little trouble getting Gridengine 5.3 running
properly on a cluster using Centos 3.3.
Everything appears okay, but my jobs hang in the queue
with the gridengine complaining that node is overloaded,
but there is nothing (unusual) running on the node and
the load is 0.
Has anyone else run into similar problems?
-geoff
2017 Jun 19
1
Rmpi, openMPI editions.
Greetings.
I see a warning message while compiling OpenMPI and would appreciate
it if you tell me what it means.
This warning happens with any OpenMPI > 1.6.5. Even before starting a
cluster, just "sessionInfo" triggers this warning.
I'm pasting in the message from R-3.3.2 (this is MRO).
Do the R parallel package cluster functions violate the warnings described here?
>
2019 Mar 07
2
Dynamically allow users with OpenSSH?
Peter and Jason, thanks for your replies on this.
I was able to accomplish this with a combination of Peter's solution
and setting "AuthorizedKeysFile none" as suggested in the Stack
Overflow question.
On Wed, Mar 6, 2019 at 2:30 PM Peter Moody <mindrot at hda3.com> wrote:
>
> why aren't the authorized keys/principals commands sufficient?
>
> $ getent group