Displaying 20 results from an estimated 6000 matches similar to: "Large Corosync/Pacemaker clusters"
2011 Nov 23
1
Corosync init-script broken on CentOS6
Hello all,
I am trying to create a corosync/pacemaker cluster using CentOS 6.0.
However, I'm having a great deal of difficulty doing so.
Corosync has a valid configuration file and an authkey has been generated.
When I run /etc/init.d/corosync I see that only corosync is started.
From experience working with corosync/pacemaker before, I know that
this is not enough to have a functioning
2012 Nov 02
3
lctl ping of Pacemaker IP
Greetings!
I am working with Lustre-2.1.2 on RHEL 6.2. First I configured it
using the standard defaults over TCP/IP. Everything worked very
nicely using a real, static --mgsnode=a.b.c.x value which was the
actual IP of the MGS/MDS system1 node.
I am now trying to integrate it with Pacemaker-1.1.7. I believe I
have most of the set-up completed with a particular exception. The
"lctl
2011 May 10
3
DRBD, Xen, HVM and live migration
Hi,
I want to combine all the above mentioned technologies.
The Linbit pages warn not to use the drbd: VBD with HVM DomUs.
This page however:
http://publications.jbfavre.org/virtualisation/cluster-xen-corosync-pacemaker-drbd-ocfs2.en
(thank you Jean), simply puts two DRBD devices in dual primary mode and
starts Xen DomUs while pointing to the DRBD devices with phy: in the
DomU config files.
2011 Jan 19
8
Xen on two node DRBD cluster with Pacemaker
Hi all,
could somebody point me to what is considered a sound way to offer Xen guests
on a two node DRBD cluster in combination with Pacemaker? I prefer block
devices over images for the DomUs. I understand that for live migration DRBD
8.3 is needed, but I'm not sure as to what kind of resource
agents/technologies are advised (LVM, cLVM, ...) and what kind of DRBD config
2007 Nov 16
5
Lustre Debug level
Hi,
Lustre manual 1.6 v18 says that in production the Lustre debug level
should be set fairly low. The manual also says that I can verify the
level by running the following commands:
# sysctl portals.debug
This gives me the following error:
error: 'portals.debug' is an unknown key
cat /proc/sys/lnet/debug
gives output:
ioctl neterror warning error emerg ha config console
cat
2012 Mar 05
12
Cluster xen
Hello,
I would like to set up a cluster under Xen or XenServer with 2
Dell R710 servers.
I would like to build a cluster that uses the combined disk space
of both servers, as well as their memory.
What are your experiences and configurations?
Thanks in advance.
Regards,
Mat
2007 Mar 20
15
How to bypass failed OST without blocking?
Hi
I want Lustre to behave as follows when an OST fails: if a file
has stripe data on the failed OST, any operation on that file should
return an I/O error without blocking; at the same time, I should still
be able to create and read/write new files, or read/write files that
have no stripe data on the failed OST, without blocking.
What should I do? How do I configure this?
thanks!
swin
2011 Sep 29
1
CentOS 6: corosync and pacemaker won't stop (patch)
Hi,
I cannot 'halt' my CentOS 6 servers while running corosync+pacemaker.
I believe the runlevels used to stop corosync and pacemaker are not in the
correct order and create the infinite "Waiting for corosync services to
unload..." loop thing.
This is my first time with this cluster technology but apparently pacemaker
has to be stopped /before/ corosync.
Applying the following
2007 Oct 15
3
iptables rules for lustre 1.6.x and MGS recovery procedures
Hi,
I would like to know which TCP/UDP ports I should keep open in the
firewall policies on my MGS server so that I can have the MGS server
firewalled. Also, in the event of losing the MGT, would it be possible
to recreate the MGT without losing data or bringing the filesystem
down (i.e. by using cached information from the MDTs and OSTs)?
Thanks
Anand
2010 Aug 17
18
write RPC & congestion
Hi, thanks for previous help.
I have some questions about Lustre RPC and the sequence of events that
occur during large concurrent write() involving many processes and large
data size per process. I understand there is a mechanism of flow
control by credits, but I'm a little unclear on how it works in general
after reading the "networking & io protocol" white paper.
Is
2008 Mar 07
2
Multihomed question: want Lustre over IB and Ethernet
Chris,
Perhaps you need to run some write_conf-like command. I'm not sure whether this is needed in 1.6.
Shane
----- Original Message -----
From: lustre-discuss-bounces at lists.lustre.org <lustre-discuss-bounces at lists.lustre.org>
To: lustre-discuss <lustre-discuss at lists.lustre.org>
Sent: Fri Mar 07 12:03:17 2008
Subject: Re: [Lustre-discuss] Multihomed
2012 Dec 11
4
Configuring Xen + DRBD + Corosync + Pacemaker
Hi everyone,
I need some help to set up my failover configuration.
My goal is to have a redundant system using Xen + DRBD + Corosync +
Pacemaker.
On Xen I will have one virtual machine. When this computer loses its
network connection, I will do a live migration to the second computer.
The first thing I will need is a crossover cable, won't I? Is it
really necessary? OK, I did it. eth0
2017 Feb 10
2
NUT configuration complicated by Stonith/Fencing cabling
Roger,
Thanks for your reply.
As I understand it, for reliable fencing a node cannot be responsible for fencing itself, as it may not be functioning properly. Hence my "cross over" setup. The direct USB connection from Webserver1 to UPS-Webserver2 means that Webserver1 can fence (cut the power to) Webserver2 if the cluster software decides that it is necessary. If my UPSes were able to
2008 Feb 14
9
how do you mount mountconf (i.e. 1.6) lustre on your servers?
As any of you using version 1.6 of Lustre know, Lustre servers can now
be started simply by mounting the devices they are using. Even
an /etc/fstab entry can be used if you can have the mount delayed until
the network is started.
Given this change, you will also have noticed that we have eliminated the
initscript for Lustre that used to exist for releases prior to 1.6.
I'd like to take a
2018 Jul 05
5
two 2-node clusters or one 4-node cluster?
Hello,
I'm planning migration of current two clusters based on CentOS 6.x with
Cman/Rgmanager going to CentOS 7.x and Corosync/Pacemaker.
As the clusters and their services are on the same subnet, and there are no
particular security concerns differentiating them, I'm also evaluating the
option to merge the two clusters into a single 4-node one during the
upgrade.
Currently I'm
2008 Feb 04
32
Luster clients getting evicted
on our cluster that has been running lustre for about 1 month. I have
1 MDT/MGS and 1 OSS with 2 OSTs.
Our cluster uses all GigE and has about 608 nodes and 1854 cores.
We have a lot of jobs that die and/or go into high IO wait; strace
shows processes stuck in fstat().
The big problem (I think, and I would like some feedback on it) is that,
of these 608 nodes, 209 of them have in dmesg
2012 Nov 26
2
Status of STONITH support in the puppetlabs corosync module?
Greetings -
Hoping to hear from hunner or one of the other maintainers of the
puppetlabs corosync module - there is a note on the git project page that
there is currently no way to configure STONITH. Is this information
current?
If so, has anybody come up with a simple method of managing STONITH with
corosync via puppet?
2013 Mar 18
1
lustre showing inactive devices
I installed 1 MDS, 2 OSS/OSTs and 2 Lustre clients. My MDS shows:
[code]
[root@MDS ~]# lctl list_nids
10.94.214.185@tcp
[root@MDS ~]#
[/code]
On Lustre Client1:
[code]
[root@lustreclient1 lustre]# lfs df -h
UUID bytes Used Available Use% Mounted on
lustre-MDT0000_UUID 4.5G 274.3M 3.9G 6% /mnt/lustre[MDT:0]
2013 Mar 18
1
OST0006 : inactive device
I installed 1 MDS, 2 OSS/OSTs and 2 Lustre clients. My MDS shows:
[code]
[root@MDS ~]# lctl list_nids
10.94.214.185@tcp
[root@MDS ~]#
[/code]
On Lustre Client1:
[code]
[root@lustreclient1 lustre]# lfs df -h
UUID bytes Used Available Use% Mounted on
lustre-MDT0000_UUID 4.5G 274.3M 3.9G 6% /mnt/lustre[MDT:0]
lustre-OST0000_UUID
2010 Jul 08
5
No space left on device on not full filesystem
Hello,
We are running Lustre 1.8.1 and have hit a "No space left on device"
error when uploading 500 GB of small files (less than 100 KB each).
The problem seems to depend on the number of files. If we remove one
file, we can create one new file, even one of GB size; but if we haven't
removed anything, we can't create even a very small file, for example
using touch