similar to: Large Corosync/Pacemaker clusters

Displaying 20 results from an estimated 6000 matches similar to: "Large Corosync/Pacemaker clusters"

2011 Nov 23
1
Corosync init-script broken on CentOS6
Hello all, I am trying to create a corosync/pacemaker cluster using CentOS 6.0. However, I'm having a great deal of difficulty doing so. Corosync has a valid configuration file and an authkey has been generated. When I run /etc/init.d/corosync I see that only corosync is started. From experience working with corosync/pacemaker before, I know that this is not enough to have a functioning
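A hedged sketch of the usual fix at the time, assuming corosync 1.x with the pacemaker plugin: "ver: 1" tells corosync not to spawn pacemaker itself, so both init scripts have to be started (the drop-in filename is conventional, not mandated).
[code]
# CentOS 6 / corosync 1.x layout (assumed):
cat > /etc/corosync/service.d/pcmk <<'EOF'
service {
    name: pacemaker
    ver: 1
}
EOF
/etc/init.d/corosync start
/etc/init.d/pacemaker start
[/code]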
2012 Nov 02
3
lctl ping of Pacemaker IP
Greetings! I am working with Lustre-2.1.2 on RHEL 6.2. First I configured it using the standard defaults over TCP/IP. Everything worked very nicely using a real, static --mgsnode=a.b.c.x value which was the actual IP of the MGS/MDS system1 node. I am now trying to integrate it with Pacemaker-1.1.7. I believe I have most of the set-up completed with a particular exception. The "lctl
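For illustration, keeping the post's a.b.c.x naming convention (the addresses are placeholders), pinging the LNET NID behind the Pacemaker-managed floating IP would look like:
[code]
# From a client: ping the NID at the virtual IP rather than the
# physical node address; a.b.c.y stands for the floating address.
lctl ping a.b.c.y@tcp
[/code]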
2011 May 10
3
DRBD, Xen, HVM and live migration
Hi, I want to combine all the above mentioned technologies. The Linbit pages warn not to use the drbd: VBD with HVM DomUs. This page however: http://publications.jbfavre.org/virtualisation/cluster-xen-corosync-pacemaker-drbd-ocfs2.en (thank you Jean), simply puts two DRBD devices in dual primary mode and starts Xen DomUs while pointing to the DRBD devices with phy: in the DomU config files.
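A minimal sketch of what that guide's approach implies, assuming DRBD 8.3 syntax and an invented resource name r0; dual primary is what lets both nodes hold the device open during live migration:
[code]
# /etc/drbd.d/r0.res fragment (illustrative):
resource r0 {
    net {
        allow-two-primaries;
    }
    startup {
        become-primary-on both;
    }
}
# The DomU then points at the raw DRBD device with phy:, e.g.
#   disk = [ 'phy:/dev/drbd0,xvda,w' ]
[/code]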
2011 Jan 19
8
Xen on two node DRBD cluster with Pacemaker
Hi all, could somebody point me to what is considered a sound way to offer Xen guests on a two node DRBD cluster in combination with Pacemaker? I prefer block devices over images for the DomUs. I understand that for live migration DRBD 8.3 is needed, but I'm not sure as to what kind of resource agents/technologies are advised (LVM, cLVM, ...) and what kind of DRBD config
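As a hedged sketch (resource and file names are invented), the commonly advised shape is a Xen primitive that Pacemaker is allowed to live-migrate, layered on top of a dual-primary DRBD device:
[code]
# crm shell sketch; xmfile path and names are assumptions.
crm configure primitive vm1 ocf:heartbeat:Xen \
    params xmfile="/etc/xen/vm1.cfg" \
    meta allow-migrate="true" \
    op monitor interval="30s"
[/code]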
2007 Nov 16
5
Lustre Debug level
Hi, the Lustre manual 1.6 v18 says that in production the Lustre debug level should be set fairly low. The manual also says that I can verify the level by running the following commands: # sysctl portals.debug This gives the following error: error: 'portals.debug' is an unknown key cat /proc/sys/lnet/debug gives output: ioctl neterror warning error emerg ha config console cat
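In 1.6 the tunable moved from the portals namespace to lnet, which would explain the unknown key; a hedged example of setting a low production mask (the mask shown mirrors the output quoted above):
[code]
sysctl -w lnet.debug="ioctl neterror warning error emerg ha config console"
# or equivalently:
echo "ioctl neterror warning error emerg ha config console" > /proc/sys/lnet/debug
[/code]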
2012 Mar 05
12
Cluster xen
Hello, I would like to set up a cluster under Xen or XenServer with two Dell R710 servers. I would like to build the cluster using the combined disk space of the two servers as well as the memory. What is your feedback from experience, and what configurations do you use? Thanks in advance. Regards, Mat
2007 Mar 20
15
How to bypass failed OST without blocking?
Hi, I want my Lustre to do the following when an OST fails: if a file has stripe data on the failed OST, any operation on that file should return an IO error without blocking; at the same time I should still be able to create and read/write new files, or read/write files that have no stripe data on the failed OST, without blocking. What should I do? How do I configure this? Thanks! swin
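One way to get non-blocking behaviour, sketched with an assumed device number, is to deactivate the failed OST's OSC on the client so new I/O is no longer directed to it:
[code]
# List devices, then deactivate the OSC that talks to the failed OST;
# <devno> is whatever 'lctl dl' reports for it.
lctl dl | grep osc
lctl --device <devno> deactivate
[/code]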
2011 Sep 29
1
CentOS 6: corosync and pacemaker won't stop (patch)
Hi, I cannot 'halt' my CentOS 6 servers while running corosync+pacemaker. I believe corosync and pacemaker are stopped in the wrong order in the runlevels, which creates the infinite "Waiting for corosync services to unload..." loop. This is my first time with this cluster technology, but apparently pacemaker has to be stopped /before/ corosync. Applying the following
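The shape of the fix, with illustrative priorities rather than the patch's exact values: on halt/reboot the K-scripts run in numeric order, so pacemaker's kill symlink has to sort before corosync's.
[code]
ls /etc/rc6.d/ | grep -E 'corosync|pacemaker'
# desired ordering (example numbers only):
#   K05pacemaker
#   K10corosync
[/code]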
2007 Oct 15
3
iptables rules for lustre 1.6.x and MGS recovery procedures
Hi, I would like to know what TCP/UDP ports I should keep open in my firewall policies on my MGS server so that the MGS server can be fire-walled. Also, in the event of loss of the MGT, would it be possible to recreate the MGT without losing data or bringing the filesystem down (i.e. by using cached information from the MDTs and OSTs)? Thanks Anand
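LNET's socket LND listens on TCP port 988 by default, so a minimal opening would look like this (the source subnet is a placeholder):
[code]
# Allow Lustre/LNET traffic to the MGS from the cluster subnet only.
iptables -A INPUT -p tcp --dport 988 -s 10.0.0.0/24 -j ACCEPT
[/code]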
2010 Aug 17
18
write RPC & congestion
Hi, thanks for the previous help. I have some questions about Lustre RPC and the sequence of events that occur during large concurrent write()s involving many processes and a large data size per process. I understand there is a mechanism of flow control by credits, but I'm a little unclear on how it works in general after reading the "networking & io protocol" white paper. Is
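For orientation, the credit mechanism the white paper describes surfaces as LND module parameters; a hedged example for the socket LND (values are illustrative, defaults vary by version):
[code]
# credits caps concurrent sends per interface, peer_credits per peer.
options ksocklnd credits=256 peer_credits=8
[/code]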
2008 Mar 07
2
Multihomed question: want Lustre over IB andEthernet
Chris, Perhaps you need to perform some write_conf like command. I'm not sure if this is needed in 1.6 or not. Shane ----- Original Message ----- From: lustre-discuss-bounces@lists.lustre.org <lustre-discuss-bounces@lists.lustre.org> To: lustre-discuss <lustre-discuss@lists.lustre.org> Sent: Fri Mar 07 12:03:17 2008 Subject: Re: [Lustre-discuss] Multihomed
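The write_conf step being alluded to would look roughly like this (device paths are placeholders, and the filesystem must be stopped first):
[code]
tunefs.lustre --writeconf /dev/mgs_mdt_dev   # MGS/MDT first
tunefs.lustre --writeconf /dev/ost_dev       # then every OST
[/code]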
2012 Dec 11
4
Configuring Xen + DRBD + Corosync + Pacemaker
Hi everyone, I need some help setting up my failover configuration. My goal is to have a redundant system using Xen + DRBD + Corosync + Pacemaker. On Xen I will have one virtual machine. When this machine's network goes down, I will do a live migration to the second machine. The first thing I will need is a crossover cable, won't I? Is it really necessary? Ok, I did it. eth0
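A hedged sketch of binding the cluster ring to the crossover link; the 192.168.100.0/24 addressing is an assumption for the direct cable:
[code]
# /etc/corosync/corosync.conf fragment (illustrative):
totem {
    version: 2
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.100.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}
[/code]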
2017 Feb 10
2
NUT configuration complicated by Stonith/Fencing cabling
Roger, Thanks for your reply. As I understand it, for reliable fencing a node cannot be responsible for fencing itself, as it may not be functioning properly. Hence my "cross over" setup. The direct USB connection from Webserver1 to UPS-Webserver2 means that Webserver1 can fence (cut the power to) Webserver2 if the cluster software decides that it is necessary. If my UPSes were able to
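In upsmon.conf terms the cross-over looks roughly like this (UPS name and credentials are invented): on Webserver1 the UPS feeding Webserver2 is local over the direct USB link, so Webserver1 monitors it as master.
[code]
# /etc/nut/upsmon.conf on Webserver1:
MONITOR ups-webserver2@localhost 1 monuser secret master
[/code]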
2008 Feb 14
9
how do you mount mountconf (i.e. 1.6) lustre on your servers?
As any of you using version 1.6 of Lustre knows, Lustre servers can now be started simply by mounting the devices they are using. Even an /etc/fstab entry can be used if you can have the mount delayed until the network is started. Given this change, you may also have noticed that we have eliminated the initscript for Lustre that used to exist for releases prior to 1.6. I'd like to take a
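The fstab variant mentioned, with an assumed device and mount point; _netdev is the usual way to delay the mount until networking is up:
[code]
# /etc/fstab entry for a Lustre server target:
/dev/sdb   /mnt/mdt   lustre   _netdev   0 0
[/code]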
2018 Jul 05
5
two 2-node clusters or one 4-node cluster?
Hello, I'm planning the migration of two current clusters based on CentOS 6.x with Cman/Rgmanager to CentOS 7.x and Corosync/Pacemaker. As the clusters and their services are on the same subnet, and there are no particular security concerns differentiating them, I'm also evaluating the option of transforming the two clusters into a single 4-node one during the upgrade. Currently I'm
2008 Feb 04
32
Lustre clients getting evicted
on our cluster that has been running Lustre for about 1 month. I have 1 MDT/MGS and 1 OSS with 2 OSTs. Our cluster uses all GigE and has about 608 nodes / 1854 cores. We have a lot of jobs that die and/or go into high IO wait; strace shows processes stuck in fstat(). The big problem (I think; I would like some feedback on this) is that of these 608 nodes, 209 of them have in dmesg
2012 Nov 26
2
Status of STONITH support in the puppetlabs corosync module?
Greetings - Hoping to hear from hunner or one of the other maintainers of the puppetlabs corosync module - there is a note on the git project page that there is currently no way to configure STONITH. Is this information current? If so, has anybody come up with a simple method of managing STONITH with corosync via puppet?
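Absent module support, a common workaround was to drive crm directly (for instance from a puppet exec resource); the agent and parameters below are purely illustrative, not the module's API:
[code]
crm configure primitive st-node1 stonith:external/ipmi \
    params hostname="node1" ipaddr="10.0.0.101" userid="admin" passwd="secret"
[/code]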
2013 Mar 18
1
lustre showing inactive devices
I installed 1 MDS, 2 OSS/OSTs and 2 Lustre clients. My MDS shows:
[code]
[root@MDS ~]# lctl list_nids
10.94.214.185@tcp
[root@MDS ~]#
[/code]
On Lustre Client1:
[code]
[root@lustreclient1 lustre]# lfs df -h
UUID                 bytes   Used     Available  Use%  Mounted on
lustre-MDT0000_UUID  4.5G    274.3M   3.9G       6%    /mnt/lustre[MDT:0]
[/code]
2013 Mar 18
1
OST0006 : inactive device
I installed 1 MDS, 2 OSS/OSTs and 2 Lustre clients. My MDS shows:
[code]
[root@MDS ~]# lctl list_nids
10.94.214.185@tcp
[root@MDS ~]#
[/code]
On Lustre Client1:
[code]
[root@lustreclient1 lustre]# lfs df -h
UUID                 bytes   Used     Available  Use%  Mounted on
lustre-MDT0000_UUID  4.5G    274.3M   3.9G       6%    /mnt/lustre[MDT:0]
lustre-OST0000_UUID
[/code]
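If the OST is actually healthy, re-activating its OSC on the client is the usual first step; the device number comes from the 'lctl dl' listing:
[code]
lctl dl | grep OST0006
lctl --device <devno> activate
[/code]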
2010 Jul 08
5
No space left on device on not full filesystem
Hello, We are running Lustre 1.8.1 and have hit a "No space left on device" error when uploading 500 GB of small files (less than 100 KB each). The problem seems to depend on the number of files. If we remove one file, we can create one new file, even a GB-sized one; but if we haven't removed something we can't create even a very small file, for example using touch
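With that many small files the usual suspect is inode exhaustion on the MDT rather than full OSTs; a quick check (the mount point is an assumption):
[code]
lfs df -i /mnt/lustre   # look at IFree on the MDT line
[/code]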